Deck 3: Measurement Reliability
1
Explain the relationship between true score, observed score, and random error.
2
Compare and contrast the test-retest and parallel forms techniques for assessing reliability.
3
Discuss some of the different ways that systematic vs. nonsystematic error can affect the outcome of an experiment.
4
What are some advantages and unique possibilities of latent factor analysis, compared to more classical test theory approaches?
5
In a study that investigated the relationship between television watching habits and work productivity, researchers asked respondents three items related to their TV watching:
a. Item 1: On an average week, do you watch television?
b. Item 2: How often do you watch television with others (rather than by yourself)?
c. Item 3: How many hours per day do you watch television?
However, when the researchers ran their item response analyses, they realized they never learned how to interpret them - so they called you. Your task is to describe what is happening in the figure below. Specifically: (a) briefly describe what we are looking at - what do "discrimination" and "ability (or theta)" represent in the context of the study? (b) interpret the three curves for someone at theta = -2, and describe to the researchers what that means; (c) describe how we would determine whether a curve reflects "good" discrimination; and (d) based on that information, suggest to the researchers which item(s) they should keep and which item(s) they should remove, and be sure to explain why you are offering those suggestions.
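Study note: the figure for this card is not reproduced here, so the following is only a minimal sketch of how an item characteristic curve can be computed, assuming a three-parameter logistic model (the card does not say which model the researchers used). The function name `icc_3pl` and all parameter values are hypothetical and are not taken from the missing figure.

```python
import math


def icc_3pl(theta, a, b, c=0.0):
    """Probability of endorsing an item at trait level theta.

    a: discrimination (slope of the curve at its steepest point)
    b: location/difficulty (where the curve sits on the theta axis)
    c: guessing parameter (lower asymptote of the curve)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items: same location, different discrimination.
items = {"flat item": (0.2, 0.0, 0.0), "steep item": (2.0, 0.0, 0.0)}

for name, (a, b, c) in items.items():
    # Evaluate each curve at low, middle, and high trait levels.
    probs = [round(icc_3pl(theta, a, b, c), 3) for theta in (-2.0, 0.0, 2.0)]
    print(name, probs)
```

Under these assumptions, a steeper curve separates respondents at low and high theta more sharply than a flat one, which is the property the card asks you to evaluate.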

6
_____(a)_____ is the degree of relationship between an instrument and the construct it is intended to measure. _____(b)_____ is the consistency with which an instrument assesses a given construct.
A) reliability; validity
B) validity; reliability
C) internal validity; external validity
D) external validity; internal validity
7
A person's observed response pattern on any measurement scale is a combination of true score and error. In the context of this equation, "true score" refers to:
A) the score that represents a person's "true" underlying feelings or attitudes if they could somehow be measured in reality, free of error
B) the score that corresponds to the relevant underlying latent factor
C) the score that represents the internal consistency of the concept being measured
D) the score that perfectly represents the "true" underlying construct, were it free from error
8
A person's observed response pattern on any measurement scale is a combination of true score and error. In the context of this equation, "error" refers to:
A) random error
B) systematic error
C) sampling error
D) combination of A and B
E) combination of A, B, and C
F) none of the above
9
In an experiment on teen aggression and violent video games, some teens were driven to the study by their parents, while others drove themselves. It turned out that participants who were driven by their parents were generally angrier when they arrived (and thus exhibited a stronger relationship between aggression and video games) than participants who drove themselves. The issue of transportation to the study represents what potential problem?
A) random error
B) systematic error
C) sampling error
D) selection error
10
Which of the following techniques are specifically designed to test the reliability of a scale or measurement instrument? (circle all that apply)
A) test-retest
B) factor analysis
C) split-half
D) parallel forms
11
The concept of reliability includes the following factors:
A) internal consistency
B) temporal stability
C) generalizability
D) A and B only
E) B and C only
F) all of the above
12
Which of the following are methods of assessing the internal consistency of a scale? (circle all that apply)
A) test-retest
B) split-half
C) Cronbach's alpha
D) parallel forms
13
Edward and Elvira developed a 50-item scale that measured the extent to which people were gullible to practical jokes. After collecting data from 1,000 participants, their scale demonstrated poor internal consistency (Cronbach's alpha = 0.40). What (if any) suggestions would have the potential to dramatically increase the overall reliability of their scale? (circle all that apply)
A) Review the item-total correlations for every item in the scale, and delete those items that are poorly correlated with the majority of other items.
B) Add 10 relatively "good" items to the scale.
C) Recruit another 1,000 participants to reduce the effect that random error has on the inter-item covariations.
D) Run a factor analysis for the total scale, and upon noticing that the scale was unintentionally multidimensional, eliminate all items that do not load on Factor 1.
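Study note: as a reminder of what Cronbach's alpha summarizes, here is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy responses are hypothetical; they are not Edward and Elvira's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: list of respondents, each a list of k item scores (k >= 2).
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Toy data: 5 respondents answering 3 items (made up for illustration).
toy = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2], [3, 3, 3]]
print(round(cronbach_alpha(toy), 3))
```

Because the formula depends on k and on how strongly the items covary, it can be useful to recompute alpha after changing the item set and compare the results.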
14
In confirmatory factor analysis, the measurement error terms represent what type(s) of error excluded from the latent factor? (circle all that apply)
A) random error
B) systematic error
C) sampling error
D) none of the above
15
Item characteristic curves can inform us of important underlying features of the measurement items that we use. One property of item characteristic curves is the guessing parameter. This parameter reflects:
A) the probability that respondents of the highest ability will be able to guess the correct response to an item
B) the proportion of error in respondents' responses to a specific item
C) the probability that respondents of the lowest ability will be able to guess the correct response to an item
D) the percentage of respondents who will guess the correct answer, regardless of ability level
16
Face validity differs from other forms of validity in that:
A) it does not deal with the content of specific scale items
B) it is the most difficult form of validity to achieve
C) it cannot be established in a rigorous, systematic manner
D) items low in face validity can still be of use to the researcher(s)
E) it addresses whether a scale or set of items correlates with another closely related measure
17
If a hypothetical scale meant to assess knowledge of zoology contained questions about animals, plants, and inanimate objects, the scale would be said to have low _______ validity.
A) convergent
B) predictive
C) content
D) concurrent
E) none of the above
18
Generating a large pool of items, consulting a panel of experts, and conducting a review of the literature are all ways to improve:
A) content validity
B) predictive validity
C) convergent validity
D) discriminant validity
E) criterion validity
19
Concurrent and predictive validity:
A) Are synonymous
B) Are two subtypes of criterion validity
C) Are both desirable; however, concurrent validity is deemed the more rigorous of the two
D) Are both desirable; however, predictive validity is deemed the more rigorous of the two
E) Are both necessary to establish discriminant validity
20
How can convergent and discriminant validation techniques be combined to establish construct validity?
(triangulation, different sources of error, MTMMM)
21
Briefly define the following components of Campbell and Fiske's (1959) Multitrait-Multimethod Matrix (MTMMM):
A) Heterotrait-heteromethod values
B) Monotrait-heteromethod values
C) Reliability diagonals
D) Heterotrait-monomethod values
22
In the MTMMM, monotrait-heteromethod triangles:
A) assess convergent validity
B) indicate the reliability of multiple measures
C) represent the relationship between different methods of measurement of the same trait or construct
D) show the correlations between multiple traits measured using the same method
E) show the correlations between multiple traits and multiple methods of measurement
23
Which of the following is NOT a requirement set forth by Campbell and Fiske (1959) for establishing construct validity using the MTMMM?
A) Establishing convergent validity across measurement methods
B) Statistically significant monotrait-heteromethod values
C) Higher values in monotrait-heteromethod triangles than in heterotrait-monomethod triangles
D) Patterns of trait correlations are the same in monomethod and heteromethod blocks
E) Heterotrait-heteromethod values exceed heterotrait-monomethod values
24
Imagine that Susan is a high school senior completing a survey for your research study. Presume that earlier that morning, Susan received a rejection letter from one of her top colleges. In addition, your survey involves the topic of racism in everyday life. Susan is also an international student, and English is not her first language. You look at Susan's completed survey and notice that all of her item responses lie close to the midpoint of your 9-point scale. Given this information, which of the following threats to validity may be at play?
A) acquiescence
B) extreme response sets - (other students may also respond in similar fashion)
C) mood - (individual differences in mood don't lead to systematic bias, not the same as weather example)
D) social desirability - (survey is on a sensitive topic that other respondents may also respond favorably on)
E) language difficulty - (individual language difficulty is different from difficult language used on the survey, not experienced by all respondents)
25
Describe what is meant by the equation: $O = T + \sum E_{r+s}$. In your response, be sure to identify and describe each element of the equation.
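Study note: the following is a minimal simulation sketch of the equation, assuming that O is the observed score, T the true score, and the subscripts r and s denote random and systematic error (an interpretation based on the other cards in this deck, not stated on this card). The function names, the bias of 3 points, and the noise level are all hypothetical.

```python
import random


def observed_score(true_score, random_errors, systematic_errors):
    """O = T + sum of random and systematic error components."""
    return true_score + sum(random_errors) + sum(systematic_errors)


def mean_observed(true_score=50.0, bias=3.0, noise_sd=5.0, n=10_000, seed=0):
    """Average observed score over n simulated measurements.

    Each measurement adds one zero-mean random error (Gaussian noise)
    and one constant systematic error (bias) to the same true score.
    """
    rng = random.Random(seed)
    scores = [
        observed_score(true_score, [rng.gauss(0.0, noise_sd)], [bias])
        for _ in range(n)
    ]
    return sum(scores) / n


print(round(mean_observed(), 2))           # with systematic error
print(round(mean_observed(bias=0.0), 2))   # random error only
```

In this sketch the random component tends to average out over many measurements, while the constant component shifts the average away from the true score.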
26
Jackie and Jillian conducted a study investigating people's preference for sweet or savory foods, depending on the ambient temperature in the room. While they hypothesized that people in a warm room would prefer sweet foods, and people in a cold room would prefer savory foods, the results were not significant. Using the context of this study, (a) generate one example of random error that may have influenced the results; (b) generate one example of systematic error that may have influenced the results; and (c) describe how these examples that you created demonstrate the possible effects of random and systematic error on a study's findings.
27
William Smythe wanted to create a scale that measured the likelihood that people will burst out in laughter in a movie theater. So far, he has four items written. These three items yielded a Cronbach's alpha of 0.50 and the following output:
Given what you know about Cronbach's alpha and improving internal consistency, describe the ways in which Will Smythe could potentially increase the internal consistency of his scale (hint - there may be more than one way)?
28
In terms of reliability, describe how analyses with latent factors are different from classical test theory techniques like Cronbach's alpha or test-retest. Are they better or worse, and why?