Deck 5: Reliability and Validity
1
The reliability of a test refers to the extent that the test is free from
A) random error.
B) measurement bias.
C) systematic error.
D) individual traits.
A
2
Consider the following item that might appear on a Likert scale: "I normally do not get along well with other people." Such an item has high __________ validity, but may not have __________ validity because respondents may not answer it honestly.
A) face; construct
B) construct; face
C) face; criterion
D) construct; criterion
A
3
A researcher is developing a Likert scale to measure the self-esteem of a group of college students. He considers self-esteem to be a state, in the sense that it will change from day to day based on a person's experiences. Which of the following forms of reliability should he use to assess self-esteem?
A) Internal consistency
B) Interrater
C) Test-retest
D) Equivalent forms
A
4
A researcher has developed a new test of "interpersonal skills." It consists of measuring a person's shoe size. The test shows a test-retest reliability of .85. It appears that this test has __________ and ___________.
A) high reliability; high face validity
B) high reliability; low face validity
C) low reliability; high face validity
D) low reliability; low face validity
5
The Graduate Record Exam (GRE) uses which approach to assess reliability?
A) Interrater
B) Equivalent forms
C) Item to total
D) Unexpected item
6
Which of the following two techniques can be used to assess the internal consistency of a scale?
A) Coefficient alpha, split-half reliability
B) Coefficient alpha, test-retest reliability
C) Test-retest reliability, equivalent forms reliability
D) Test-retest reliability, split-half reliability
7
A researcher has found that his new measure designed to assess the conceptual variable of "exoticism" correlates about +.85 with an existing measure designed to assess "unusualness." The researcher needs to worry about the __________ of his measure.
A) random error
B) measurement bias
C) face validity
D) discriminant validity
8
Which of the following statistics would be used to assess the reliability of a set of raters who used a nominal variable to code behavior?
A) Kappa
B) Coefficient alpha
C) Pearson r
D) The reliability statistic
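As a rough illustration of agreement between raters on a nominal coding scheme, the sketch below computes Cohen's kappa for two raters who coded the same events. The function name `cohens_kappa` and the behavior codes are assumptions made for this example; they do not come from the deck or its textbook.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal codes to the same events."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of events")
    n = len(rater_a)
    # Proportion of events on which the two raters gave the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal code frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes from two observers watching the same six events.
rater_a = ["play", "play", "fight", "idle", "play", "fight"]
rater_b = ["play", "fight", "fight", "idle", "play", "fight"]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Kappa corrects the raw agreement rate for the agreement that two raters would reach by chance, which is why it suits nominal codes better than a simple percentage.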
9
A form of reliability that is used to assess the consistency of judgment when more than one person has observed a set of events is known as
A) judgment reliability.
B) interrater reliability.
C) construct reliability.
D) internal reliability.
10
Which of the following equations represents the relationship between true score, actual score, and reliability?
A) Reliability equals true score divided by actual score.
B) Reliability equals true score plus actual score.
C) Reliability equals actual score divided by true score.
D) Reliability equals actual score minus true score.
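In classical test theory the relationship between true scores, actual (observed) scores, and reliability is usually stated in terms of variances: reliability is the share of observed-score variance that comes from true scores. The sketch below simulates that relationship. The function name `simulated_reliability` and all numeric values (true-score mean 50 and SD 10, error SD 5) are assumptions chosen for illustration only.

```python
import random
from statistics import pvariance


def simulated_reliability(n=20000, true_sd=10.0, error_sd=5.0, seed=0):
    """Estimate reliability as var(true scores) / var(observed scores)."""
    rng = random.Random(seed)
    # Each person's true score on the conceptual variable.
    true_scores = [rng.gauss(50.0, true_sd) for _ in range(n)]
    # Observed score = true score + random measurement error.
    observed = [t + rng.gauss(0.0, error_sd) for t in true_scores]
    return pvariance(true_scores) / pvariance(observed)


estimate = simulated_reliability()
# Theoretical value under these assumptions: 100 / (100 + 25) = 0.8
expected = 10.0**2 / (10.0**2 + 5.0**2)
print(round(estimate, 3), expected)
```

Raising `error_sd` in this sketch lowers the estimate, which matches the idea that more random error means a less reliable measure.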
11
Changes in a person's current mood, a misreading or misunderstanding of the question, and measuring individuals on different days or in different places are all likely to contribute to
A) random error.
B) systematic error.
C) reliable error.
D) operational error.
12
A measured variable that contains a large proportion of random error is said to be
A) unreliable.
B) invalid.
C) a state variable.
D) a trait variable.
13
Consider a researcher who developed a test of driving skills that involved measuring the ability to effectively shift gears in a manual transmission car. One might criticize this test for having low
A) content validity.
B) criterion validity.
C) predictive validity.
D) discriminant validity.
14
If it is found that people's self-esteem changes quickly (for instance, from day to day), then self-esteem should be considered to be
A) a trait variable.
B) a state variable.
C) an unreliable variable.
D) an invalid variable.
15
Kappa is used
A) to test the construct validity of a measure.
B) to test the criterion validity of a measure.
C) to compute the reliability of a set of nominal variables.
D) to eliminate items from a Likert scale.
16
Donna is in a two-part experiment. She completed a self-esteem measure one week ago and is now taking the same measure again. Which of the following problems may arise from this process?
A) Systematic error phenomenon
B) Violation of face validity
C) Equivalent form effects
D) Retesting effects
17
Which of the following is true about the difference between random and systematic error?
A) Random error is self-canceling, whereas systematic error tends to increase or decrease the scores on the measured variable.
B) Random error tends to increase or decrease the scores on the dependent variable, whereas systematic error is self-canceling.
C) Both random error and systematic error are self-canceling.
D) Because systematic error is self-canceling, it is less problematic in research than is random error.
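The contrast between the two kinds of error can be sketched with a small simulation: error drawn at random around zero tends to average out across many measurements, while a constant bias moves every score in the same direction. The function name `mean_shift`, the bias of 3 points, and the error SD of 5 are hypothetical values chosen for this example.

```python
import random
from statistics import mean


def mean_shift(true_scores, bias, error_sd, seed=0):
    """Return how far the mean observed score sits from the mean true score."""
    rng = random.Random(seed)
    # Observed score = true score + constant bias + zero-centered random noise.
    observed = [t + bias + rng.gauss(0.0, error_sd) for t in true_scores]
    return mean(observed) - mean(true_scores)


true_scores = [50.0] * 10000
# Random error only: the noise is centered on zero.
random_only = mean_shift(true_scores, bias=0.0, error_sd=5.0)
# Same noise plus a constant 3-point bias added to every score.
with_bias = mean_shift(true_scores, bias=3.0, error_sd=5.0)
print(round(random_only, 3), round(with_bias, 3))
```

With many observations the first value stays close to zero, while the second stays close to the assumed bias of 3.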
18
Errors in recording the scores on a test that occur because the keypuncher is not paying attention are likely to produce
A) spurious relationships.
B) low face validity.
C) random error.
D) systematic error.
19
Cronbach's coefficient alpha is a measure of which of the following?
A) The internal consistency of a scale
B) Test-retest reliability
C) Interrater reliability
D) Construct reliability
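For reference, coefficient alpha is commonly computed from the item variances and the variance of the total scale score. The sketch below uses that standard formula; the function name `cronbach_alpha` and the response data are made up for illustration and are not taken from the deck.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*scores))
    sum_item_variances = sum(variance(item) for item in items)
    # Variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - sum_item_variances / total_variance)


# Hypothetical answers from five respondents to a three-item scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

When the items rise and fall together across respondents, the total-score variance is large relative to the summed item variances and alpha approaches 1.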
20
Cronbach's coefficient alpha is a measure of which of the following?
A) The average validity of an item
B) The correlation between one item and the total scale score
C) The average correlation among the items
D) The face validity of a scale
21
A scientist is interested in developing a scale to measure people's mood states. Which of the following forms of reliability is likely to be most appropriate?
A) Test-retest
B) Equivalent forms
C) Internal consistency
D) Content reliability
22
If the average correlation among the items on an anxiety scale is r = .95, we can say that the scale has
A) high content validity.
B) low internal consistency.
C) high internal consistency.
D) high construct validity.
23
Which of the following is an example of systematic error?
A) The participant misreads the question.
B) The experimenter misprints the question.
C) The participant forgets to answer a question.
D) The participant displays socially desirable responding.
24
Which of the following is a correct statement?
A) A scale must be reliable in order to have construct validity.
B) A scale must have construct validity in order to be reliable.
C) A scale must have face validity in order to be reliable.
D) A scale must be reliable in order to have face validity.
25
The nomological net refers to
A) the reliability of a measure.
B) the correlations among variables that measure different constructs.
C) the content validity of a measure.
D) the network of reliabilities among conceptual variables.
26
A measure of schizophrenia that measures only elements of auditory hallucinations, but not other symptoms of schizophrenia, would be said to have
A) low reliability.
B) high convergent validity.
C) low content validity.
D) low face validity.
27
Which of the following would be likely to improve the reliability and validity of a measure?
A) Using only one measure to assess the conceptual variable
B) Reducing the number of questions on the test
C) Using pilot testing to select good questions
D) Decreasing the variability of measures
28
Reliability is used to assess _____________, whereas validity is used to assess _____________.
A) the consistency of a measure; the accuracy of a measure
B) the accuracy of a measure; the consistency of a measure
C) the generalizability of a measure; the reactance of a measure
D) test reactance of a measure; the generalizability of a measure
29
A researcher who used a self-report measure of job interest to predict performance on a managerial task would be interested in the _____ of her self-report measure.
A) predictive validity
B) construct validity
C) content validity
D) face validity
30
The nomological net refers to
A) the correlations among many different measured variables.
B) the Pearson correlation between two measured variables.
C) the correlation between a conceptual variable and a measured variable.
D) the Pearson correlation between two conceptual variables.
31
Which of the following statements is true?
A) A scientist who is studying construct validity will be more concerned about what conceptual variable is actually being measured than will a scientist who is concerned with criterion validity.
B) A scientist who is studying criterion validity will be more concerned with the conceptual variable being measured than will a scientist who is concerned with construct validity.
C) Construct validity is normally used to make predictions about a behavior that will occur in the future.
D) Criterion validity can be considered to be another form of face validity.
32
Stanley developed a measure of hopefulness that correlates about -.90 with a measure of despair. Stanley should now worry about the __________ of his new measure.
A) discriminant validity
B) face validity
C) content validity
D) predictive validity
33
Of the following, which is the LEAST objective measure of construct validity?
A) Convergent validity
B) Discriminant validity
C) Criterion validity
D) Content validity
34
A test designed to measure baseball performance skills would be said to have high predictive validity if a person who scores low on the test also
A) scores low on a test measuring basketball performance.
B) scores high on a test measuring basketball performance.
C) has a poor batting average over the baseball season.
D) scores high on the same test given two weeks later.
35
The students in Dr. Miller's class are complaining that, although the exam was supposed to cover chapters 1 through 5, the test only covered chapters 1 and 2. Why are the students upset?
A) The test isn't reliable.
B) The test doesn't have face validity.
C) The test doesn't have content validity.
D) The test has incomplete validity.
36
Andrea is given a questionnaire concerning her shopping habits. Two weeks later, she is given a similar questionnaire that also assesses shopping habits, but which contains different questions. The researcher will likely assess which of the following?
A) Interrater reliability
B) Equivalent-forms reliability
C) Face validity
D) Construct validity
37
A semantic differential scale that contains 15 percent random error would be considered
A) reliable.
B) unreliable.
C) valid.
D) invalid.
38
In order to better understand how individuals react to a questionnaire before using the questionnaire in a research project, the researcher may wish to
A) pilot test.
B) compute coefficient alpha.
C) compute kappa.
D) delete all distractor items from the scale.
39
In general, when a measure has good convergent and discriminant validity
A) the correlations that measure convergent validity will be about r = 1.0 and correlations that assess discriminant validity will be about r = 0.00.
B) the correlations will not approach r = 1.0 or r = 0.00, due to random error and the fact that many different conceptual variables are likely to be at least somewhat related to each other.
C) it will not be possible to differentiate the size of the correlations that represent convergent validity from the size of the correlations that represent discriminant validity.
D) the concurrent validity of the relationships will be greater than the criterion validity of the relationships.
40
Discriminant validity refers to
A) the extent to which a measured variable does not correlate with other variables designed to measure different conceptual variables.
B) the extent to which a measured variable correlates with other measured variables designed to assess different conceptual variables.
C) the extent to which a measured variable correlates with another measured variable over time.
D) the extent to which a measured variable discriminates against other measured variables.
41
The major advantage of assessing similar conceptual relationships in many different studies is that
A) it reduces the need for the nomological net.
B) random assignment is less important.
C) a larger p-value can be used.
D) the greater the number of tested and confirmed predicted relationships, the greater the construct validity of the relationships.
42
Hillyer and Joynes (2009) created a new measure of locomotion in rats. Which of the following approaches did they use to determine the reliability of the new measure?
A) Coefficient alpha
B) Split-half reliability
C) Interrater reliability
D) Test-retest reliability
43
In the study by Hillyer and Joynes (2009) on locomotion in rats, the researchers tested for predictive validity by correlating an existing measure (the BBB) and a new measure (the HiJK) with a physiological assessment of mobility. Which of the following was found?
A) The new measure was more sophisticated.
B) The new measure was better able to predict mobility.
C) The HiJK correlated negatively with the existing measure.
D) The HiJK was not reliable.
44
If the average correlation among the items on a scale of self-esteem is r = .15, we can say that the scale has
A) high content validity.
B) low internal consistency.
C) high internal consistency.
D) high construct validity.
45
Individual difference variables that are expected to vary over a period of time are called ________, while variables that are not expected to vary over time are called _________.
A) measured variables; conceptual variables
B) states; traits
C) moving variables; stable variables
D) traits; states