Deck 5: What Is Test Reliability/Precision?
1
When a test developer gives the same test to the same group of test takers on two different occasions, the developer is gathering evidence of ______.
A) internal consistency
B) internal reliability
C) test-retest reliability
D) scorer reliability
C
2
What do we call changes in test scores resulting from the sequence in which the tests were taken?
A) practice effects
B) order effects
C) fatigue effects
D) alternate effects
B
3
What is Dr. Jonah using when she gives students a math achievement test today and a different version of the same test in 2 months?
A) supplemental forms
B) complementary forms
C) alternate forms
D) nonparallel forms
C
4
When using the split halves method of estimating reliability/precision, which one of the following requires adjustment to compensate for splitting the test into halves?
A) assignment of the test questions
B) correlation method
C) the reliability coefficient
D) the test scores
5
What is the term used to describe the consistency of test scores?
A) validity
B) reliability/precision
C) distribution
D) standard deviation
6
Which one of the following pairs of math questions would be most likely to produce an internally consistent result?
A) 8 − 10 = ? and 500 + 224 = ?
B) 22 × 48 = ? and 48 + 22 = ?
C) (−8) − (+10) = ? and 8 × 10 = ?
D) 8 × 10 = ? and 10 × 8 = ?
7
As the interval between administrations lengthens, test-retest reliability will most likely ______.
A) increase
B) decrease
C) vary unpredictably
D) remain unchanged
8
What do we mean when we say that a test is internally consistent?
A) a test taker only takes the test once
B) a test taker's scores will remain similar over time
C) the test questions are measuring a similar concept
D) the scores of a group of test takers will be very similar
9
What is the formula that Cronbach proposed for calculating internal consistency for questions that have more than two possible responses called?
A) coefficient alpha
B) KR-20
C) product moment correlation
D) Spearman Brown
10
Which one of the following is used for calculating internal consistency for tests whose questions can be scored as either right or wrong?
A) split halves method
B) test-retest method
C) coefficient alpha
D) KR-20 formula
11
Which one of the following methods of estimating reliability/precision requires dividing the test into halves and then correlating the set of individual test scores on the first half with the set of individual test scores on the second half?
A) test-retest method
B) coefficient alpha
C) split-half method
D) correlation method
12
What do we call two forms of a test that are comparable in every way?
A) identical forms
B) similar forms
C) parallel forms
D) supplementary forms
13
Evidence of reliability/precision indicates that ______.
A) test scores will likely be consistent across repeated measurements
B) the test taker is properly administering and using the test
C) the test is measuring what it is designed to measure
D) a test possesses an important but not essential property
14
Since the PAI requires test takers to provide ratings on a response scale that has four options ("False, Not at all true"; "Slightly true"; "Mainly true"; and "Very true"), the appropriate formula for estimating internal consistency is ______.
A) Spearman Brown
B) KR-20
C) product moment correlation
D) coefficient alpha
15
Estimating reliability/precision using methods of internal consistency is appropriate only for tests that are ______.
A) heterogeneous
B) homogeneous
C) unstandardized
D) standardized
16
When researchers want to measure test-retest reliability/precision, they must assume test takers' ability will ______.
A) not change between the first administration and the second administration
B) increase between the first administration and the second administration
C) decrease between the first administration and the second administration
D) be affected by the order they took the tests
17
When tests are heterogeneous, estimates of internal consistency are likely to be ______.
A) low
B) high
C) comparable
D) homogeneous
18
An employment test for the job of sales manager that measures knowledge of sales theory, interpersonal skills, and ability to use text messaging is ______.
A) homogeneous
B) heterogeneous
C) reliable
D) generalizable
19
What is the best way to divide the test when using the split-half method?
A) put the first questions in one half and the last questions in the other half
B) put all multiple choice questions in one half and all essay questions in the other half
C) randomly assign each question to the first half or the second half
D) assign odd-numbered questions to one half and even-numbered questions to the other
20
Which one of the following is required to yield an accurate estimate of reliability/precision using the split halves method?
A) the two halves must be equivalent in length and content
B) the scores on each half must be equivalent
C) the first half must contain more questions than the second half
D) the estimate must be calculated using the coefficient alpha
21
Which one of the following formulas is used to adjust the reliability coefficient when it is computed from two halves of one test?
A) Spearman Brown
B) coefficient alpha
C) Pearson product moment correlation
D) KR-20
22
Which one of the following is used to provide an index of the strength and direction of the linear relationship between the two sets of scores?
A) coefficient alpha
B) correlation
C) KR-20
D) Spearman Brown
23
What is the unexplained difference between the true score (T) and the obtained score (X) called?
A) systematic error
B) random error
C) internal consistency
D) coefficient alpha
24
The Spearman Brown formula is used when calculating ______.
A) only internal reliability
B) only split halves reliability
C) only scorer reliability
D) all forms of reliability
25
Jonathan wanted to estimate the internal consistency of a multiple choice test. Which one of the following would be most appropriate for him to use?
A) Spearman Brown
B) coefficient alpha
C) Cohen's kappa
D) KR-20
26
Judy wants to estimate the internal consistency of a survey on which the respondent marks (1) Not at all, (2) Sometimes, (3) Most of the time, or (4) Always. Which one of the following is most appropriate for her to use?
A) Spearman Brown
B) coefficient alpha
C) Cohen's kappa
D) KR-20
27
Which one of the following statements is TRUE about error?
A) Systematic error increases the reliability of a test.
B) Systematic error lowers the reliability of a test.
C) Random error lowers the reliability of a test.
D) Both random and systematic error lower the reliability of a test.
28
Theoretically, if a test taker took a test an infinite number of times and an average score was calculated from those administrations, the average test score would ______.
A) equal the true test score
B) contain the true score and error
C) contain only error
D) have no meaning
29
Cohen's kappa is a statistical method for ______.
A) estimating test-retest reliability
B) estimating interrater agreement
C) estimating internal consistency
D) correlating ratings by two judges
30
Which one of the following statements regarding test reliability is TRUE?
A) Researchers must choose one method of estimating the reliability/precision of test scores.
B) Researchers can identify the causes of random error.
C) No measurement instrument is perfectly reliable or consistent.
D) A measurement instrument should be perfectly reliable or consistent.
31
What do we call the statistic that reflects the amount of inconsistency or error expected in an individual's test score?
A) reliability coefficient
B) standard deviation
C) standard error of measurement
D) correlation coefficient
32
Julio wants to calculate the interrater agreement for an essay exam that is scored by giving each essay a pass (2) or fail (1) mark. Which one of the following would be most appropriate for Julio to use?
A) Cohen's kappa
B) coefficient alpha
C) KR-20
D) correlation
33
Scorer reliability and agreement concern the consistency of what?
A) test scores
B) scorer judgments
C) different test forms
D) test takers' performance
34
What do we call a single source of error that always increases or decreases the true score by the same amount?
A) the true score
B) the average score
C) random error
D) systematic error
35
Chris wants to calculate the interscorer reliability for an essay test that had two scorers. Which one of the following would be most appropriate for Chris to use?
A) Cohen's kappa
B) coefficient alpha
C) KR-20
D) correlation
36
As the reliability/precision of a test decreases, which one of the following items increases?
A) standard deviation
B) standard error of measurement
C) internal consistency
D) difficulty
37
The range of scores that we feel confident includes the true score is called a ______.
A) standard deviation
B) standard error of measurement
C) confidence interval
D) normal curve
38
Which one of the following is most helpful to test developers who wish to increase the reliability/precision of a test's scores?
A) Spearman Brown formula
B) coefficient alpha
C) correlation
D) Cohen's kappa
39
If a meteorologist uses a thermometer that always reads 1 degree higher than the actual temperature, then the error that results is ______. If a meteorologist is nearsighted and he reads the thermometer with a different amount and direction of inaccuracy each time, the error that results will be ______.
A) reliable; unreliable
B) unreliable; reliable
C) random; systematic
D) systematic; random
40
What is the amount of consistency among scorers' judgments called?
A) test-retest agreement
B) interscorer agreement
C) intrascorer agreement
D) internal agreement
41
According to classical test theory, what would the reliability coefficient be when the variance of observed scores is equal to the variance of true scores?
A) 0
B) 0.5
C) 0.75
D) 1.0
42
When a test score is used for selection or classification of individuals, it is advisable to calculate the standard error of measurement at the ______.
A) score used to make the classification decision
B) average score on the test
C) highest score on the test
D) lowest score on the test
43
Discuss Cohen's kappa. How is it used by test developers or researchers? Give an example.
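One way to make the requested example concrete is a minimal sketch, assuming two hypothetical scorers who each mark the same ten essays pass (2) or fail (1). The function name `cohens_kappa` and the ratings are illustrative assumptions, not part of the deck. Kappa compares the observed proportion of agreement with the agreement expected by chance from each scorer's marginal proportions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items both raters placed in the same category.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical pass (2) / fail (1) marks from two scorers on ten essays.
scorer_1 = [2, 2, 1, 2, 1, 1, 2, 2, 1, 2]
scorer_2 = [2, 2, 1, 1, 1, 1, 2, 2, 2, 2]
kappa = cohens_kappa(scorer_1, scorer_2)
```

With these made-up marks the scorers agree on 8 of 10 essays (0.80), chance agreement is 0.52, and kappa comes out to about 0.58.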
44
Effective test administration is likely to ______.
A) increase error and lower test reliability
B) increase error and raise test reliability
C) decrease error and lower test reliability
D) decrease error and raise test reliability
45
Rita scored 96 on an employment test, and Naomi scored 98 on the same test. Naomi believes that she has the highest score, but Rita disagrees. Which of the following would be most helpful in determining whether Naomi's score is statistically higher than Rita's score?
A) the standard deviation and mean of the test scores
B) the standard error of measurement of the test scores
C) the reliability coefficient of the test scores
D) the coefficient of determination of the test scores
46
Define reliability and describe three methods for estimating the reliability of a psychological test and its scores.
47
According to classical test theory, what will the reliability coefficient be when the observed score variance is greater than true score variance?
A) 0
B) less than 1.0
C) exactly 1.0
D) over 1.0
48
How do researchers and test developers identify systematic error in test scores?
A) using analysis of variance
B) using correlation
C) constructing a confidence interval
D) calculating the standard error of measurement
49
Which one of the following is most likely to decrease the reliability of a test?
A) poorly written questions
B) unprepared test takers
C) too many test questions
D) overlapping confidence intervals
50
A nonparametric index for scorer agreement when the scores are nominal or ordinal is provided by ______.
A) coefficient alpha
B) KR-20
C) Cohen's kappa
D) Spearman Brown
51
Which one of the following proposes separating sources of systematic error from random error in order to eliminate systematic error?
A) classical test theory
B) generalizability theory
C) reliability theory
D) theory of the normal curve
52
Ivan conducted a reliability study for a test of computer skills. His results were a coefficient alpha of .95 and a test-retest r of .85 (4 weeks). Interpret these results, and explain why the two reliability coefficients might differ.
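For readers who want to see where a coefficient alpha value comes from, here is a minimal sketch of the usual computation: the number of items k, the sum of the item variances, and the variance of the total scores. The function name `coefficient_alpha` and the ratings are hypothetical and are not Ivan's data.

```python
from statistics import pvariance


def coefficient_alpha(item_scores):
    """Coefficient alpha from a list of rows (one row of item scores per test taker)."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across test takers.
    item_variances = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    # Variance of the total (summed) scores across test takers.
    total_variance = pvariance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: five test takers, four items on a 1-4 scale.
ratings = [
    [4, 3, 4, 4],
    [3, 3, 2, 3],
    [2, 2, 2, 1],
    [1, 2, 1, 1],
    [3, 4, 3, 3],
]
alpha = coefficient_alpha(ratings)
```

Alpha is computed from a single administration, so it reflects only how well the items hang together. A test-retest coefficient also picks up changes over the interval between administrations, which is one reason the two estimates can differ.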
53
What could a test developer do to increase internal consistency of a test?
A) accurately measure the test-retest reliability of the test
B) add well-written questions to each test form
C) add questions that measure the same concept
D) adjust the reliability coefficient using the KR-20 formula
54
Describe how to gather evidence of reliability using the split halves method. Start your discussion at the time the test is administered and continue until the final numerical value is calculated and interpreted.
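The sequence this question asks about can be sketched in code under some stated assumptions: an odd-even split, a Pearson correlation between the half scores, and the Spearman-Brown adjustment to estimate reliability at full test length. The function names and the right/wrong item data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Adjust a half-test correlation to estimate full-length reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown adjusted estimate)."""
    # Odd-numbered items (1st, 3rd, ...) form one half; even-numbered items the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical data: five test takers, six items scored right (1) or wrong (0).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(responses)
```

The adjusted value is higher than the half-test correlation because each half is only half as long as the test actually administered.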
55
Explain what the standard error of measurement is and how to use it to construct a confidence interval around an observed score.
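A minimal numeric sketch of the calculation, assuming the common formula SEM = SD × sqrt(1 − reliability) and a 95% interval of the observed score plus or minus 1.96 SEM. The standard deviation, reliability coefficient, and observed score used here are hypothetical values chosen for illustration.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM from the test's standard deviation and its reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def confidence_interval(observed_score, sem, z=1.96):
    """Interval around an observed score; z=1.96 corresponds to 95% confidence."""
    margin = z * sem
    return observed_score - margin, observed_score + margin


# Hypothetical values: SD of 10, reliability of .91, observed score of 96.
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
lower, upper = confidence_interval(observed_score=96.0, sem=sem)
```

With these assumed values the SEM is about 3.0 and the interval runs from roughly 90.1 to 101.9. A lower reliability coefficient would give a larger SEM and a wider interval.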
56
Identify and discuss the theory that explains why an observed test score is made up of the "true score" and "random error." Give examples.
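As one possible illustration, the idea that an observed score X equals a true score T plus random error E can be simulated. This is a sketch only: the true score of 50, the normally distributed error with a standard deviation of 5, and the function name are all assumptions made for the example.

```python
import random


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Generate observed scores X = T + E, with E drawn as zero-mean random error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]


# Hypothetical test taker with a true score of 50, tested 10,000 times.
observed = simulate_observed_scores(true_score=50.0, error_sd=5.0, n_administrations=10_000)
mean_observed = sum(observed) / len(observed)
```

Individual observed scores scatter around 50, but because the random errors tend to cancel, their average moves toward the true score as the number of administrations grows.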