Deck 5: Technical Adequacy: Reliability and Validity
1
To evaluate the content validity of a portfolio assessment, it should be determined that the student's work that is included in the portfolio represents
A) the best work that the student has done in the domain.
B) all important dimensions within the domain.
C) all of the student's work during the previous year.
D) just those areas where the student continues to have difficulty.
B
2
Which of the following statements concerning test validity is most accurate?
A) A test cannot be valid unless it is reliable.
B) A test cannot be reliable unless it is valid.
C) A test cannot be standardized unless it is valid.
D) A test cannot be reliable unless it is standardized.
A
3
The extent to which a person's score on a test is related to performance on a criterion measure is best described as evidence based on
A) test content.
B) test structure.
C) relations to other variables.
D) response processes.
C
4
When different groups of test takers consistently experience disparate levels of success on specific items, there is a problem in
A) differential item effectiveness.
B) group selection.
C) administration errors.
D) reliability.
5
If a test measures something consistently but does not measure what it was designed to measure, then the test is
A) reliable but not valid.
B) reliable but not standardized.
C) standardized but not reliable.
D) valid but not reliable.
6
The higher the reliability coefficient, the lower the
A) coefficient of regression.
B) standard error of measurement.
C) validity.
D) standard deviation.
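The relationship this item asks about can be explored numerically. The following is a minimal sketch, assuming the classical test theory formula SEM = SD * sqrt(1 - r); the function name and the standard deviation of 15 are illustrative assumptions, not values from the deck.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical SEM: sd * sqrt(1 - reliability)."""
    # A reliability coefficient outside [0, 1] signals a computation error.
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


if __name__ == "__main__":
    # Hypothetical test with SD = 15; watch how SEM moves as reliability rises.
    for r in (0.70, 0.80, 0.90, 0.97):
        print(f"r = {r:.2f} -> SEM = {standard_error_of_measurement(15.0, r):.2f}")
```

The range check also shows why a reported coefficient above 1.00 cannot be a legitimate reliability estimate.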
7
In the context of assessment, "enabling behaviors" are those behaviors that
A) help the tester attract the subject's attention.
B) are extraneous to the requirements of the test situation.
C) focus the assessment on qualitative analyses of performance.
D) are required by the assessment to demonstrate the target knowledge.
8
Because a person's true abilities can change between two administrations of a test, it is generally true that
A) test-retest procedures cannot produce good reliability estimates.
B) the shorter the time between the two administrations, the higher the reliability.
C) the length of time between two administrations has little effect on reliability.
D) a test developer needs to calculate coefficient alpha to estimate stability.
9
An individual reported a reliability coefficient of 1.25 for an intelligence test. It was obtained by correlating the results of a given group on Form A with the group's results on Form B. This coefficient indicates that
A) the test is unusually reliable.
B) the test is unusually valid.
C) there are no errors of measurement.
D) a mistake was made in computing the coefficient.
10
Coefficient alpha is most linked to
A) test-retest reliability.
B) percentage of agreement.
C) stability.
D) internal consistency.
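For readers who want to see how coefficient alpha is computed, here is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the small score matrix are hypothetical illustration data.

```python
from statistics import variance


def coefficient_alpha(scores: list[list[float]]) -> float:
    """Cronbach's coefficient alpha.

    scores: one row per examinee, one column per item.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across examinees.
    item_variance_sum = sum(variance(col) for col in zip(*scores))
    # Variance of each examinee's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Hypothetical data: 4 examinees, 3 items scored 0-5.
    data = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 1]]
    print(f"alpha = {coefficient_alpha(data):.3f}")
```

Alpha is computed from a single administration, which is why it speaks to how the items hang together rather than to consistency across occasions.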
11
To determine the stability of a test, the recommended interval between administrations of the test is __________.
A) 2 days
B) 2 weeks
C) 2 months
D) 2 years
12
A statistic that enables an examiner to establish confidence for the true scores of examinees is the
A) Kuder-Richardson predictive index.
B) validity coefficient.
C) standard error of measurement.
D) mode.
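A confidence band around a score can be sketched as follows. This is an illustration only, assuming normally distributed measurement error and an interval centered on the obtained score (some texts center it on the estimated true score instead); the function name and the example numbers are hypothetical.

```python
from statistics import NormalDist


def true_score_interval(obtained: float, sem: float, confidence: float = 0.95):
    """Return (lower, upper) bounds of a symmetric confidence interval."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if sem < 0:
        raise ValueError("sem must be non-negative")
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sem
    return obtained - half_width, obtained + half_width


if __name__ == "__main__":
    # Hypothetical obtained score of 100 on a test with SEM = 3.
    low, high = true_score_interval(100.0, 3.0, 0.95)
    print(f"95% interval: {low:.1f} to {high:.1f}")
```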
13
Kiana is going to evaluate the concurrent criterion-related validity of a self-report assessment of classroom problem behaviors. The most appropriate criterion measure would be
A) a test of intelligence.
B) classroom observation.
C) grades in math.
D) court records.
14
T-scores for student X on an achievement test battery standardized on the same population are spelling 35, math 62, social studies 50, and English grammar 52. Each test has a SEM of 2; the tests are not intercorrelated. We conclude that
A) X is strongest in spelling.
B) X is strongest in math.
C) there are no substantial differences in X's achievements in these four areas.
D) it is not possible to compare X's performance on these subtests.
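One way to reason about this item is to express each pairwise score gap in units of the standard error of the difference. The sketch below assumes the common classical approximation SE_diff = sqrt(SEM_a^2 + SEM_b^2) for scores with independent errors; the helper name is hypothetical, and the scores and SEM are taken from the item.

```python
import math
from itertools import combinations


def se_of_difference(sem_a: float, sem_b: float) -> float:
    """Standard error of the difference between two obtained scores,
    assuming the two measurement errors are independent."""
    return math.sqrt(sem_a ** 2 + sem_b ** 2)


if __name__ == "__main__":
    t_scores = {
        "spelling": 35,
        "math": 62,
        "social studies": 50,
        "English grammar": 52,
    }
    sem = 2.0  # same SEM for every subtest, as stated in the item
    se_diff = se_of_difference(sem, sem)
    for (name_a, a), (name_b, b) in combinations(t_scores.items(), 2):
        gap = abs(a - b)
        print(f"{name_a} vs {name_b}: gap {gap}, {gap / se_diff:.1f} SE units")
```

Gaps of only one SE unit or less are hard to distinguish from measurement error, while gaps of several SE units are not.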
15
A stability coefficient is used for measuring the reliability of
A) a test administered at two different times.
B) the first 50 items, compared with the last 50 items in a 100-item test.
C) standard error of measurement.
D) alternate forms of a test.
16
The results of an achievement test are considered to be invalid if
A) reliability is less than 0.95.
B) the teacher has not taught the content being tested.
C) the student did not listen when the subject matter was taught.
D) validity is less than 0.95.
17
The reliability of a test refers to its relative
A) validity.
B) power.
C) consistency.
D) inappropriateness.
18
Method of measurement, enabling behaviors, and administrative errors are all considered to be
A) types of reliability.
B) signs of validity.
C) sources of systematic bias.
D) test development problems.
19
The means for both Test A and Test B are 50. A 50% confidence interval for a score at the mean is 45-55 for Test A and 42-58 for Test B. Which of the following statements is true?
A) Test A is more reliable than Test B.
B) Test B is more reliable than Test A.
C) Test A has a larger SEM than Test B.
D) Test B has a larger SEM than Test A.
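The link between interval width and SEM can be checked by working backwards from an interval. This is a minimal sketch, assuming normally distributed error and a symmetric interval; `sem_from_interval` is a hypothetical helper, the 42-58 interval comes from the item, and the 46-54 interval is made-up comparison data.

```python
from statistics import NormalDist


def sem_from_interval(lower: float, upper: float, confidence: float) -> float:
    """Recover the SEM implied by a symmetric confidence interval."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    # Two-sided z value, roughly 0.674 for a 50% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return (upper - lower) / (2.0 * z)


if __name__ == "__main__":
    wide = sem_from_interval(42, 58, 0.50)    # interval given for Test B
    narrow = sem_from_interval(46, 54, 0.50)  # hypothetical narrower interval
    print(f"wide interval SEM: {wide:.2f}, narrow interval SEM: {narrow:.2f}")
```

At a fixed confidence level, the implied SEM scales directly with the width of the interval.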
20
The Dairy County School District appropriately uses a test that has a reliability of 0.89 to
A) place children in special education if they earn a score below an established criterion.
B) move children to another school building to receive services for gifted children if they score above a certain point.
C) decide whether students should be placed in the Rainbow reading group or the Rainstorm reading group.
D) decide to conduct further assessment procedures.
21
The absence of __________ required for performance on a test invalidates the test results.
22
For individual test data, where a test score is used to make a tracking or placement decision for an individual student, the recommended level of required reliability is __________.
23
For group test data that are used for administrative purposes and reported only by group, the recommended level of required reliability is __________.
24
If one wants to generalize to different times, one should examine the test's __________.
25
Failure to administer a test according to standardized procedures is considered
A) appropriate if the subject is young.
B) a form of rapport building that may be necessary.
C) a source of systematic bias.
D) a random error that varies from one subject to the next.
26
A test has a norm sample that is not representative of the population. Inferences made on the basis of a student's performance on this test are
A) likely to indicate lower performance than the true score.
B) invalid.
C) unreliable.
D) considered to be reasonable for qualitative comparisons.
27
If one wants to generalize to different item samples, one should examine the test's __________.
28
Both unreliability (unsystematic error) and systematic error (bias) threaten __________.
29
The most likely explanation for items having __________ for different groups of people is differential exposure to test content.
30
The validity of a particular test can never exceed the __________ of that test.
31
When we test, we are interested in __________ what we see today under one set of conditions to other occasions.
32
An estimate of the likelihood that a person's true score may be found within a range of scores is provided by the __________.
33
The completeness of the item sample is one of the factors to consider in determining __________ validity.
34
For individual test data, where a test score is used to make a screening decision, the recommended level of required reliability is __________.
35
In order for evidence of high concurrent validity to be meaningful, the criterion measures must be __________.
36
A reliability coefficient of 1.00 indicates __________ reliability.
37
Criteria for how high a test's reliability must be are determined in part by the specific __________ of assessment.
38
Validity evidence based on _________ ___________ reflects the extent to which a test's items represent the domain or universe to be measured.
39
A method of estimating the reliability of a test that does not have two forms is to calculate the __________.
40
A test with a reliability coefficient of .97 has relatively little __________.
41
The test manual for the Culture-Fair Intelligence Test reports correlations with the Stanford-Binet and the Goodenough-Harris tests. What type of validity is the author trying to demonstrate?
42
Dr. Qubert has developed a test for which there is not an adequate criterion measure or construct with which to evaluate validity. She therefore decides to present complete content validity data. What three factors must she consider when determining content validity?
43
To the extent that a norm sample is systematically unrepresentative, the inferences based on such scores are incorrect and __________.
44
Sixty percent of a test's variance is caused by the variance of true scores, whereas 40% of the variance is caused by error. What is the test's reliability?
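In classical test theory, reliability is defined as the proportion of observed-score variance attributable to true-score variance. A minimal sketch of that ratio, using the percentages from the item; the function name is a hypothetical choice for illustration.

```python
def reliability_from_variance(true_variance: float, error_variance: float) -> float:
    """Reliability = true-score variance / (true-score + error variance)."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances must be non-negative")
    total = true_variance + error_variance
    if total == 0:
        raise ValueError("total variance must be positive")
    return true_variance / total


if __name__ == "__main__":
    # Variance shares stated in the item: 60% true score, 40% error.
    print(reliability_from_variance(60.0, 40.0))
```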
45
Test results would be of little value if we were unable to generalize what was observed in one situation to other situations. Identify and discuss three types of generalizations that can be made from reliable test results.
46
Joseph was tested on an instrument for which the SEM was relatively high. How sure can we be of Joseph's score?
47
Explain in your own words the relationship between reliability and validity.
48
Validity evidence based on the consequences of testing is a concept adopted by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. However, it has not been widely accepted in education. Discuss the reason evidence based on the consequences of testing has not been accepted in education.
49
Compare and contrast the two major approaches to estimating the extent to which we can generalize from different samples of items.
50
Unless a test is administered according to the __________, the results are invalid.
51
Annette was tested on an instrument for which the SEM was quite small. How sure can we be of Annette's score?