Deck 6: Test Development and Item Analysis
1
The statement of purpose for a test usually includes the variable or trait to be measured and the ___.
A) name of the author
B) intended test length
C) cost of the test
D) target audience
D
2
The first major step in test development is ___.
A) item analysis
B) item preparation
C) defining the test's purpose
D) preliminary design issues
C
3
The text identifies a number of "preliminary design" issues to be considered when building a test. Which of these is NOT one of them?
A) length
B) number of scores
C) item format
D) defining the purpose
D
4
According to the text, the Binet scales, the Wechsler scales, the MMPI, and the SAT all developed in response to ___.
A) political pressures
B) theoretical developments
C) economic needs
D) practical needs
5
According to the text, the Thematic Apperception Test, the Edwards Personal Preference Schedule, and the Primary Mental Abilities Test all developed in response to ___.
A) practical needs
B) theoretical needs
C) economic needs
D) political needs
6
In the language of testing, the question to which an examinee responds is often called the ___.
A) format
B) stimulus
C) norm
D) protocol
7
Most group-administered ability and achievement tests use a _____ response format.
A) true-false
B) multiple-choice
C) fill-in-the-blank
D) essay
8
The Likert format is most often used with _____ measures.
A) vocational
B) intelligence
C) achievement
D) attitude
9
In _____ scoring, the person scoring an essay rates that essay on several different dimensions.
A) automated
B) holistic
C) point system
D) analytic
10
Which of the following is NOT a major advantage of selected-response items?
A) scoring can occur very rapidly
B) can complete more items in a given amount of time
C) scoring requires little judgment
D) opportunity to observe test-taking behavior
11
What general category of test items includes true-false and multiple-choice items?
A) selected-response
B) constructed-response
C) free-response
D) stimulus-response
12
Which of the following is NOT an example of a selected-response item?
A) multiple-choice
B) fill-in-the-blank
C) true-false
D) Likert format
13
Which type of item is used to measure attitudes, with responses to items along a scale of Strongly Agree to Strongly Disagree?
A) Semantic differential
B) Graphic rating
C) Thurstone
D) Likert
14
As used in the world of testing, a portfolio is a ___.
A) collection of a person's work
B) set of art prints
C) group of essays
D) test with little reliability
15
Sentence completions and word association items are examples of what type of test item?
A) selected-response
B) constructed-response
C) limited-response
D) multiple-choice
16
In the early 20th century, multiple-choice (MC) items became preferred to essay items because the MC items ___.
A) were easier to construct
B) yielded better reliability
C) had better validity
D) were easier to norm
17
An instructor scores an essay by assigning a single score representing the overall quality of the essay. This is an application of __________ scoring.
A) analytical
B) holistic
C) objective
D) point-system
18
The use of sophisticated computer programs that simulate the application of human judgment for scoring free-response items is called ___.
A) machine-scoring
B) computer scanning
C) judgmental scoring
D) automated scoring
19
In automated scoring, a computer program attempts to ___.
A) simulate human judgment
B) score tests as rapidly as possible
C) combine results from two or more tests
D) correct errors in multiple-choice responses
20
Which of these is an advantage of selected-response items over constructed-response items? The selected-response items ___.
A) are less expensive to construct
B) are easier to construct
C) have better validity
D) have better scorer reliability
21
One of the advantages of selected-response (SR) items over constructed-response (CR) items is "temporal efficiency." This means ___.
A) an examinee can usually complete more SR items in a given time period
B) an examinee can usually complete fewer SR items in a given time period
C) the SR items are usually easier
D) the SR items are usually more difficult
22
Which is NOT one of the rules for writing selected-response items?
A) use "all of the above" as an option frequently
B) keep the length of options fairly consistent
C) place options in logical order
D) include the central idea in the stem
23
You want your final test to have 50 items. What recommendation does the text make regarding the number of items to be prepared for item tryout?
A) 55-60
B) 75
C) 100-150
D) 500
24
What are the three phases included within the term item analysis?
A) item tryout, statistical analysis, item selection
B) item writing, item tryout, standardization
C) standardization, item selection, reliability analysis
D) statistical analysis, item review, item writing
25
For determining the "high" and "low" groups for purposes of item analysis, which is a commonly used split? Top and bottom ___.
A) 1%
B) 5%
C) 27%
D) 42%
26
The term "item difficulty" refers to ___.
A) percent of examinees who got the item wrong
B) percent of examinees who got the item right
C) reading level of the words in the item stem
D) reading level of the words in the item options
27
In psychometric language, another name for "item difficulty" is ___.
A) ID value
B) D-value
C) p-value
D) PD-index
28
According to our text, if you intend to have 75 items in the final test, you should prepare _____ items for the tryout.
A) 75-100
B) 100-150
C) 150-225
D) 500-750
29
Which of the following is the first phase in item analysis?
A) item selection
B) statistical analysis
C) item discrimination
D) item tryout
30
On a given test item, 75% of the students answered correctly, 20% answered incorrectly, and 5% did not answer the item. This item's p-value is ___.
A) .05
B) .20
C) .25
D) .75
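As a study aid, here is a minimal sketch of how an item p-value can be computed from scored responses. The function name `item_p_value`, the coding of responses (1, 0, None), and the sample data are hypothetical. The sketch assumes that omitted responses stay in the denominator and count as not correct; a testing program may handle omits differently.

```python
from typing import Optional, Sequence


def item_p_value(scores: Sequence[Optional[int]]) -> float:
    """Proportion of examinees who answered the item correctly.

    scores: 1 = correct, 0 = incorrect, None = omitted.
    Assumption: omitted responses remain in the denominator.
    """
    if not scores:
        raise ValueError("need at least one examinee")
    n_correct = sum(1 for s in scores if s == 1)
    return n_correct / len(scores)


# Hypothetical item: 12 correct, 6 incorrect, 2 omitted out of 20 examinees.
responses = [1] * 12 + [0] * 6 + [None] * 2
p = item_p_value(responses)
print(f"p-value = {p:.2f}")
```

A higher p-value means more examinees answered correctly, so the item is easier.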
31
The index of discrimination is often symbolized as ___.
A) D
B) DIF
C) IOD
D) I
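The following sketch shows one common way a discrimination index is computed: the proportion passing the item in a high-scoring group minus the proportion passing in a low-scoring group, with groups formed from total test score. The function name, the default `fraction` of 0.27, and the data are illustrative assumptions, not values taken from the text.

```python
from typing import Sequence


def discrimination_index(item_scores: Sequence[int],
                         total_scores: Sequence[int],
                         fraction: float = 0.27) -> float:
    """Upper-group proportion correct minus lower-group proportion correct."""
    if len(item_scores) != len(total_scores):
        raise ValueError("item_scores and total_scores must align")
    n = len(total_scores)
    k = max(1, round(n * fraction))  # number of examinees in each group
    # Rank examinees by total score, lowest first (ties keep input order).
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data: total scores and one item's 0/1 scores for 10 examinees.
totals = [5, 9, 12, 14, 15, 17, 18, 20, 22, 24]
item = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
d = discrimination_index(item, totals)
print(f"discrimination = {d:.2f}")
```

A negative result would mean that low scorers passed the item more often than high scorers.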
32
Item analysis procedures assume that a test's purpose is to measure _____ among people.
A) similarities
B) demographics
C) differences
D) achievement
33
If you try out 150 items to get the best 20 items for your test using a relatively small sample of cases (N = 30), what process should be used to obtain the true item statistics for the 20 items you have selected?
A) item selection
B) cross-validation
C) Spearman-Brown procedure
D) item analysis
34
Which of these is the most important determinant of a test's reliability?
A) the total number of items on the test
B) the number of people taking the test
C) the validity of the test questions
D) the specific content of the test
35
The average difficulty level of the test is a direct function of the ___.
A) ability of the test-takers
B) size of testing sample
C) number of test items
D) specific item p-values
36
An easy test will have an average p-value that is ___.
A) from .40 - .60
B) high
C) low
D) equal to the SD
37
The value of D can be maximized by having the p-value = _____.
A) 1.00
B) 0.00
C) -1.00
D) .50
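The link between an item's p-value and the largest discrimination value it can reach can be sketched numerically. This example assumes the upper and lower groups are the top and bottom halves of the sample, so the overall p-value is the average of the two group proportions; the function name `max_discrimination` is hypothetical.

```python
def max_discrimination(p: float) -> float:
    """Largest upper-minus-lower difference possible for overall p-value p.

    Assumes a 50/50 split, so p = (p_upper + p_lower) / 2 and both
    group proportions must stay between 0 and 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return 2 * min(p, 1 - p)


for p in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(f"p = {p:.2f}  max D = {max_discrimination(p):.2f}")
```

Under this assumption, very easy and very hard items leave little room for the two groups to differ.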
38
Theta (θ) values usually range from ___.
A) -1.0 to 1.0
B) .0 to 1.0
C) 1.0 to 4.0
D) -4.0 to 4.0
39
An item analysis splits the "Upper" and "Lower" groups into the top and bottom 50% of cases. Here are the item statistics for four items (21, 22, 23, and 24).
Which item shows negative discrimination?
A) 21
B) 22
C) 23
D) 24

40
An item analysis splits the "Upper" and "Lower" groups into the top and bottom 50% of cases. Here are the item statistics for four items (21, 22, 23, and 24).
Which item has the highest discrimination value?
A) 21
B) 22
C) 23
D) 24

41
An item analysis splits the "Upper" and "Lower" groups into the top and bottom 50% of cases. Here are the item statistics for four items (21, 22, 23, and 24).
Which item has the highest p-value?
A) 21
B) 22
C) 23
D) 24

42
An item analysis splits the "Upper" and "Lower" groups into the top and bottom 50% of cases. Here are the item statistics for four items (21, 22, 23, and 24).
Which is the most difficult item?
A) 21
B) 22
C) 23
D) 24

43
In the "internal method" of determining item discrimination values, the groups contrasted as having more or less of the trait are defined in terms of ___.
A) total score on the test
B) clinical judgment
C) score on odd-numbered items
D) some other test
44
In general, test reliability is maximized when item discrimination indexes are ___.
A) high
B) moderate
C) low
D) a mixture of high and low
45
In item response theory, we usually represent the relationship between performance on an item and status on the trait of interest in terms of ___.
A) an item p-value
B) a Pearson correlation
C) an item characteristic curve
D) the reading level of the item
46
In item response theory, performance on an item is defined in terms of ___.
A) the item discrimination index
B) probability of passing
C) negative theta values
D) a correlation coefficient
47
What is the technical term for the steepness of an ICC?
A) theta
B) slope
C) guessing
D) p-value
48
In the three-parameter IRT model, the lower asymptote of the ICC is often called the ___.
A) slope
B) theta
C) guessing parameter
D) difficulty parameter
49
Item characteristic curves usually have the shape of what letter?
A) H
B) L
C) C
D) S
50
Which is the most popular one-parameter IRT model?
A) Likert
B) Rasch
C) Hogan
D) Rupp
51
What appears along the base (i.e., the X-axis or horizontal axis) of the plot of an item characteristic curve?
A) item discrimination
B) slope
C) theta
D) probability of passing
52
The text discusses "technology-based items." How do these items fit into the distinction between SR and CR items?
A) They are mainly SR.
B) They are mainly CR.
C) They share characteristics of SR and CR.
D) They are completely different than both SR and CR.
53
In terms of the number of cases required to obtain useful item statistics, which statement is true about the comparison of IRT and CTT methods?
A) IRT requires more cases
B) CTT requires more cases
C) IRT and CTT require the same number of cases
D) IRT requires more for ability tests, but fewer for personality tests
54
In terms of practical application in the development of contemporary, widely used tests, which statement is true regarding relative use of IRT and CTT methods?
A) IRT is used almost exclusively
B) CTT is used almost exclusively
C) IRT and CTT are both used routinely
D) IRT is used for personality tests, CTT for ability tests
55
When factor analysis is used as an item analysis technique, the test developer usually selects for inclusion in a scale or test those items with ___.
A) low loadings
B) moderate loadings
C) high loadings
D) negative loadings
56
What term does the field use for test development procedures that attempt to minimize construct-irrelevant variance, especially for different subgroups of examinees?
A) CIV design
B) for-all design
C) universal design
D) subgroup-free design
57
The term "universal design" indicates test development efforts that attempt to minimize ___.
A) testing time
B) construct irrelevant variance
C) construct under-representation
D) a need for norms
58
The item characteristic curve (ICC) relates status on the trait underlying the scale (defined in terms of θ) to ___.
A) the p-value of the item
B) the D value of the item
C) overall ability of the sample
D) probability of passing the item
59
In a three-parameter model, the ICC is determined by the ___.
A) item difficulty, validity, and guessing parameters
B) item discrimination, reliability, and difficulty parameters
C) item difficulty, discrimination, and guessing parameters
D) item discrimination, validity, and reliability parameters
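For reference, the standard three-parameter logistic form of an item characteristic curve can be sketched as below. The parameter names `a`, `b`, and `c` follow common IRT notation rather than this deck, the item values are hypothetical, and the optional scaling constant (often 1.7) used in some formulations is omitted.

```python
import math


def icc_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of passing an item under the three-parameter logistic model.

    a: slope (steepness of the curve)
    b: location of the curve on the theta scale
    c: lower asymptote of the curve
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item parameters.
a, b, c = 1.2, 0.5, 0.2
for theta in (-4.0, -2.0, 0.5, 2.0, 4.0):
    print(f"theta = {theta:+.1f}  P(pass) = {icc_3pl(theta, a, b, c):.3f}")
```

Plotting these probabilities against theta gives the familiar curve: it flattens near `c` at low theta, rises most steeply around `b`, and approaches 1.0 at high theta.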
60
Which is a crucial preliminary step in developing a test for computer-adaptive testing?
A) Have good norms.
B) Determine test-retest reliability.
C) Have good score reports.
D) Determine item statistics.
61
You are taking a test that uses computer-adaptive testing procedures. You answer item 23 correctly. What is likely to be true about item 24?
A) It will be easier than item 23.
B) It will be harder than item 23.
C) It will have about the same difficulty as item 23.
D) It will have the same kind of item stem as item 23.
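The basic logic of computer-adaptive item selection can be illustrated with a deliberately simplified sketch. It assumes an item bank whose difficulties are already known, picks the unused item closest to the current ability estimate, and moves the estimate by a fixed step after each response. Operational programs typically use maximum-likelihood or Bayesian estimation instead; the function names, step size, and item bank here are all hypothetical.

```python
from typing import Dict, Set


def next_item(theta: float, difficulties: Dict[str, float], used: Set[str]) -> str:
    """Pick the unused item whose difficulty is closest to the current estimate."""
    candidates = {k: b for k, b in difficulties.items() if k not in used}
    if not candidates:
        raise ValueError("item bank exhausted")
    return min(candidates, key=lambda k: abs(candidates[k] - theta))


def update_theta(theta: float, correct: bool, step: float = 0.5) -> float:
    """Toy update rule: raise the estimate after a pass, lower it after a miss."""
    return theta + step if correct else theta - step


# Hypothetical bank: item id -> previously estimated difficulty.
bank = {"i1": -1.0, "i2": -0.5, "i3": 0.0, "i4": 0.5, "i5": 1.0}
theta = 0.0
used: Set[str] = set()

first = next_item(theta, bank, used)
used.add(first)
theta = update_theta(theta, correct=True)
second = next_item(theta, bank, used)
print(first, bank[first], "->", second, bank[second])
```

The sketch depends entirely on having item difficulties in hand before testing begins.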
62
What is the relationship between the mean score on a test and the item p-values? The mean score is ___.
A) independent of the p-values
B) the sum of the p-values
C) the average p-value minus the average discrimination value
D) the average p-value plus the average discrimination value
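The arithmetic connecting item p-values to the mean total score can be checked on a small example. This sketch assumes every item is scored 0 or 1 and every examinee has a score on every item; the response matrix is hypothetical.

```python
# Rows are examinees, columns are items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]

n_examinees = len(scores)
n_items = len(scores[0])

# p-value of each item: proportion of examinees answering it correctly.
p_values = [sum(row[j] for row in scores) / n_examinees for j in range(n_items)]

# Mean of the examinees' total scores.
mean_total = sum(sum(row) for row in scores) / n_examinees

print("p-values:", p_values)
print("sum of p-values:", round(sum(p_values), 2))
print("mean total score:", round(mean_total, 2))
```

Both quantities count the same correct responses and divide by the same number of examinees, which is why they agree.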
63
If you want a test that has the best discrimination at the lower end of the distribution of scores on the test, you should select items with ___.
A) high p-values
B) moderate p-values
C) low p-values
D) a mixture of very high and very low p-values
64
As a general rule, when selecting items for a final test based on item analysis statistics, we want items with _________ discrimination indices.
A) high
B) moderate
C) low
D) negative
65
As a general rule, the item discrimination index can take on its maximum value when the item p-value is ___.
A) high
B) moderate
C) low
D) negative
66
As a general rule, test reliability will be maximized when item p-values are ___.
A) high
B) moderate
C) low
D) negative
67
Which type of reliability is most easily determined as part of the standardization program for a test?
A) internal consistency
B) test-retest
C) alternate form
D) inter-scorer
68
What do we call the data collection program that yields the norms for a test?
A) standardization program
B) equating program
C) item analysis program
D) classical test theory (CTT) program