Exam 13: Classical Test Theory Item Analysis
The item discrimination index that correlates a test taker's response to a given test item (i.e., correct or incorrect) with their overall test score, while correcting for the fact that you are artificially dichotomizing performance on the item, is known as a ________ correlation coefficient.
A
Briefly describe the advantage of first examining the item difficulty statistic.
Examining the p-values first gives the test user a sense of how respondents are performing on each item on the assessment. Items with very high (>90%) or very low (<10%) p-values are of less value in distinguishing among test takers.
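As a rough illustration of the answer above, the following Python sketch computes item p-values from scored responses and flags items outside the 10% to 90% band. The function names, the 0/1 scoring layout, and the sample data are hypothetical and are not taken from Shultz, Whitney, and Zickar (2021).

```python
def item_p_values(responses):
    """Return the proportion of test takers answering each item correctly.

    `responses` is a list of rows, one per test taker, where each entry is
    1 (correct) or 0 (incorrect). This layout is an assumption for the sketch.
    """
    n_takers = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_takers for j in range(n_items)]


def flag_extreme_items(p_values, low=0.10, high=0.90):
    """Return indexes of items whose p-value falls below `low` or above `high`."""
    return [j for j, p in enumerate(p_values) if p < low or p > high]


# Made-up scored responses: 5 test takers (rows) by 4 items (columns).
scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
]

p_values = item_p_values(scores)
flagged = flag_extreme_items(p_values)
print(p_values)  # item difficulty (proportion correct) for each item
print(flagged)   # indexes of items that do little to separate test takers
```

In this made-up data set every test taker answers the first item correctly, so it is the one flagged as contributing little to distinguishing among respondents.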
Shultz, Whitney, and Zickar (2021) note that, "Thus, every attempt should be made to revise an item before it is tossed out". Briefly describe some examples of appropriate ways to revise a test item to improve its performance in future administrations.
The item analysis statistics should provide clues as to how the item is performing. For example, in addition to examining the p-value for the correct answer, also examine the p-values for the distractor responses. Even if the p-value for the correct answer is a respectable 70%, if no respondent chose distractor option A, then that distractor should be revised. In other cases, examining the biserial item-total correlation may indicate that a distractor is much too popular and, as a result, was a primary reason that the item-total correlation was negative. Thus, either the troublesome distractor or the test item itself may need to be revised.
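A minimal Python sketch of this kind of distractor check, for a single multiple-choice item, might look like the following. The helper names, the answer key, and the data are all hypothetical. The mean total score per option is used here as a simple stand-in for the option-level biserial correlations mentioned above, not as the textbook's procedure.

```python
from collections import Counter


def option_proportions(choices, options):
    """Return the proportion of test takers selecting each response option."""
    counts = Counter(choices)
    n = len(choices)
    return {opt: counts.get(opt, 0) / n for opt in options}


def mean_total_by_option(choices, totals, options):
    """Return the mean total test score of those choosing each option.

    An option nobody chose maps to None.
    """
    result = {}
    for opt in options:
        chosen = [t for c, t in zip(choices, totals) if c == opt]
        result[opt] = sum(chosen) / len(chosen) if chosen else None
    return result


# Made-up data: the option each of 10 test takers chose on one item,
# paired with that person's total test score.
choices = ["B", "B", "C", "B", "D", "B", "C", "B", "B", "B"]
totals = [18, 16, 9, 17, 11, 15, 8, 19, 14, 13]
options = ["A", "B", "C", "D"]
key = "B"  # assumed correct answer

props = option_proportions(choices, options)
unused = [opt for opt in options if opt != key and props[opt] == 0]
means = mean_total_by_option(choices, totals, options)

print(props)   # p-value for the key and for each distractor
print(unused)  # distractors nobody selected; candidates for revision
print(means)   # a distractor with a high mean attracts strong test takers
```

Here the keyed answer has a p-value of 70%, yet distractor A was never chosen, which mirrors the situation described above. A distractor whose choosers have a higher mean total score than those choosing the key would be a likely source of a negative item-total correlation.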
From a purely psychometric standpoint of wanting to maximize variability and reliability, the ideal average p-value across items on a norm-referenced test would be _______.
Most item discrimination indexes correlate how a test taker responds to a given item with their ________.
Under Classical Test Theory (CTT) item analysis, the p-value for a given item is most likely to fluctuate based on ______.
One major drawback when using item-criterion data to construct a test is that the test tends to ___________________.
Briefly describe and distinguish the different types of item discrimination statistics. Which is preferred in most situations?
In addition to examining individual item difficulty and discrimination indexes, it is also important to evaluate the overall test performance. What key statistics will help you evaluate the overall test?
The use of educational technology (e.g., clicker response pads) in large lecture hall classrooms for "real time" evaluation is most associated with ___________ assessment.
When is it preferable to have a lower average p-value (e.g., 50%) versus a bit higher (e.g., 75 to 80%)?
If Tong has been tasked with developing a licensing exam for psychologists in her state, then she would be best off using a(n) __________ test development strategy.
Based on Module 13 in Shultz, Whitney, and Zickar (2021), if an item you spent a lot of time constructing performs poorly based on the item analysis statistics, typically your best bet is to _________.
Oftentimes a multiple-choice test item that has a negative item-total correlation may be improved by simply ________.
In practice, most tests in educational and employment contexts should have p-values that range from __________.