Deck 11: How Do We Assess the Psychometric Quality of a Test

Question
Which one of the following provides important information for increasing the test's internal consistency?

A) discrimination index
B) difficulty level
C) interitem correlation matrix
D) coefficient of multiple correlation
Question
Which one of the following outcomes indicates that a test item should be retained in a test?

A) High performers answered the item correctly and low performers answered the item incorrectly.
B) High performers answered the item correctly and low performers answered the item correctly.
C) High performers answered the item incorrectly and low performers answered the item correctly.
D) High performers answered the item incorrectly and low performers answered the item incorrectly.
Question
The percentage of test takers who respond correctly to a test item is a measure of the item's ______.

A) difficulty
B) bias
C) ability to discriminate
D) item-total correlation
Question
One advantage of empirically based tests is that ______.

A) they have strong validity coefficients
B) their internal reliability is high
C) test takers prefer them to other types of tests
D) it is more difficult for test takers to fake responses
Question
Which one of the following statistics can be used to make decisions about retaining or discarding an item based on how well the item discriminates between high- and low-scoring test takers?

A) item-total correlation
B) inter-item correlation
C) difficulty level
D) discrimination index
Question
When test developers examine the discrimination index of each item, which one of the following outcomes do they consider most desirable?

A) low positive numbers
B) high positive numbers
C) average positive numbers
D) low negative numbers
Question
What is the discrimination index?

A) a comparison of the scores of respondents by sex, race, or other personal characteristics
B) an index of how difficult each test item is
C) a comparison of high performer scores with low performer scores on each item
D) cumulative results from an item analysis yielding an overall score for the test
Question
Which one of the following tests is an example of an empirically based test?

A) Mathematics Self-Efficacy Test
B) Myers-Briggs Type Indicator
C) Minnesota Multiphasic Personality Inventory
D) Graduate Record Exam
Question
Dividing the number of persons who answered correctly by the total number of persons who responded to the question is a measure of an item's ______.

A) discrimination index
B) phi coefficient
C) difficulty
D) bias
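The calculation described in this card can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name `proportion_correct` and the response data are made up for the example, and responses are assumed to be scored 1 (correct) or 0 (incorrect).

```python
def proportion_correct(responses):
    """Number of correct responses divided by the number of responses."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# Hypothetical data: ten test takers, seven of whom answered correctly.
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(proportion_correct(responses))
```

Values near 0 or 1 mean that almost everyone answered the same way, so the item adds little variation to the total scores.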
Question
How do researchers calculate the discrimination index?

A) They calculate the difference between the percentage of upper performers and the percentage of lower performers who responded correctly.
B) They calculate the percentage of test takers who answered the item correctly.
C) They compare the percentage of test takers who answered the item correctly by demographic group.
D) They develop a matrix that contains the results of the item analyses for each item and calculate a score by adding the matrix columns.
Question
Tests that are designed to classify individuals into two or more categories based on their scores on a criterion measure are called ______.

A) empirically based tests
B) pilot tests
C) diagnostic tests
D) screening tests
Question
How are phi coefficients interpreted?

A) the same as discrimination coefficients
B) the same as reliability coefficients
C) the same as validity coefficients
D) the same as Pearson product-moment correlations
Question
When items have a very low or very high p value, test developers ______.

A) accept them as good items
B) know that they contribute to the variability of the test scores
C) rewrite or discard the items
D) have evidence that the item is valid
Question
The formula D = U − L is used to calculate which one of the following statistics?

A) discrimination index
B) difficulty level
C) item-total correlation
D) item-total index
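As a worked illustration of the formula D = U − L, the sketch below assumes that U and L are the proportions of the upper-scoring and lower-scoring groups who answered the item correctly. The function name `upper_minus_lower` and the data are hypothetical, and the groups are assumed to have been formed already from total test scores.

```python
def upper_minus_lower(upper_responses, lower_responses):
    """Compute D = U - L from 0/1 item responses in two groups."""
    u = sum(upper_responses) / len(upper_responses)  # proportion correct, upper group
    l = sum(lower_responses) / len(lower_responses)  # proportion correct, lower group
    return u - l


# Hypothetical data: 4 of 5 upper scorers and 1 of 5 lower scorers were correct.
upper = [1, 1, 1, 0, 1]
lower = [0, 1, 0, 0, 0]
print(round(upper_minus_lower(upper, lower), 2))
```

The result ranges from −1 to +1. A negative value means the lower-scoring group did better on the item than the upper-scoring group.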
Question
To increase internal consistency, items that correlate well with other items measuring the same construct should be ______.

A) dropped
B) retained
C) rewritten
D) grouped together
Question
Which one of the following ranges of item p values yields a distribution of test scores with the most variation?

A) 0-0.3
B) 0.4-0.6
C) 0.7-1
D) 1-3
Question
Test items that everyone gets "right" or everyone gets "wrong" provide ______.

A) evidence the test yields a wide range of scores
B) proof the test is not biased against minorities
C) support for the validity of the test questions
D) no basis for a comparison of test takers' abilities
Question
What are phi coefficients?

A) the correlation between two dichotomous variables
B) the correlation between two sets of test scores
C) the correlation between a test item and the total test score
D) the correlation between item difficulty and item bias
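For readers who want to see a phi coefficient computed, here is a minimal sketch using the standard 2×2 contingency-table formula, phi = (ad − bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The function name `phi_coefficient` and the data are illustrative only, and both inputs are assumed to be coded 0 or 1.

```python
import math


def phi_coefficient(x, y):
    """Phi for two equal-length lists of 0/1 values."""
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a variable has no variance")
    return (a * d - b * c) / denominator


# Hypothetical responses of six test takers to two items.
item_1 = [1, 1, 1, 0, 0, 0]
item_2 = [1, 1, 0, 1, 0, 0]
print(round(phi_coefficient(item_1, item_2), 2))
```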
Question
An interitem correlation matrix displays the ______.

A) reliability and validity coefficients for each item
B) difficulty of each item on the test
C) correlation of each item with every other item on the test
D) correlation of each item with the total test score for all test takers
Question
What is a quantitative item analysis?

A) numerical data from respondent questionnaires about the test
B) analysis of data from respondent questionnaires about the test
C) statistical analyses of the responses test takers gave to individual items
D) statistical analyses of the test's validity
Question
When a test yields significantly different validity coefficients for different subgroups, we say it has ______.

A) single-group validity
B) differential validity
C) discriminant validity
D) nongroup validity
Question
The item characteristic curve can provide a picture of an item's ______.

A) distribution of responses
B) interitem and item-total correlation
C) level of difficulty and discrimination
D) reliability and validity
Question
Which one of the following is a characteristic of a good test item--one that should be retained in the final version of the test?

A) low discrimination index
B) item characteristic curves that have very little slope
C) difficulty level of 0.5
D) interitem correlation coefficient near 0
Question
What are cut scores?

A) scores that would have been higher had it not been for test bias
B) mean, median, and mode of the norm distribution
C) decision points for dividing test scores into pass/fail groupings
D) transformed scores, such as z and T scores
Question
How does computerized adaptive testing (CAT) choose items for individuals taking the test?

A) Questions are predetermined before the individual takes the test.
B) Software chooses items based on level of ability determined from previous responses.
C) Software chooses items based on level of ability determined from the average of previous responses.
D) Software chooses items based on predicted level of individual's ability.
Question
The main purpose of the validation study is to ______.

A) get the reactions of test takers and test users
B) gather data on the construct(s) that the test measures
C) confirm the test's ability to yield meaningful and accurate results
D) comply with legal requirements that tests must have evidence of validity
Question
Which one of the following is a part of quantitative item analysis?

A) constructing an agenda for a test takers' group discussion
B) constructing one-on-one interviews for test takers
C) constructing item characteristic curves
D) constructing numerical rating scales for test-taker questionnaires
Question
What is an item characteristic curve?

A) a line that describes the probability of answering an item correctly plotted against the level of ability on the trait being measured
B) a line that describes the distribution of responses to a single item on the trait being measured
C) a histogram constructed for the responses to a single item on the trait being measured
D) a line similar to the normal curve that results from graphing the discrimination index against difficulty level for the trait being measured
Question
To determine the maximum likelihood estimation, computerized adaptive testing (CAT) software weights all of the following EXCEPT ______.

A) pseudo-guessing parameter
B) difficulty
C) discrimination
D) ability
Question
What is an advantage of computerized adaptive testing?

A) It provides less data about the test taker.
B) It takes less time to complete for the test taker.
C) It is cheaper to administer.
D) It is easier to create tests.
Question
Test developers can easily find the difficulty of an item by ______.

A) looking at how steep the item characteristic curve is
B) locating the point at which the item characteristic curve indicates a probability of .5 of answering correctly
C) subtracting the percentage of low performers who responded correctly from the percentage of high performers who responded correctly
D) asking test takers to complete a qualitative item analysis survey on item difficulty
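To make the item characteristic curve concrete, the sketch below assumes a two-parameter logistic model. The deck does not name a model, so this choice, the function name `icc_2pl`, and the parameter values are all assumptions made for illustration. Here `a` controls how steep the curve is and `b` shifts it along the ability axis.

```python
import math


def icc_2pl(theta, a, b):
    """Probability of a correct response at ability theta (2PL, assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item with a = 1.2 and b = 0.5, evaluated at several ability levels.
for theta in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
    print(f"theta = {theta:+.1f}  P(correct) = {icc_2pl(theta, 1.2, 0.5):.3f}")
```

Under this assumed model the probability is exactly .5 when ability equals `b`, and a larger `a` makes the curve rise more sharply around that point.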
Question
What is the purpose of test norms?

A) to provide a structure that makes it easy to identify test bias
B) to provide a reference point or structure for understanding one test taker's score
C) to show that many people have taken the test and their scores were normally distributed
D) to provide evidence of reliability and validity for the test
Question
A test question has item bias when it ______.

A) is easier for one group than for another group
B) has a high discrimination index
C) does not correlate with other test items
D) does not correlate with the test's raw score
Question
A major difficulty in setting cut scores is ______.

A) setting the score low enough that most test takers pass
B) setting the score high enough that only the best test takers pass
C) allowing for test error that may allow some to pass who should not pass
D) calculating the standard error of measurement for the test scores
Question
What is item response theory?

A) a part of classical test theory that specifies that item difficulty and discrimination are related
B) a theory that relates the performance of each item to a statistical estimate of the test taker's ability on the construct being measured
C) a theory that describes the cognitive steps a respondent takes before answering an item
D) a theory that uses probability to estimate the test taker's honesty or motivation when answering an item
Question
What is single-group validity?

A) when validity coefficients for different subgroups differ
B) when a test is valid for one group but not for another group
C) when the target audience comprises only one type of test takers
D) when the test is valid for use only one time
Question
What is cross-validation?

A) a repeat of the validation study sometimes using another sample of test takers
B) a validation study whose participants represent all minority groups
C) carrying out the validation in locations throughout a state or country
D) an alternative form of validation that can be accomplished using a statistical formula
Question
Which one of the following is NOT a method for conducting a qualitative item analysis?

A) asking test takers to fill out a questionnaire about the test
B) asking test takers to attend group discussions
C) using item characteristic curves to assess item bias
D) using one-on-one interviews with test takers to find out how they interpreted the test questions
Question
The validity coefficients that result when a test is cross-validated are usually expected to be ______.

A) the same as the validity coefficients found in the original validation study
B) lower than the validity coefficients found in the original validation study
C) higher than the validity coefficients found in the original validation study
D) unrelated to the validity coefficients found in the original validation study
Question
Pierre conducted a validation study, and he found that the test was only valid for French-speaking males. What kind of validity did he identify?

A) between-group validity
B) single-group validity
C) differential validity
D) among-groups validity
Question
When a common regression line for two groups is used to predict performance, but the individual regression lines for the groups differ where they cross the y-axis, which one of the following types of predictive bias is present?

A) method bias
B) construct bias
C) slope bias
D) intercept bias
Question
What was the problem that testing experts pointed out in the Golden Rule case settlement with the Educational Testing Service (ETS)?

A) There are no laws that address test bias or discrimination among subgroups of test takers.
B) The exam developed by ETS had single-group validity for Whites only.
C) Some items on the exam developed by ETS were easier for Whites than for Blacks.
D) Comparing item difficulty levels in the form of p values failed to take into consideration the test takers' level of ability.
Question
Explain the purpose of a cut score and describe two methods for identifying a cut score. Give examples for each method.
Question
Explain the concepts of predictive bias, differential validity, and single-group validity. Give an example of each.
Question
Recent research on tests of cognitive ability has found that such tests are ______.

A) equally valid for minority and majority test takers
B) more valid for minority test takers than they are for majority test takers
C) more valid for majority test takers than they are for minority test takers
D) not valid for some groups of test takers
Question
Two first-year college students, Carlos and Carl, are used to taking different types of academic tests. Carlos has mostly taken essay tests, and Carl has mostly taken multiple choice tests. What problem may arise when they take the same tests in college?

A) Depending on the type of tests given, there may be construct bias present in the test scores.
B) Depending on the type of tests given, there may be method bias present in the test scores.
C) Depending on the type of tests given, there may be reliability bias present in the test scores.
Question
Describe the processes and expected outcomes of validation and cross-validation studies. Explain why each is important.
Question
Items for which the p value falls in the range of 0.90 to 1.00 are usually considered ______.

A) too difficult
B) somewhat difficult
C) somewhat easy
D) too easy
Question
Describe how we collect and interpret data for a qualitative item analysis. Discuss the types of information the test developer should seek and give examples of questions.
Question
What are item-total correlations? What is their purpose in an item analysis?
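One way to compute an item-total correlation is sketched below as an ordinary Pearson correlation between scores on one item and total test scores. The helper name `pearson` and the data are hypothetical, and the sketch does not remove the item from the total before correlating, which some analysts prefer to do.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: 0/1 scores on one item and total scores for six test takers.
item_scores = [1, 1, 1, 0, 0, 0]
total_scores = [9, 8, 7, 4, 3, 2]
print(round(pearson(item_scores, total_scores), 2))
```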
Question
Identify and explain the criteria for retaining and dropping items to revise a test. Make a matrix with faked data for 5 items. Explain why each item should or should not be retained.
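A matrix of faked data like the one this card asks for might be set up as follows. This is only a sketch: the response values are invented, the function name `item_stats` is hypothetical, and splitting test takers into upper and lower halves by total score is one assumed way to form the comparison groups. For each item it reports the proportion correct (p) and the upper-minus-lower difference (D).

```python
# Faked 0/1 responses: rows are six test takers, columns are items 1 to 5.
scores = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]


def item_stats(matrix):
    """Return a (p, D) pair for each item (column) of a 0/1 score matrix."""
    ranked = sorted(matrix, key=sum, reverse=True)  # highest total score first
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    stats = []
    for j in range(len(matrix[0])):
        p = sum(row[j] for row in matrix) / len(matrix)
        u = sum(row[j] for row in upper) / half
        l = sum(row[j] for row in lower) / half
        stats.append((p, u - l))
    return stats


for number, (p, d) in enumerate(item_stats(scores), start=1):
    print(f"item {number}: p = {p:.2f}, D = {d:+.2f}")
```

In this faked data set, item 1 is answered correctly by everyone, so it cannot separate test takers, while item 5 is answered correctly only by low scorers. Both would be candidates for rewriting or dropping.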
Question
Describe how we collect, analyze, and interpret data for a quantitative item analysis. Give examples.
Question
What is the purpose of an item discrimination index and how is it calculated?
Question
While test validity is a statistical concept, test fairness is a ______.

A) social concept
B) mathematical concept
C) scientific concept
D) meaningless concept
Question
Which one of the following explanations was thought to be the most likely reason why recent research on tests of cognitive ability showed that there was differential validity on the tests when test takers from majority and minority groups were compared?

A) range restriction in the minority group
B) range restriction in the majority group
C) culture bias for the majority group
D) culture bias for the minority group
Question
What is the importance of item difficulty and how is it calculated?
Question
What is the line in this graph called? [Graph not shown]

A) normal curve
B) regression line
C) item characteristic curve
D) probability curve
Question
What are interitem correlations? What is their purpose in an item analysis?
Question
When the regression lines that predict performance for two groups have different slopes, which one of the following types of measurement bias is likely to occur?

A) single-group validity
B) differential validity
C) culture bias
D) method bias
Question
When a common regression line for two groups is used to predict performance, but the individual regression lines for the groups differ where they cross the y-axis, which one of the following problems may occur?

A) The performance of the group whose regression line crosses the y-axis at a higher point will be overpredicted.
B) The performance of the group whose regression line crosses the y-axis at a lower point will be overpredicted.
C) The performance of both groups will be overpredicted.
D) The performance of neither group will be overpredicted.
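The situation in this card can be explored numerically. The sketch below uses invented data in which two groups share the same slope but differ by one point in intercept, fits a single common regression line to the pooled data, and then reports the average of predicted minus actual performance for each group. The function names `fit_line` and `mean_overprediction` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_overprediction(xs, ys, slope, intercept):
    """Average of (predicted - actual); positive values mean overprediction."""
    return sum(slope * x + intercept - y for x, y in zip(xs, ys)) / len(xs)


# Invented data: same test scores in both groups, intercepts of 2 and 1.
test_scores = [1, 2, 3, 4]
higher_intercept_group = [3, 4, 5, 6]  # performance = 2 + test score
lower_intercept_group = [2, 3, 4, 5]   # performance = 1 + test score

slope, intercept = fit_line(test_scores * 2,
                            higher_intercept_group + lower_intercept_group)
print(mean_overprediction(test_scores, higher_intercept_group, slope, intercept))
print(mean_overprediction(test_scores, lower_intercept_group, slope, intercept))
```

The sign of each printed value shows whether the common line predicts too high or too low for that group.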
Question
What is item bias? How do test developers and researchers identify item bias?
Question
Define and describe types of bias in psychological testing. Discuss different types of predictive bias. Give examples of each type.
Question
What is cross-validation and what is its purpose and importance? Describe two methods for finding the criterion-related validity coefficient for cross-validation.
Question
What are item-criterion correlations? Describe one way in which developers use them.
Deck 11: How Do We Assess the Psychometric Quality of a Test
1
Which one of the following provides important information for increasing the test's internal consistency?

A) discrimination index
B) difficulty level
C) interitem correlation matrix
D) coefficient of multiple correlation
C
2
Which one of the following outcomes indicates that a test item should be retained in a test?

A) High performers answered the item correctly and low performers answered the item incorrectly.
B) High performers answered the item correctly and low performers answered the item correctly.
C) High performers answered the item incorrectly and low performers answered the item correctly.
D) High performers answered the item incorrectly and low performers answered the item incorrectly.
A
3
The percentage of test takers who respond correctly to a test item is a measure of the item's ______.

A) difficulty
B) bias
C) ability to discriminate
D) item-total correlation
A
4
One advantage of empirically based tests is that ______.

A) they have strong validity coefficients
B) their internal reliability is high
C) test takers prefer them to other types of tests
D) it is more difficult for test takers to fake responses
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
5
Which one of the following statistics can be used to make decisions about retaining or discarding an item based on how well the item discriminates between high- and low-scoring test takers?

A) item-total correlation
B) inter-item correlation
C) difficulty level
D) discrimination index
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
6
When test developers examine the discrimination indexes of each item, which one of the following outcomes do they consider being most desirable?

A) low positive numbers
B) high positive numbers
C) average positive numbers
D) low negative numbers
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
7
What is the discrimination index?

A) a comparison of the scores of respondents by sex, race, or other personal characteristics
B) an index of how difficult each test item is
C) a comparison of high performer scores with low performer scores on each item
D) cumulative results from an item analysis yielding an overall score for the test
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
8
Which one of the following tests is an example of an empirically based test?

A) Mathematics Self-Efficacy Test
B) Myers-Briggs type indicator
C) Minnesota Multiphasic Personality Inventory
D) Graduate Record Exam
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
9
Dividing the number of persons who answered correctly by the total number of persons who responded to the question is a measure of an item's ______.

A) discrimination index
B) phi coefficient
C) difficulty
D) bias
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
10
How do researchers calculate the discrimination index?

A) They calculate the difference between the percentage of upper performers and the percentage of lower performers who responded correctly.
B) They calculate the percentage of test takers who answered the item correctly.
C) They compare the percentage of test takers who answered the item correctly by demographic group.
D) They develop a matrix that contains the results of the item analyses for each item and calculate a score by adding the matrix columns.
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
11
Tests that are designed to classify individuals into two or more categories based on their scores on a criterion measure are called ______.

A) empirically based tests
B) pilot tests
C) diagnostic tests
D) screening tests
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
12
How are phi coefficients interpreted?

A) the same as discrimination coefficients
B) the same as reliability coefficients
C) the same as validity coefficients
D) the same as Pearson product moment correlations
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
13
When items have a very low or very high p value, test developers ______.

A) accept them as good items
B) know that they contribute to the variability of the test scores
C) rewrite or discard the items
D) have evidence that the item is valid
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
14
The formula D = U − L is used to calculate which one of the following statistics? A. discrimination index
B) difficulty level
C) item-total correlation
D) item-total index
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
15
To increase internal consistency, items that correlate well with other items measuring the same construct should be ______.

A) dropped
B) retained
C) rewritten
D) grouped together
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
16
Which one of the following ranges of item p values yield distribution of test scores with the most variation?

A) 0-0.3
B) 0.4-0.6
C) 0.7-1
D) 1-3
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
17
Test items that everyone gets "right" or everyone gets "wrong" provide ______.

A) evidence the test yields a wide range of scores
B) proof the test is not biased against minorities
C) support for the validity of the test questions
D) no basis for a comparison of test takers' abilities
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
18
What are phi coefficients?

A) the correlation between two dichotomous variables
B) the correlation between two sets of test scores
C) the correlation between a test item and the total test score
D) they correlation of item difficulty and item bias
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
19
An interitem correlation matrix displays the ______.

A) reliability and validity coefficients for each item
B) difficulty of each item on the test
C) correlation of each item with every other item on the test
D) correlation of each item with the total test score for all test takers
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
20
What is a quantitative item analysis?

A) numerical data from respondent questionnaires about the test
B) analysis of data from respondent questionnaires about the test
C) statistical analyses of the responses test takers gave to individual items
D) statistical analyses of the test's validity
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
21
When a test yields significantly different validity coefficients for different subgroups, we say it has ______.

A) single-group validity
B) differential validity
C) discriminant validity
D) nongroup validity
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
22
The item characteristic curve can provide a picture of an item's ______.

A) distribution of responses
B) interitem and item-total correlation
C) level of difficulty and discrimination
D) reliability and validity
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
23
Which one of the following is a characteristic of a good test item--one that should be retained in the final version of the test?

A) low discrimination index
B) item characteristic curves that have very little slope
C) difficulty level of 0.5
D) interitem correlation coefficient near 0
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
24
What are cut scores?

A) scores that would have been higher had it not been for test bias
B) mean, median, and mode of the norm distribution
C) decision points for dividing test scores into pass/fail groupings
D) transformed scores, such as z and T scores
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
25
How does computerized adaptive testing (CAT) choose items for individuals taking the test?

A) Questions are predetermined before the individual takes the test.
B) Software chooses items based on level of ability determined from previous responses.
C) Software chooses items based on level of ability determined from average of previous responses.
D) Software chooses items based on predicted level of individual's ability.
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
26
The main purpose of the validation study is to ______.

A) get the reactions of test takers and test users
B) gather data on the construct(s) that the test measures
C) confirm the test's ability to yield meaningful and accurate results
D) comply with legal requirements that tests must have evidence of validity
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
27
Which one of the following is a part of quantitative item analysis?

A) constructing an agenda for a test takers' group discussion
B) constructing one-on-one interviews for test takers
C) constructing item characteristic curves
D) constructing numerical rating scales for test-taker questionnaires
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
28
What is an item characteristic curve?

A) a line that describes the probability of answering an item correctly plotted against the level of ability on the trait being measured
B) a line that describes the distribution of responses to a single item on the trait being measured
C) a histogram constructed for the responses to a single item on the trait being measured
D) a line similar to the normal curve that results from graphing the discrimination index against difficulty level for the trait being measured ______.
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
29
To determine the maximum likelihood estimation, computerized adaptive testing (CAT) software weights all of the following EXCEPT ______.

A) pseudo-guessing parameter
B) difficulty
C) discrimination
D) ability
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
30
What is an advantage of computerized adaptive testing?

A) It provides less data about the test taker.
B) It takes less time to complete for the test taker.
C) It is cheaper to administer.
D) It is easier to create tests.
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
31
Test developers can easily find the difficulty of an item by ______.

A) looking at how steep the item characteristic curve is
B) locating the point at which the item characteristic curve indicates a probability of .5 of answering correctly.
C) subtracting the percentage of low performers who responded correctly from the percentage of high performers who responded correctly
D) asking test takers to complete a qualitative item analysis survey on item difficulty
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
32
What is the purpose of test norms?

A) to provide a structure that makes it easy to identify test bias
B) to provide a reference point or structure for understanding one test taker's score
C) to show that many people have taken the test and their scores were normally distributed
D) to provide evidence of reliability and validity for the test
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
33
A test question has item bias when it ______.

A) is easier for one group than for another group
B) has a high discrimination index
C) does not correlate with other test items
D) does not correlate with the test's raw score
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
34
A major difficulty in setting cut scores is ______.

A) setting the score low enough that most test takers pass
B) setting the score high enough that only the best test takers pass
C) allowing for test error that may allow some to pass who should not pass
D) calculating the standard error measurement for the test scores
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
35
What is item response theory?

A) a part of classical test theory that specifies that item difficulty and discrimination are related
B) a theory that relates the performance of each item to a statistical estimate of the test taker's ability on the construct being measured
C) a theory that describes the cognitive steps a respondent takes before answering an item
D) a theory that uses probability to estimate the test taker's honestly or motivation when answering an item
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
36
What is single-group validity?

A) when validity coefficients for different subgroups differ
B) when a test is valid for one group but not for another group
C) when the target audience comprises only one type of test takers
D) when the test is valid for use only one time
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
37
What is cross-validation?

A) a repeat of the validation study sometimes using another sample of test takers
B) a validation study whose participants represent all minority groups
C) carrying out the validation in locations throughout a state or country
D) an alternative form of validation that can be accomplished using a statistical formula
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
38
Which one of the following is NOT a method for conducting a qualitative item analysis?

A) asking test takers to fill out a questionnaire about the test
B) asking test takers to attend group discussions
C) using item characteristic curves to assess item bias
D) using one-on-one interviews with test takers to find out how they interpreted the test questions
Unlock Deck
Unlock for access to all 64 flashcards in this deck.
Unlock Deck
k this deck
39
The validity coefficients that result when a test is cross-validated are usually expected to be ______.

A) the same as the validity coefficients found in the original validation study
B) lower than the validity coefficients found in the original validation study
C) higher than the validity coefficients found in the original validation study
D) unrelated to the validity coefficients found in the original validation study
40
Pierre conducted a validation study, and he found that the test was only valid for French-speaking males. What kind of validity did he identify?

A) between-group validity
B) single-group validity
C) differential validity
D) among-groups validity
41
When a common regression line for two groups is used to predict performance, but the individual regression lines for the groups differ where they cross the y-axis, which one of the following types of predictive bias is present?

A) method bias
B) construct bias
C) slope bias
D) intercept bias
42
What was the problem that testing experts pointed out in the Golden Rule case settlement with the Educational Testing Service (ETS)?

A) There are no laws that address test bias or discrimination among subgroups of test takers.
B) The exam developed by ETS had single-group validity for Whites only.
C) Some items on the exam developed by ETS were easier for Whites than for Blacks.
D) Comparing item difficulty levels in the form of p values failed to take into consideration the test takers' level of ability.
43
Explain the purpose of a cut score and describe two methods for identifying a cut score. Give examples for each method.
44
Explain the concepts of predictive bias, differential validity, and single-group validity. Give an example of each.
45
Recent research on tests of cognitive ability has found that such tests are ______.

A) equally valid for minority and majority test takers
B) more valid for minority test takers than they are for majority test takers
C) more valid for majority test takers than they are for minority test takers
D) not valid for some groups of test takers
46
Two first-year college students, Carlos and Carl, are used to taking different types of academic tests. Carlos has mostly taken essay tests, and Carl has mostly taken multiple choice tests. What problem may arise when they take the same tests in college?

A) Depending on the type of tests given, there may be construct bias present in the test scores.
B) Depending on the type of tests given, there may be method bias present in the test scores.
C) Depending on the type of tests given, there may be reliability bias present in the test scores.
47
Describe the processes and expected outcomes of validation and cross-validation studies. Explain why each is important.
48
Items for which the p value falls in the range of 0.90 to 1.00 are usually considered ______.

A) too difficult
B) somewhat difficult
C) somewhat easy
D) too easy
49
Describe how we collect and interpret data for a qualitative item analysis. Discuss the types of information the test developer should seek and give examples of questions.
50
What are item-total correlations? What is their purpose in an item analysis?
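As a study aid, the sketch below shows one way an item-total correlation could be computed. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the uncorrected total, meaning each item is included in its own total score. The function names and the data are hypothetical, made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def item_total_correlations(score_matrix):
    """Correlate each item (column) with the total test score (row sum)."""
    totals = [sum(row) for row in score_matrix]
    n_items = len(score_matrix[0])
    return [
        pearson([row[j] for row in score_matrix], totals)
        for j in range(n_items)
    ]


# Hypothetical data: rows are test takers, columns are items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 0],
]
correlations = item_total_correlations(scores)
```

An item with a high positive value tends to be answered correctly by test takers who score well overall; items with low or negative values are candidates for review.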
51
Identify and explain the criteria for retaining and dropping items to revise a test. Make a matrix with faked data for 5 items. Explain why each item should or should not be retained.
52
Describe how we collect, analyze, and interpret data for a quantitative item analysis. Give examples.
53
What is the purpose of an item discrimination index and how is it calculated?
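For review purposes, here is a minimal sketch of one common way to compute a discrimination index: the proportion of high scorers who answered the item correctly minus the proportion of low scorers who did. Splitting the sample into top and bottom thirds is an assumption of this sketch (other cutoffs are also used), and the function name and data are hypothetical.

```python
def discrimination_index(item_scores, total_scores):
    """Upper-group proportion correct minus lower-group proportion correct.

    Assumes 1 = correct, 0 = incorrect, and uses the top and bottom
    thirds of test takers by total score (an illustrative choice).
    """
    # Pair each total score with the item score, then rank by total score.
    paired = sorted(zip(total_scores, item_scores), key=lambda pair: pair[0])
    n_group = max(1, len(paired) // 3)
    lower = [item for _, item in paired[:n_group]]
    upper = [item for _, item in paired[-n_group:]]
    p_upper = sum(upper) / n_group
    p_lower = sum(lower) / n_group
    return p_upper - p_lower


# Hypothetical data for nine test takers.
totals = [10, 12, 15, 18, 20, 22, 25, 27, 30]
item = [0, 0, 1, 0, 1, 1, 1, 1, 1]
d = discrimination_index(item, totals)
```

Here the index is expressed as a difference of proportions, so it ranges from -1 to 1; some sources report the same quantity in percentage points.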
54
While test validity is a statistical concept, test fairness is a ______.

A) social concept
B) mathematical concept
C) scientific concept
D) meaningless concept
55
Which one of the following explanations was thought to be the most likely reason why recent research on tests of cognitive ability showed that there was differential validity on the tests when test takers from majority and minority groups were compared?

A) range restriction in the minority group
B) range restriction in the majority group
C) culture bias for the majority group
D) culture bias for the minority group
56
What is the importance of item difficulty and how is it calculated?
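As a quick illustration of the calculation, the sketch below computes item difficulty as the proportion of test takers who answered an item correctly (the p value). The function name and the response data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly."""
    if not responses:
        raise ValueError("responses must not be empty")
    return sum(1 for r in responses if r) / len(responses)


# Hypothetical data: 1 = correct, 0 = incorrect, for ten test takers.
item_a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
item_b = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
p_a = item_difficulty(item_a)
p_b = item_difficulty(item_b)
```

A higher p value means more test takers answered correctly, so the item is easier, not harder.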
57
What is the line in this graph called?

A) normal curve
B) regression line
C) item characteristic curve
D) probability curve
58
What are interitem correlations? What is their purpose in an item analysis?
59
When the regression lines that predict performance for two groups have different slopes, which one of the following types of measurement bias is likely to occur?

A) single-group validity
B) differential validity
C) culture bias
D) method bias
60
When a common regression line for two groups is used to predict performance, but the individual regression lines for the groups differ where they cross the y-axis, which one of the following problems may occur?

A) The performance of the group whose regression line crosses the y-axis at a higher point will be overpredicted.
B) The performance of the group whose regression line crosses the y-axis at a lower point will be overpredicted.
C) The performance of both groups will be overpredicted.
D) The performance of neither group will be overpredicted.
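To explore this scenario numerically, the sketch below fits a least-squares line to each of two groups and to the pooled data, then reports each group's mean residual (actual minus predicted) under the common line. All names and data are hypothetical, constructed so the two groups share a slope but differ in intercept.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_residual(xs, ys, slope, intercept):
    """Average of actual minus predicted; negative means overprediction."""
    return sum(y - (intercept + slope * x) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical test scores (x) and criterion performance (y).
x_a, y_a = [1, 2, 3, 4], [3, 4, 5, 6]  # group A
x_b, y_b = [1, 2, 3, 4], [1, 2, 3, 4]  # group B

slope_a, intercept_a = fit_line(x_a, y_a)
slope_b, intercept_b = fit_line(x_b, y_b)
slope_c, intercept_c = fit_line(x_a + x_b, y_a + y_b)  # common line

resid_a = mean_residual(x_a, y_a, slope_c, intercept_c)
resid_b = mean_residual(x_b, y_b, slope_c, intercept_c)
```

Comparing the signs of `resid_a` and `resid_b` with `intercept_a` and `intercept_b` shows which group the common line overpredicts and which it underpredicts.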
61
What is item bias? How do test developers and researchers identify item bias?
62
Define and describe types of bias in psychological testing. Discuss different types of predictive bias. Give examples of each type.
63
What is cross-validation and what is its purpose and importance? Describe two methods for finding the criterion-related validity coefficient for cross-validation.
64
What are item-criterion correlations? Describe one way in which developers use them.