Exam 19: Logistic Regression
Complete the missing information for Table 1, using 0.50 as the cut value. Then complete the classification table (Table 2). Compute sensitivity, specificity, false positive rate, and false negative rate.
Table 1. Observed group membership Predicted Probability Predicted group membership 1 0.88 1 0.72 0 0.62 1 0.49 0 0.34 1 0.40 1 0.60 0 0.21 0 0.05 1 0.57
Table 2. .00 1.00 Observed .00 1.00
Assuming 0.5 is the cut value, cases with predicted probabilities at .5 or above are predicted as 1 and predicted probabilities below .5 are predicted as 0. There are four cases with observed value 1 and predicted value 1. There are three cases with observed value 0 and predicted value 0. There is one case with observed value 0 yet predicted value 1 (false positive).
There are two cases with observed value 1 yet predicted value 0 (false negative).
Sensitivity = 4/(2+4) = 0.67 = 67%
Specificity = 3/(3+1) = 0.75 = 75%
False positive rate = 1/(3+1) = 0.25 -25%
False negative rate = 2/(2+4) = 0.33 = 33%
You are given the following data, where X1 (sex; male = 0, female =1; use 0 as the reference category) and X2 (having at least one immediate family member who smokes; yes = 1, no = 0; use 0 as the reference category) are used to predict Y (being a smoker = 1 vs. being a nonsmoker = 0).
( = .05)
0 0 1 0 0 0 0 1 1 0 1 1 1 0 0 1 0 0 1 0 1 1 0 0 1 1 1 1 1 0
Determine the following values based on simultaneous entry of independent variables: -2LL, constant, b1, b2, se(b1), se(b2), odds ratios, Wald1, Wald2.
-2LL = 10.688;
b1(Sex) = 1.792, b2(Family) = 1.792, bconstant = -1.386;
se(b1(Sex)) = 1.571, se(b2(Family)) = 1.571;
odds ratio1(Sex) = 6.000, odds ratio2(Family) = 6.000;
Wald1(Sex) = 1.301, Wald2(Family) = 1.301.
Procedure:
Create a data set with three variables: Sex (X1), Family (X2), and Smoke (Y). The data set should have 10 cases.
Follow the steps described in Question 2.
Use Smoke as the dependent variable, and Sex and Family as the covariates.
In step 3, move both Sex and Family into the Categorical Covariates box. Select Reference Category: First. Click Change.
Selected SPSS Output:
Which one of the following can be used as an appropriate dependent variable for binary logistic regression?
If a person is predicted to be a smoker, we would expect that
-Aaron is studying smoking behavior and has coded "smoker" as "1" and "nonsmoker" as "0." The predictor is the number of family members who smoke. Which of the following is a correct interpretation of an odds ratio of +2?

A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Yi) = 3.5 - 1.3X1 + 2.3X2.
Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X1: cumulative high school GPA obtained;
X2: 1 = retained in at least one grade; 0 = never retained in any grade.
-If Mindy has a high school GPA of 3, and has never repeated a grade, which of the following predictions can be derived from the model?
Complete the missing information for this table (Y is a dichotomous variable).
P(Y=1) P(Y=0) Odds(Y=1) 0.10 0.25 0.40 0.20 0.90 0.75 0.60
A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Yi) = 3.5 - 1.3X1 + 2.3X2.
Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X1: cumulative high school GPA obtained;
X2: 1 = retained in at least one grade; 0 = never retained in any grade.
-What is being predicted in this model?
A study was conducted to investigate variables associated with dropping out of high school.
The following logistic regression model was obtained:
Logit(Yi) = 3.5 - 1.3X1 + 2.3X2.
Y: 1 = dropped out of high school; 0 = did not drop out of high school;
X1: cumulative high school GPA obtained;
X2: 1 = retained in at least one grade; 0 = never retained in any grade.
-Based on logistic regression, if a student has been retained in at least one grade, the chance that he/she will drop out of high school
Professor Pruefung wanted to examine if performance in quizzes can predict whether a student will pass or fail the final exam. The independent variables are scores in two pop quizzes (Quiz1, Quiz2), and the dependent variable is a dichotomous variable (pass = 1 vs. fail = 0). Below is part of the output of the analysis.
a. Professor Pruefung assumed that the better a student performed in the quizzes (a higher score indicates better performance), the higher the odds that he/she will pass the final exam. If that is the case, what are the expected signs for b1 and b2? Do the results confirm the expectation?
b. Based on the tables, is there any indication of assumptions violation? If so, which assumption(s) has (have) been violated?
c. What are the possible consequences of the assumption violation?
Chi-square df Sig. Step Step 24.055 2 .000 1 Block 24.055 2 .000 Model 24.055 2 .000
Step -2 Cox \& Snell Nagelkerke likelihood R Square R Square 1 22.998 .452 .653
Variables in the Equation B S.E. Wald df Sig. Exp () Step 1 Quiz1 1.557 1.064 2.140 1 .143 4.745 Quiz2 -.535 1.023 .273 1 .601 .586 Constant -21.721 8.990 5.838 1 .016 .000

You are given the following data, where X1 (high school cumulative GPA) and X2 (having repeated grade; 0 = never repeated any grade and 1 = have repeated at least one grade; use 0 as the reference category) are used to predict Y (dropping out of high school, "1," vs. graduating high school, "0").
( = .05)
2.50 1 0 2.60 0 0 2.75 0 0 1.33 1 1 3.00 1 0 3.42 0 0 2.70 1 1 2.33 1 1 1.75 0 1 2.80 0 0
Determine the following values based on simultaneous entry of the independent variables: -2LL, constant, b1, b2, se(b1), se(b2), odds ratios, Wald1, Wald2.
Which one of the following can occur when the number of variables equals, or nearly equals, the number of cases in the data?
Filters
- Essay(0)
- Multiple Choice(0)
- Short Answer(0)
- True False(0)
- Matching(0)