Exam 11: Regression With a Binary Dependent Variable


The maximum likelihood estimation method produces, in general, all of the following desirable properties with the exception of

(Multiple Choice)
4.8/5
(37)

Earnings equations establish a relationship between an individual's earnings and their determinants, such as years of education, tenure with an employer, IQ of the individual, professional choice, region within the country the individual is living in, etc. In addition, binary variables are often added to test for "discrimination" against certain sub-groups of the labor force such as blacks, females, etc. Compare this approach to the study in the textbook, which also investigates evidence on discrimination. Explain the fundamental differences between the two approaches, using equations and mathematical specifications whenever possible.

(Essay)
4.8/5
(29)

The logit model derives its name from

(Multiple Choice)
4.9/5
(35)

Consider the following probit regression: Pr(Y = 1 | X) = Φ(8.9 - 0.14 × X). Calculate the change in probability for X increasing by 10 for X = 40 and X = 60. Why is there such a large difference in the change in probabilities?

(Essay)
4.9/5
(28)
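
The arithmetic behind the probit question above can be sketched as follows. This is only an illustration, assuming the coefficients 8.9 and -0.14 exactly as quoted in the question; the helper names `std_normal_cdf` and `probit_prob` are hypothetical, and Φ is evaluated with the standard-library error function.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_prob(x):
    """Pr(Y = 1 | X = x) for the probit index 8.9 - 0.14 * x (from the question)."""
    return std_normal_cdf(8.9 - 0.14 * x)


# Change in predicted probability when X rises by 10, starting at 40 and at 60.
change_from_40 = probit_prob(50) - probit_prob(40)
change_from_60 = probit_prob(70) - probit_prob(60)

print(f"X: 40 -> 50, change in probability = {change_from_40:.4f}")
print(f"X: 60 -> 70, change in probability = {change_from_60:.4f}")
```

At X = 40 the index equals 3.3, far in the upper tail where Φ is nearly flat; at X = 60 it equals 0.5, close to the center where Φ is steep. The same 10-unit step therefore moves the probability much more in the second case.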

(Requires Advanced material) Nonlinear least squares estimators in general are not

(Multiple Choice)
4.9/5
(39)

The probit model

(Multiple Choice)
4.8/5
(26)

The binary dependent variable model is an example of a

(Multiple Choice)
4.8/5
(42)

The Report of the Presidential Commission on the Space Shuttle Challenger Accident in 1986 shows a plot of the calculated joint temperature in Fahrenheit and the number of O-rings that had some thermal distress. You collect the data for the seven flights for which thermal distress was identified before the fatal flight and produce the accompanying plot.
(a) Do you see any relationship between the temperature and the number of O-ring failures? If you fitted a linear regression line through these seven observations, do you think the slope would be positive or negative? Significantly different from zero? Do you see any problems other than the sample size in your procedure?
(b) You decide to look at all successful launches before Challenger, even those for which there were no incidents. Furthermore you simplify the problem by specifying a binary variable, which takes on the value one if there was some O-ring failure and is zero otherwise. You then fit a linear probability model with the following result,
Predicted OFail = 2.858 - 0.037 × Temperature; R² = 0.325, SER = 0.390
(0.496) (0.007)
where OFail is the binary variable which is one for launches where O-rings showed some thermal distress, and Temperature is measured in degrees of Fahrenheit. The numbers in parentheses are heteroskedasticity-robust standard errors. Interpret the equation. Why do you think that heteroskedasticity-robust standard errors were used? What is your prediction for some O-ring thermal distress when the temperature is 31°, the temperature on January 28, 1986? Above which temperature do you predict values of less than zero? Below which temperature do you predict values of greater than one?
(c) To fix the problem encountered in (b), you re-estimate the relationship using a logit regression:
Pr(OFail = 1 | Temperature) = F(15.297 - 0.236 × Temperature); pseudo-R² = 0.297
(7.329) (0.107)
What is the meaning of the slope coefficient? Calculate the effect of a decrease in temperature from 80° to 70°, and from 60° to 50°. Why is the change in probability not constant? How does this compare to the linear probability model?
(d) You want to see how sensitive the results are to using the logit, rather than the probit estimation method. The probit regression is as follows:
Pr(OFail = 1 | Temperature) = Φ(8.900 - 0.137 × Temperature); pseudo-R² = 0.296
(3.983) (0.058)
Why is the slope coefficient in the probit so different from the logit coefficient? Calculate the effect of a decrease in temperature from 80° to 70°, and from 60° to 50°, and compare the resulting changes in probability to your results in (c). What is the meaning of the pseudo-R²? What other measures of fit might you want to consider?
(e) Calculate the predicted probability for 80° and 40°, using your probit and logit estimates. Based on the relationship between the probabilities, sketch what the general relationship between the logit and probit regressions is. Does there seem to be much of a difference for values other than these extreme values?
(f) You decide to run one more regression, where the dependent variable is the actual number of incidences (NoOFail). You allow for a different functional form by choosing the inverse of the temperature, and estimate the regression by OLS.
Predicted NoOFail = -3.8853 + 295.545 × (1/Temperature); R² = 0.386, SER = 0.622
(1.516) (106.541)
What is your prediction for O-ring failures for the 31° temperature which was forecasted for the launch on January 28, 1986? Sketch the fitted line of the regression above.

(Essay)
5.0/5
(37)
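
The numerical parts of the Challenger question above can be checked with a short script. This is a minimal sketch, assuming the coefficients exactly as printed in parts (b), (c), (d), and (f), and assuming that F denotes the logistic CDF; the function names (`lpm`, `logit_prob`, `probit_prob`, `count_ols`) are hypothetical helpers introduced only for this illustration.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z):
    """Logistic CDF, F(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def lpm(temp):
    """Linear probability model from part (b)."""
    return 2.858 - 0.037 * temp


def logit_prob(temp):
    """Logit model from part (c)."""
    return logistic_cdf(15.297 - 0.236 * temp)


def probit_prob(temp):
    """Probit model from part (d)."""
    return std_normal_cdf(8.900 - 0.137 * temp)


def count_ols(temp):
    """OLS on the inverse temperature from part (f)."""
    return -3.8853 + 295.545 / temp


# Part (b): prediction at 31 degrees and the temperatures at which the
# fitted line crosses zero and one.
lpm_zero_temp = 2.858 / 0.037
lpm_one_temp = (2.858 - 1.0) / 0.037
print(f"LPM at 31F: {lpm(31):.3f}")
print(f"LPM < 0 above {lpm_zero_temp:.1f}F, LPM > 1 below {lpm_one_temp:.1f}F")

# Parts (c) and (d): effect of a 10-degree drop at two starting points.
for high, low in [(80, 70), (60, 50)]:
    d_lpm = lpm(low) - lpm(high)
    d_logit = logit_prob(low) - logit_prob(high)
    d_probit = probit_prob(low) - probit_prob(high)
    print(f"{high}F -> {low}F: LPM {d_lpm:.3f}, "
          f"logit {d_logit:.3f}, probit {d_probit:.3f}")

# Part (e): predicted probabilities at the two extreme temperatures.
for temp in (80, 40):
    print(f"{temp}F: logit {logit_prob(temp):.4f}, probit {probit_prob(temp):.4f}")

# Part (f): predicted number of O-ring incidents at 31 degrees.
print(f"Count regression at 31F: {count_ols(31):.2f}")
```

The linear probability model changes by the same amount for every 10-degree drop, while the logit and probit changes depend on the starting temperature because both CDFs are S-shaped.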

The following tools from multiple regression analysis carry over in a meaningful manner to the linear probability model, with the exception of the

(Multiple Choice)
4.8/5
(29)

When testing joint hypotheses, you can use

(Multiple Choice)
4.9/5
(37)

A study analyzed the probability of Major League Baseball (MLB) players to "survive" for another season, or, in other words, to play one more season. The researchers had a sample of 4,728 hitters and 3,803 pitchers for the years 1901-1999. All explanatory variables are standardized. The probit estimation yielded the results as shown in the table:
where the limited dependent variable takes on a value of one if the player had one more season (a minimum of 50 at bats or 25 innings pitched), number of seasons played is measured in years, performance is the batting average for hitters and the earned run average for pitchers, and average performance refers to performance over the career.
(a) Interpret the two probit equations and calculate survival probabilities for hitters and pitchers at the sample mean. Why are these so high?
(b) Calculate the change in the survival probability for a player who has a very bad year by performing two standard deviations below the average (assume also that this player has been in the majors for many years so that his average performance is hardly affected). How does this change the survival probability when compared to the answer in (a)?
(c) Since the results seem similar, the researcher could consider combining the two samples. Explain in some detail how this could be done and how you could test the hypothesis that the coefficients are the same.

(Essay)
4.9/5
(42)

In the expression Pr(Y = 1 | X) = Φ(β0 + β1X),

(Multiple Choice)
4.9/5
(41)

Equation (11.3) in your textbook presents the regression results for the linear probability model, and equation (11.10) the results for the logit model.
a. Using a spreadsheet program such as Excel, plot the predicted probabilities for being denied a loan for both the linear probability model and the logit model if you are black. (Use a range from 0 to 1 for the P/I Ratio and allow for it to increase by increments of 0.05.)
b. Given the shortcomings of the linear probability model, do you think that it is a reasonable approximation to the logit model?
c. Repeat the exercise using predicted probabilities for whites.

(Essay)
4.9/5
(37)

In the probit model Pr(Y = 1 | X) = Φ(β0 + β1X), Φ

(Multiple Choice)
4.8/5
(41)

When having a choice of which estimator to use with a binary dependent variable, use

(Multiple Choice)
4.9/5
(33)

Nonlinear least squares

(Multiple Choice)
4.9/5
(35)

In the expression Pr(deny = 1 | P/I Ratio, black) = Φ(-2.26 + 2.74 × P/I ratio + 0.71 × black), the effect of increasing the P/I ratio from 0.3 to 0.4 for a white person

(Multiple Choice)
4.7/5
(36)
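
The effect asked about in the loan-denial question above is a difference of two probit probabilities. The sketch below assumes the coefficients as printed (-2.26, 2.74, 0.71) and sets black = 0 for a white applicant; `deny_prob` is a hypothetical helper name used only for this illustration.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def deny_prob(pi_ratio, black):
    """Pr(deny = 1 | P/I ratio, black) from the probit in the question."""
    return std_normal_cdf(-2.26 + 2.74 * pi_ratio + 0.71 * black)


# White applicant (black = 0): raise the P/I ratio from 0.3 to 0.4.
p_low = deny_prob(0.3, 0)
p_high = deny_prob(0.4, 0)
effect = p_high - p_low

print(f"Pr(deny) at P/I = 0.3: {p_low:.4f}")
print(f"Pr(deny) at P/I = 0.4: {p_high:.4f}")
print(f"Change: {100 * effect:.1f} percentage points")
```

Because Φ is nonlinear, the same 0.1 increase in the P/I ratio would give a different change for a different starting ratio or for black = 1.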

F-statistics computed using maximum likelihood estimators

(Multiple Choice)
4.8/5
(36)

In the probit regression, the coefficient β1 indicates

(Multiple Choice)
4.9/5
(37)

(Requires Appendix material and Calculus) The log of the likelihood function (L) for the simple regression model with i.i.d. normal errors is as follows (note that taking the logarithm of the likelihood function simplifies maximization. It is a monotonic transformation of the likelihood function, meaning that this transformation does not affect the choice of maximum):
L = -(n/2) log(2π) - (n/2) log σ² - (1/(2σ²)) Σ (Yi - β0 - β1Xi)²
Derive the maximum likelihood estimator for the slope and intercept. What general properties do these estimators have? Explain intuitively why the OLS estimator is identical to the maximum likelihood estimator here.

(Essay)
4.7/5
(41)
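
The derivation requested in the maximum likelihood question above can be outlined as follows. This is a sketch, assuming the standard log-likelihood of the simple regression model with i.i.d. normal errors, with the sum running over i = 1, ..., n; the bar and hat notation is introduced here for illustration.

```latex
\begin{align*}
L &= -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2
     - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(Y_i - \beta_0 - \beta_1 X_i\right)^2 \\[4pt]
\frac{\partial L}{\partial \beta_0}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}\left(Y_i - \beta_0 - \beta_1 X_i\right) = 0 \\[4pt]
\frac{\partial L}{\partial \beta_1}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n} X_i\left(Y_i - \beta_0 - \beta_1 X_i\right) = 0 \\[4pt]
\hat\beta_1 &= \frac{\sum_{i=1}^{n}(X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n}(X_i - \bar X)^2},
\qquad
\hat\beta_0 = \bar Y - \hat\beta_1 \bar X
\end{align*}
```

The coefficients enter L only through the sum of squared residuals, which carries a negative sign. Maximizing L over β0 and β1 is therefore the same problem as minimizing the sum of squared residuals, which is why the first-order conditions coincide with the OLS normal equations.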
Showing 21 - 40 of 50