Exam 20: Model Building


In a stepwise regression procedure, if two independent variables are highly correlated, then one variable usually eliminates the second variable.

Free
(True/False)
4.8/5
(30)
Correct Answer:
Verified

True

We interpret the coefficients in a multiple regression model by holding all variables in the model constant.

Free
(True/False)
4.9/5
(38)
Correct Answer:
Verified

False

In the first-order model $\hat{y} = 75 - 12x_1 + 5x_2 - 3x_1x_2$, a unit increase in $x_1$, while holding $x_2$ constant at a value of 2, decreases the value of $y$ on average by 8 units.

Free
(True/False)
4.9/5
(42)
Correct Answer:
Verified

False
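
The statement is false because of the interaction term: with $x_2$ fixed, the change in the fitted value for a one-unit increase in $x_1$ is $-12 - 3x_2$, which is $-18$ at $x_2 = 2$, not $-8$. A minimal Python sketch of that arithmetic follows; the function names `predict` and `effect_of_x1` are illustrative and not part of the question.

```python
def predict(x1, x2):
    """Fitted value from the equation in the question: 75 - 12*x1 + 5*x2 - 3*x1*x2."""
    return 75 - 12 * x1 + 5 * x2 - 3 * x1 * x2


def effect_of_x1(x2):
    """Change in the fitted value for a one-unit increase in x1 at a fixed x2."""
    # The model is linear in x1 for fixed x2, so any one-unit step gives the same change.
    return predict(1, x2) - predict(0, x2)


print(effect_of_x1(2))  # -18
```

The value $-8$ in the statement would only arise if the interaction coefficient were ignored or given the wrong sign.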

In explaining the income earned by university graduates, which of the following independent variables is best represented by an indicator variable in a regression model?

(Multiple Choice)
4.8/5
(29)

The owner of an air conditioner business wants to investigate the relationship between the weekly number of air conditioners sold, temperature and the seasons of the year. A random sample of 14 weeks is taken, with the average temperature of that week (in degrees Celsius) and the quarter to which that week belonged noted. There are three indicator variables: March, September and December. Excel is used to generate the following multiple linear regression output.

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.99
R Square             0.97
Adjusted R Square    0.96
Standard Error       4.54
Observations         14.00

ANOVA
              df      SS        MS        F       Significance
Regression    4.00    6999.27   1749.82   84.86   0.00
Residual      9.00    185.58    20.62
Total         13.00   7184.86

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept     -17.94         8.54             -2.10    0.07      -37.27      1.38
Temperature   1.00           0.35             2.84     0.02      0.20        1.79
March         1.01           4.60             0.22     0.83      -9.39       11.40
September     7.22           5.58             1.29     0.23      -5.40       19.84
December      27.87          6.55             4.26     0.00      13.06       42.68

(a) Write the linear regression model for each quarter: March, June, September and December.
(b) Roughly sketch the four models on the same set of axes, showing the intercept and the slope.

(Essay)
4.7/5
(41)
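
One way to approach part (a) of the air conditioner question: each indicator shifts the intercept while the temperature slope stays the same, so the four quarterly models are parallel lines. The sketch below assumes June is the baseline quarter (it has no indicator, so all three indicators are 0); the names `QUARTER_SHIFT` and `quarter_models` are illustrative only.

```python
# Coefficients copied from the Excel output in the question.
INTERCEPT = -17.94
TEMPERATURE_SLOPE = 1.00
# Assumption: June is the baseline quarter, so its intercept shift is 0.
QUARTER_SHIFT = {"March": 1.01, "June": 0.0, "September": 7.22, "December": 27.87}


def quarter_models():
    """Return {quarter: (intercept, slope)} for the fitted line in each quarter."""
    return {
        quarter: (round(INTERCEPT + shift, 2), TEMPERATURE_SLOPE)
        for quarter, shift in QUARTER_SHIFT.items()
    }


for quarter, (intercept, slope) in quarter_models().items():
    print(f"{quarter}: sales = {intercept:.2f} + {slope:.2f} * temperature")
```

Under that assumption the intercepts are -16.93 (March), -17.94 (June), -10.72 (September) and 9.93 (December), each with slope 1.00.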

A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to the conclusion that two important variables are the number of cars and the number of tractor-trailer trucks. She proposed the second-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$

where:
y = number of annual fatalities per shire.
$x_1$ = number of cars registered in the shire (in units of 10 000).
$x_2$ = number of trucks registered in the shire (in units of 1000).

The computer output (based on a random sample of 35 shires) is shown below.

THE REGRESSION EQUATION IS
$y = 69.7 + 11.3 x_1 + 7.61 x_2 - 1.15 x_1^2 - 0.51 x_2^2 - 0.13 x_1 x_2$

Predictor    Coef     StDev    T
Constant     69.7     41.3     1.688
x1           11.3     5.1      2.216
x2           7.61     2.55     2.984
x1^2         -1.15    0.64     -1.797
x2^2         -0.51    0.20     -2.55
x1*x2        -0.13    0.10     -1.30

S = 15.2    R-Sq = 47.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             5     5959     1191.800    5.181
Error                  29    6671     230.034
Total                  34    12630

Test at the 1% significance level to determine whether the $x_1$ term should be retained in the model.

(Essay)
4.7/5
(27)
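
A test of an individual term in the traffic model compares its t statistic (coefficient divided by its standard deviation) with a critical value on n - k - 1 = 35 - 5 - 1 = 29 degrees of freedom. The sketch below recomputes the t ratios from the output; the critical value 2.756 is taken from a standard t table for a two-tailed 1% test with 29 degrees of freedom and is an assumption here, as are the names `TERMS` and `is_significant`.

```python
# (coefficient, standard deviation) pairs copied from the output in the question.
TERMS = {
    "x1": (11.3, 5.1),
    "x2": (7.61, 2.55),
    "x1^2": (-1.15, 0.64),
    "x2^2": (-0.51, 0.20),
    "x1*x2": (-0.13, 0.10),
}

# Assumption: two-tailed critical value t(0.005, 29 df) read from a t table.
T_CRITICAL = 2.756


def t_statistic(term):
    """t ratio for H0: beta = 0, computed as coefficient / standard deviation."""
    coef, st_dev = TERMS[term]
    return coef / st_dev


def is_significant(term):
    """True when |t| exceeds the critical value, i.e. H0: beta = 0 is rejected."""
    return abs(t_statistic(term)) > T_CRITICAL


for term in TERMS:
    print(f"{term}: t = {t_statistic(term):.3f}, significant at 1%: {is_significant(term)}")
```

For the $x_1$ term the ratio is about 2.216, which is below 2.756, so on this criterion there is not enough evidence at the 1% level that its coefficient differs from zero.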

Which of the following describes the numbers that an indicator variable can have in a regression model?

(Multiple Choice)
5.0/5
(29)

The graph of the model $\hat{y} = \beta_0 + \beta_1 x_i + \beta_2 x_i^2$ is shaped like a straight line going upwards.

(True/False)
4.8/5
(30)

In general, to represent a categorical independent variable that has m possible categories, which of the following is the number of dummy variables that can be used in the regression model?

(Multiple Choice)
4.8/5
(37)
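
A categorical variable with m categories is normally represented by m - 1 indicator variables, because the omitted (baseline) category is absorbed into the intercept; using all m alongside an intercept makes the columns perfectly collinear. A minimal sketch of such an encoding, with a made-up sample and an illustrative helper name `indicator_columns`:

```python
def indicator_columns(values, baseline):
    """Encode a categorical variable with m levels as m - 1 indicator (0/1) columns."""
    levels = sorted(set(values))
    if baseline not in levels:
        raise ValueError("baseline must be one of the observed categories")
    # One column per non-baseline level; the baseline is the row of all zeros.
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
        if level != baseline
    }


# Hypothetical sample: three categories, so two indicator columns are produced.
professions = ["physician", "dentist", "lawyer", "dentist"]
columns = indicator_columns(professions, baseline="lawyer")
print(columns)
```

Here a lawyer is the observation for which both indicators are 0, which mirrors how the baseline group is coded in the questions on this page.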

An economist is analysing the incomes of professionals (physicians, dentists and lawyers). He realises that an important factor is the number of years of experience. However, he wants to know if there are differences among the three professional groups. He takes a random sample of 125 professionals and estimates the multiple regression model:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$

where:
y = annual income (in $1000).
$x_1$ = years of experience.
$x_2$ = 1 if physician; 0 if not.
$x_3$ = 1 if dentist; 0 if not.

The computer output is shown below.

THE REGRESSION EQUATION IS
$y = 71.65 + 2.07 x_1 + 10.16 x_2 - 7.44 x_3$

Predictor    Coef     StDev    T
Constant     71.65    18.56    3.860
x1           2.07     0.81     2.556
x2           10.16    3.16     3.215
x3           -7.44    2.85     -2.611

S = 42.6    R-Sq = 30.9%

ANALYSIS OF VARIANCE
Source of Variation    df     SS        MS           F
Regression             3      98008     32669.333    18.008
Error                  121    219508    1814.116
Total                  124    317516

Is there enough evidence at the 10% significance level to conclude that dentists earn less on average than lawyers?

(Essay)
4.9/5
(31)

A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to the conclusion that two important variables are the number of cars and the number of tractor-trailer trucks. She proposed the second-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$

where:
y = number of annual fatalities per shire.
$x_1$ = number of cars registered in the shire (in units of 10 000).
$x_2$ = number of trucks registered in the shire (in units of 1000).

The computer output (based on a random sample of 35 shires) is shown below.

THE REGRESSION EQUATION IS
$y = 69.7 + 11.3 x_1 + 7.61 x_2 - 1.15 x_1^2 - 0.51 x_2^2 - 0.13 x_1 x_2$

Predictor    Coef     StDev    T
Constant     69.7     41.3     1.688
x1           11.3     5.1      2.216
x2           7.61     2.55     2.984
x1^2         -1.15    0.64     -1.797
x2^2         -0.51    0.20     -2.55
x1*x2        -0.13    0.10     -1.30

S = 15.2    R-Sq = 47.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             5     5959     1191.800    5.181
Error                  29    6671     230.034
Total                  34    12630

Is there enough evidence at the 5% significance level to conclude that the model is useful in predicting the number of fatalities?

(Essay)
4.8/5
(42)
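
The overall usefulness of the traffic model is judged with the F statistic from the ANOVA table, F = MSR / MSE, on 5 and 29 degrees of freedom. The sketch below recomputes F and R-squared from the sums of squares in the output; the critical value 2.55 is read from a standard F table for a 5% test with (5, 29) degrees of freedom and is an assumption here, as is the helper name `anova_f`.

```python
def anova_f(ss_regression, df_regression, ss_error, df_error):
    """F statistic for overall model utility: mean square regression / mean square error."""
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    return ms_regression / ms_error


# Assumption: critical value F(0.05; 5, 29) read from an F table.
F_CRITICAL = 2.55

# Sums of squares and degrees of freedom copied from the ANOVA table in the question.
f_stat = anova_f(5959, 5, 6671, 29)
r_squared = 5959 / (5959 + 6671)

print(f"F = {f_stat:.3f}, R-Sq = {r_squared:.1%}, reject H0: {f_stat > F_CRITICAL}")
```

The recomputed F of about 5.181 matches the printed table and exceeds 2.55, so under these assumptions the hypothesis that all five slope coefficients are zero would be rejected at the 5% level.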

A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to the conclusion that two important variables are the number of cars and the number of tractor-trailer trucks. She proposed the second-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$

where:
y = number of annual fatalities per shire.
$x_1$ = number of cars registered in the shire (in units of 10 000).
$x_2$ = number of trucks registered in the shire (in units of 1000).

The computer output (based on a random sample of 35 shires) is shown below.

THE REGRESSION EQUATION IS
$y = 69.7 + 11.3 x_1 + 7.61 x_2 - 1.15 x_1^2 - 0.51 x_2^2 - 0.13 x_1 x_2$

Predictor    Coef     StDev    T
Constant     69.7     41.3     1.688
x1           11.3     5.1      2.216
x2           7.61     2.55     2.984
x1^2         -1.15    0.64     -1.797
x2^2         -0.51    0.20     -2.55
x1*x2        -0.13    0.10     -1.30

S = 15.2    R-Sq = 47.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             5     5959     1191.800    5.181
Error                  29    6671     230.034
Total                  34    12630

Test at the 1% significance level to determine whether the $x_2^2$ term should be retained in the model.

(Essay)
4.9/5
(30)

Which of the following is another name for a dummy variable?

(Multiple Choice)
4.9/5
(30)

A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to the conclusion that two important variables are the number of cars and the number of tractor-trailer trucks. She proposed the second-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$

where:
y = number of annual fatalities per shire.
$x_1$ = number of cars registered in the shire (in units of 10 000).
$x_2$ = number of trucks registered in the shire (in units of 1000).

The computer output (based on a random sample of 35 shires) is shown below.

THE REGRESSION EQUATION IS
$y = 69.7 + 11.3 x_1 + 7.61 x_2 - 1.15 x_1^2 - 0.51 x_2^2 - 0.13 x_1 x_2$

Predictor    Coef     StDev    T
Constant     69.7     41.3     1.688
x1           11.3     5.1      2.216
x2           7.61     2.55     2.984
x1^2         -1.15    0.64     -1.797
x2^2         -0.51    0.20     -2.55
x1*x2        -0.13    0.10     -1.30

S = 15.2    R-Sq = 47.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             5     5959     1191.800    5.181
Error                  29    6671     230.034
Total                  34    12630

Test at the 1% significance level to determine whether the $x_1^2$ term should be retained in the model.

(Essay)
4.8/5
(36)

Suppose that the sample regression equation of a model is $\hat{y} = 10 + 4x_1 + 3x_2 - x_1x_2$. If we examine the relationship between $x_1$ and y for three different values of $x_2$, we observe that the:

(Multiple Choice)
4.9/5
(28)
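
For the equation with fitted value $10 + 4x_1 + 3x_2 - x_1x_2$, fixing $x_2$ leaves a straight line in $x_1$ with intercept $10 + 3x_2$ and slope $4 - x_2$, so both change as $x_2$ changes. A small sketch of that calculation; the three $x_2$ values are hypothetical, since the question does not state them, and the function names are illustrative.

```python
def intercept_in_x1(x2):
    """Intercept of the fitted line in x1 when x2 is held fixed: 10 + 3*x2."""
    return 10 + 3 * x2


def slope_in_x1(x2):
    """Slope of the fitted line in x1 when x2 is held fixed: 4 - x2."""
    return 4 - x2


# Hypothetical x2 values chosen only for illustration.
for x2 in (1, 2, 3):
    print(f"x2 = {x2}: y-hat = {intercept_in_x1(x2)} + {slope_in_x1(x2)} * x1")
```

Because the slope depends on $x_2$, the three lines are not parallel, which is the signature of an interaction term.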

For the regression equation $\hat{y} = 20 + 8x_1 + 5x_2 + 3x_1x_2$, which combination of $x_1$ and $x_2$, respectively, results in the largest average value of y?

(Multiple Choice)
4.9/5
(36)

An avid football fan was in the process of examining the factors that determine the success or failure of football teams. He noticed that teams with many rookies and teams with many veterans seem to do quite poorly. To further analyse his beliefs, he took a random sample of 20 teams and proposed a second-order model with one independent variable. The selected model is:

$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$

where:
y = team's winning percentage.
x = average years of professional experience.

The computer output is shown below.

THE REGRESSION EQUATION IS
$y = 32.6 + 5.96 x - 0.48 x^2$

Predictor    Coef     StDev    T
Constant     32.6     19.3     1.689
x            5.96     2.41     2.473
x^2          -0.48    0.22     -2.182

S = 16.1    R-Sq = 43.9%

ANALYSIS OF VARIANCE
Source of Variation    df    SS      MS         F
Regression             2     3452    1726       6.663
Error                  17    4404    259.059
Total                  19    7856

Do these results allow us to conclude at the 5% significance level that the model is useful in predicting the team's winning percentage?

(Essay)
4.9/5
(39)

A traffic consultant has analysed the factors that affect the number of traffic fatalities. She has come to the conclusion that two important variables are the number of cars and the number of tractor-trailer trucks. She proposed the second-order model with interaction:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \varepsilon$

where:
y = number of annual fatalities per shire.
$x_1$ = number of cars registered in the shire (in units of 10 000).
$x_2$ = number of trucks registered in the shire (in units of 1000).

The computer output (based on a random sample of 35 shires) is shown below.

THE REGRESSION EQUATION IS
$y = 69.7 + 11.3 x_1 + 7.61 x_2 - 1.15 x_1^2 - 0.51 x_2^2 - 0.13 x_1 x_2$

Predictor    Coef     StDev    T
Constant     69.7     41.3     1.688
x1           11.3     5.1      2.216
x2           7.61     2.55     2.984
x1^2         -1.15    0.64     -1.797
x2^2         -0.51    0.20     -2.55
x1*x2        -0.13    0.10     -1.30

S = 15.2    R-Sq = 47.2%

ANALYSIS OF VARIANCE
Source of Variation    df    SS       MS          F
Regression             5     5959     1191.800    5.181
Error                  29    6671     230.034
Total                  34    12630

What does the coefficient of $x_1^2$ tell you about the model?

(Essay)
4.9/5
(35)

In a first-order model with two predictors, $x_1$ and $x_2$, an interaction term may be used when the relationship between the dependent variable $y$ and the predictor variables is linear.

(True/False)
4.7/5
(39)

An indicator variable (also called a dummy variable) is a variable that can assume either one of two values (usually 0 and 1), where one value represents the existence of a certain condition, and the other value indicates that the condition does not hold.

(True/False)
4.8/5
(25)
Showing 1 - 20 of 100