TABLE 14-6 One of the most common questions of prospective house buyers pertains to the average cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X₁), the amount of insulation in inches (X₂), the number of windows in the house (X₃), and the age of the furnace in years (X₄). Given below are the EXCEL outputs of two regression models.

Model 1

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{R Square} & 0.8080 \\ \text{Adjusted R Square} & 0.7568 \\ \text{Observations} & 20 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 4 & 169503.4241 & 42375.86 & 15.7874 & 2.96869\mathrm{E}-05 \\ \text{Residual} & 15 & 40262.3259 & 2684.155 & & \\ \text{Total} & 19 & 209765.75 & & & \\ \hline \end{array}$

$\begin{array}{lrrrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} & \text{Lower 90.0\%} & \text{Upper 90.0\%} \\ \hline \text{Intercept} & 421.4277 & 77.8614 & 5.4125 & 7.2\mathrm{E}-05 & 284.9327 & 557.9227 \\ X_{1} \text{ (Temperature)} & -4.5098 & 0.8129 & -5.5476 & 5.58\mathrm{E}-05 & -5.9349 & -3.0847 \\ X_{2} \text{ (Insulation)} & -14.9029 & 5.0508 & -2.9505 & 0.0099 & -23.7573 & -6.0485 \\ X_{3} \text{ (Windows)} & 0.2151 & 4.8675 & 0.0442 & 0.9653 & -8.3181 & 8.7484 \\ X_{4} \text{ (Furnace Age)} & 6.3780 & 4.1026 & 1.5546 & 0.1408 & -0.8140 & 13.5702 \\ \hline \end{array}$

Model 2

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{R Square} & 0.7768 \\ \text{Adjusted R Square} & 0.7506 \\ \text{Observations} & 20 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 2 & 162958.2277 & 81479.11 & 29.5923 & 2.9036\mathrm{E}-06 \\ \text{Residual} & 17 & 46807.5222 & 2753.384 & & \\ \text{Total} & 19 & 209765.75 & & & \\ \hline \end{array}$

$\begin{array}{lrrrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} & \text{Lower 95\%} & \text{Upper 95\%} \\ \hline \text{Intercept} & 489.3227 & 43.9826 & 11.1253 & 3.17\mathrm{E}-09 & 396.5273 & 582.1180 \\ X_{1} \text{ (Temperature)} & -5.1103 & 0.6951 & -7.3515 & 1.13\mathrm{E}-06 & -6.5769 & -3.6437 \\ X_{2} \text{ (Insulation)} & -14.7195 & 4.8864 & -3.0123 & 0.0078 & -25.0290 & -4.4099 \\ \hline \end{array}$

Referring to Table 14-6, the estimated value of the partial regression parameter β₁ in Model 1 means that

A) all else equal, a 1 degree increase in the daily minimum outside temperature results in an estimated expected decrease in average heating costs by $4.51.
B) all else equal, a 1 degree increase in the daily minimum outside temperature results in a decrease in average heating costs by $4.51.
C) all else equal, a 1% increase in the daily minimum outside temperature results in an estimated expected decrease in average heating costs by 4.51%.
D) all else equal, an estimated expected $1 increase in average heating costs is associated with a decrease in the daily minimum outside temperature by 4.51 degrees.

TABLE 14-10 You worked as an intern at We Always Win Car Insurance Company last summer. You noticed that individual car insurance premiums depend very much on the age of the individual, the number of traffic tickets received by the individual, and the population density of the city in which the individual lives. You performed a regression analysis in EXCEL and obtained the following information:

Regression Analysis

$\begin{array}{lr} \hline \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.63 \\ \text{R Square} & 0.40 \\ \text{Adjusted R Square} & 0.23 \\ \text{Standard Error} & 50.00 \\ \text{Observations} & 15.00 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 3 & & 5994.24 & 2.40 & 0.12 \\ \text{Residual} & 11 & 27496.82 & & & \\ \text{Total} & & 45479.54 & & & \\ \hline \end{array}$

$\begin{array}{lrrrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} & \text{Lower 99.0\%} & \text{Upper 99.0\%} \\ \hline \text{Intercept} & 123.80 & 48.71 & 2.54 & 0.03 & -27.47 & 275.07 \\ \text{AGE} & -0.82 & 0.87 & -0.95 & 0.36 & -3.51 & 1.87 \\ \text{TICKETS} & 21.25 & 10.66 & 1.99 & 0.07 & -11.86 & 54.37 \\ \text{DENSITY} & -3.14 & 6.46 & -0.49 & 0.64 & -23.19 & 16.91 \\ \hline \end{array}$

Referring to Table 14-10, to test the significance of the multiple regression model, what is the form of the null hypothesis?

A) $ H_{0}: \beta_{1} $
B) $ H_{0}: \beta_{1}=\beta_{2}=\beta_{3} $
C) $ H_{0}: \beta_{0} $
D) $ H_{0}: \beta_{0}=\beta_{1}=\beta_{2}=\beta_{3} $

TABLE 14-8 A financial analyst wanted to examine the relationship between salary (in $1,000) and 4 variables: age (X₁ = Age), experience in the field (X₂ = Exper), number of degrees (X₃ = Degrees), and number of previous jobs in the field (X₄ = Prevjobs). He took a sample of 20 employees and obtained the following Microsoft Excel output:

$\text{SUMMARY OUTPUT}$

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.992 \\ \text{R Square} & 0.984 \\ \text{Adjusted R Square} & 0.979 \\ \text{Standard Error} & 2.26743 \\ \text{Observations} & 20 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 4 & 4609.83164 & 1152.45791 & 224.160 & 0.0001 \\ \text{Residual} & 15 & 77.11836 & 5.14122 & & \\ \text{Total} & 19 & 4686.95000 & & & \\ \hline \end{array}$

$\begin{array}{lrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} \\ \hline \text{Intercept} & -9.611198 & 2.77988638 & -3.457 & 0.0035 \\ \text{Age} & 1.327695 & 0.11491930 & 11.553 & 0.0001 \\ \text{Exper} & -0.106705 & 0.14265559 & -0.748 & 0.4660 \\ \text{Degrees} & 7.311332 & 0.80324187 & 9.102 & 0.0001 \\ \text{Prevjobs} & -0.504168 & 0.44771573 & -1.126 & 0.2778 \\ \hline \end{array}$

Referring to Table 14-8, the critical value of an F test on the entire regression for a level of significance of 0.01 is _____.

Referring to Table 14-8 (scenario and Excel output given above), the analyst decided to obtain a 99% confidence interval for β₃. The confidence interval is from _____ to _____.

TABLE 14-7 The department head of the accounting department wanted to see if she could predict the GPA of students using the number of course units (credits) and total SAT scores of each. She takes a sample of students and generates the following Microsoft Excel output: $\text {SUMMARY OUTPUT}$ $\begin{array}{ll} \hline \text { Regression Statistics } \\ \hline \text { Multiple R }& 0.916 \\ \text { R Square} & 0.839 \\ \text { Adjusted R Square} & 0.732 \\ \text { Standard Error }& 0.24685 \\ \text { Observations} & 6 \\ \hline \end{array}$ ANOVA $\begin{array}{lccclc} \hline & \text {d f }& \text { SS }& \text { M S } & \text { F } & \text {Significance F } \\ \hline \text { Regression} & 2 & 0.95219 & 0.47610 & 7.813 & 0.0646 \\ \text {Residual} & 3 & 0.18281 & 0.06094 & & \\ \text {Total} & 5 & 1.13500 & & & \\ \hline \end{array}$ $\begin{array}{lrcrr} \hline & \text {Coefficients }& \text {Standard Error} & \text {t Stat } & \text { p -value} \\ \hline \text { Intercept }& 4.593897 & 1.13374542 & 4.052 & 0.0271 \\ \text {Units }& -0.247270 & 0.06268485 & -3.945 & 0.0290 \\ \text {SAT Total }& 0.001443 & 0.00101241 & 1.425 & 0.2494 \\ \hline \end{array}$ -Referring to Table 14-7, the department head wants to use a t test to test for the significance of the coefficient of X1. For a level of significance of 0.05, the critical values of the test are______

TABLE 14-14 An econometrician is interested in evaluating the relation of demand for building materials to mortgage rates in Los Angeles and San Francisco. He believes that the appropriate model is Y = 10 + 5X₁ + 8X₂, where X₁ = mortgage rate in %, X₂ = 1 if SF, 0 if LA, and Y = demand in $100 per capita.

Referring to Table 14-14, the fitted model for predicting demand in San Francisco is _____.

A) 10 + 13X₁
B) 18 + 5X₁
C) 10 + 5X₁
D) 15 + 8X₂
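
As a hedged illustration of how a 0/1 dummy variable works in the stated model Y = 10 + 5X₁ + 8X₂, the sketch below fixes X₂ at 1 (SF) and at 0 (LA) and reads off the resulting intercept and slope in X₁. The function names `demand` and `city_line` are invented for this example and are not part of the original table.

```python
def demand(mortgage_rate, is_sf):
    """Stated model from Table 14-14: Y = 10 + 5*X1 + 8*X2."""
    return 10 + 5 * mortgage_rate + 8 * is_sf


def city_line(is_sf):
    """Return (intercept, slope in X1) once the dummy X2 is held fixed."""
    intercept = demand(0, is_sf)
    slope = demand(1, is_sf) - demand(0, is_sf)
    return intercept, slope


sf_line = city_line(1)  # X2 = 1 for San Francisco
la_line = city_line(0)  # X2 = 0 for Los Angeles
print("SF (intercept, slope):", sf_line)
print("LA (intercept, slope):", la_line)
```

The dummy changes only the intercept; the slope on the mortgage rate is the same for both cities because the model has no interaction term.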

TABLE 14-4 A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below:

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.865 \\ \text{R Square} & 0.748 \\ \text{Adjusted R Square} & 0.726 \\ \text{Standard Error} & 5.195 \\ \text{Observations} & 50 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & & 3605.7736 & 1201.9245 & & 0.0000 \\ \text{Residual} & & 1214.2264 & 26.3962 & & \\ \text{Total} & 49 & 4820.0000 & & & \\ \hline \end{array}$

$\begin{array}{lrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} \\ \hline \text{Intercept} & -1.6335 & 5.8078 & -0.281 & 0.7798 \\ \text{Income} & 0.4485 & 0.1137 & 3.9545 & 0.0003 \\ \text{Size} & 4.2615 & 0.8062 & 5.286 & 0.0001 \\ \text{School} & -0.6517 & 0.4319 & -1.509 & 0.1383 \\ \hline \end{array}$

Referring to Table 14-4, the observed value of the F-statistic is missing from the printout. What are the degrees of freedom for this F-statistic?

A) 46 for the numerator, 49 for the denominator
B) 3 for the numerator, 49 for the denominator
C) 3 for the numerator, 46 for the denominator
D) 46 for the numerator, 3 for the denominator
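
The blanks in the Table 14-4 ANOVA can be cross-checked with a few lines of arithmetic. The sketch below assumes the usual multiple-regression layout (k slope coefficients, n observations, regression df = k, residual df = n - k - 1) and uses only figures printed in the table; the variable names are chosen for this example.

```python
n = 50  # families sampled
k = 3   # explanatory variables: Income, Size, School

ssr = 3605.7736  # regression sum of squares from the ANOVA table
sse = 1214.2264  # residual sum of squares from the ANOVA table

df_regression = k          # numerator degrees of freedom
df_residual = n - k - 1    # denominator degrees of freedom

msr = ssr / df_regression  # should reproduce the printed MS for Regression
mse = sse / df_residual    # should reproduce the printed MS for Residual
f_stat = msr / mse         # the F value missing from the printout

print("df:", df_regression, df_residual)
print("MSR:", round(msr, 4), "MSE:", round(mse, 4))
print("F:", round(f_stat, 2))
```

Recovering the printed mean squares (1201.9245 and 26.3962) from the sums of squares is a useful sanity check that the degrees of freedom were assigned to the right rows.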

TABLE 14-17 The marketing manager for a nationally franchised lawn service company would like to study the characteristics that differentiate home owners who do and do not have a lawn service. A random sample of 30 home owners located in a suburban area near a large city was selected; 15 did not have a lawn service (code 0) and 15 had a lawn service (code 1). Additional information available concerning these 30 home owners includes family income (Income, in thousands of dollars), lawn size (Lawn Size, in thousands of square feet), attitude toward outdoor recreational activities (Attitude 0 = unfavorable, 1 = favorable), number of teenagers in the household (Teenager), and age of the head of the household (Age). The Minitab output is given below: $\begin{array}{lccccccrr} & & & & \text { Odds } & \text { 95 \% CI } \\ \text {Predictor }& \text {Coef }&\text { SE Coef }&\text { Z }& \text {P} &\text { Ratio} &\text { Lower} & \text {Upper }\\ \text {Constant }& -70.49 & 47.22 & -1.49 & 0.135 & & & \\ \text {Income }& 0.2868 & 0.1523 & 1.88 & 0.060 & 1.33 & 0.99 & 1.80 \\ \text {Lawn Size} & 1.0647 & 0.7472 & 1.42 & 0.154 & 2.90 & 0.67 & 12.54 \\ \text {Attitude} & -12.744 & 9.455 & -1.35 & 0.178 & 0.00 & 0.00 & 326.06 \\ \text {Teenager} & -0.200 & 1.061 & -0.19 & 0.850 & 0.82 & 0.10 & 6.56 \\ \text {Age} & 1.0792 & 0.8783 & 1.23 & 0.219 & 2.94 & 0.53 & 16.45 \end{array}$ Log-Likelihood = -4.890 Test that all slopes are zero: G = 31.808, DF = 5, P-Value = 0.000 Goodness-of-Fit Tests $ \begin{array}{lrrr}\text { Method } & \text { Chi-Square } & \text { DF } & \text { P } \\ \text { Pearson } & 9.313 & 24 & 0.997 \\ \text { Deviance } & 9.780 & 24 & 0.995 \\ \text { Hosmer-Lemeshow } & 0.571 & 8 & 1.000\end{array} $ -Referring to Table 14-17, what is the p-value of the test statistic when testing whether Teenager makes a significant contribution to the model in the presence of the other independent variables?

TABLE 14-11 A logistic regression model was estimated in order to predict the probability that a randomly chosen university or college would be a private university using information on average total Scholastic Aptitude Test score (SAT) at the university or college, the room and board expense measured in thousands of dollars (Room/Brd), and whether the TOEFL criterion is at least 550 (Toefl550 = 1 if yes, 0 otherwise). The dependent variable, Y, is school type (Type = 1 if private and 0 otherwise).

Logistic Regression Table

$\begin{array}{lrrrrrrr} \hline \text{Predictor} & \text{Coef} & \text{SE Coef} & \text{Z} & \text{P} & \text{Odds Ratio} & \text{95\% CI Lower} & \text{95\% CI Upper} \\ \hline \text{Constant} & -27.118 & 6.696 & -4.05 & 0.000 & & & \\ \text{SAT} & 0.015 & 0.004666 & 3.17 & 0.002 & 1.01 & 1.01 & 1.02 \\ \text{Toefl550} & -0.390 & 0.9538 & -0.41 & 0.682 & 0.68 & 0.10 & 4.39 \\ \text{Room/Brd} & 2.078 & 0.5076 & 4.09 & 0.000 & 7.99 & 2.95 & 21.60 \\ \hline \end{array}$

Log-Likelihood = -21.883

Test that all slopes are zero: G = 62.083, DF = 3, P-Value = 0.000

Goodness-of-Fit Tests

$\begin{array}{lrrr} \hline \text{Method} & \text{Chi-Square} & \text{DF} & \text{P} \\ \hline \text{Pearson} & 143.551 & 76 & 0.000 \\ \text{Deviance} & 43.767 & 76 & 0.999 \\ \text{Hosmer-Lemeshow} & 15.731 & 8 & 0.046 \\ \hline \end{array}$

Referring to Table 14-11, which of the following is the correct interpretation for the Toefl550 slope coefficient?

A) Holding constant the effect of the other variables, the estimated average value of school type is 0.39 lower when the school has a TOEFL criterion that is at least 550.
B) Holding constant the effect of the other variables, the estimated probability of the school being a private school is 0.39 lower for a school that has a TOEFL criterion that is at least 550 than one that does not.
C) Holding constant the effect of the other variables, the estimated school type decreases by 0.39 when the school has a TOEFL criterion that is at least 550.
D) Holding constant the effect of the other variables, the estimated natural logarithm of the odds ratio of the school being a private school is 0.39 lower for a school that has a TOEFL criterion that is at least 550 than one that does not.

Referring to Table 14-11 (scenario and Minitab output given above), what is the estimated odds ratio for a school with an average SAT score of 1250, a TOEFL criterion that is at least 550, and the room and board expense of 5 thousand dollars?
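
A minimal sketch of the calculation this question asks for, using the Table 14-11 coefficients. It reads "estimated odds ratio" as the fitted odds exp(b₀ + b₁·SAT + b₂·Toefl550 + b₃·Room/Brd), which is an assumption about the question's wording; the dictionary names are made up for the example.

```python
import math

# Coefficients copied from the Logistic Regression Table (Table 14-11)
coef = {"Constant": -27.118, "SAT": 0.015, "Toefl550": -0.390, "Room/Brd": 2.078}

# The school described in the question
school = {"SAT": 1250, "Toefl550": 1, "Room/Brd": 5}

# Linear predictor = estimated natural log of the odds of being private
log_odds = coef["Constant"] + sum(coef[name] * value for name, value in school.items())

odds = math.exp(log_odds)        # estimated odds of being a private school
probability = odds / (1 + odds)  # corresponding estimated probability

print("ln(odds):", round(log_odds, 3))
print("odds:", round(odds, 3))
print("probability:", round(probability, 4))
```

Converting the odds to a probability in the last step shows how the two quantities relate; the slope coefficients themselves act on the log-odds scale.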

Exam 14: Introduction to Multiple Regression

Referring to Table 14-6 (scenario and EXCEL outputs given above), the estimated value of the partial regression parameter β₁ in Model 1 means that

(Multiple Choice)

Question 81

Referring to Table 14-8 (scenario and Excel output given above), the F test for the significance of the entire regression performed at a level of significance of 0.01 leads to a rejection of the null hypothesis.

(True/False)

Question 82

The total sum of squares (SST) in a regression model will never exceed the regression sum of squares (SSR).

(True/False)

Question 83

If we have taken into account all relevant explanatory factors, the residuals from a multiple regression should be random.

(True/False)

Question 84

Referring to Table 14-10 (scenario and EXCEL output given above), to test the significance of the multiple regression model, what is the form of the null hypothesis?

(Multiple Choice)

Question 85

Referring to Table 14-8 (scenario and Excel output given above), the critical value of an F test on the entire regression for a level of significance of 0.01 is _____.

(Short Answer)

Question 86

TABLE 14-12 A weight-loss clinic wants to use regression analysis to build a model for weight-loss of a client (measured in pounds). Two variables thought to affect weight-loss are client's length of time on the weight-loss program and time of session. These variables are described below:

Y = Weight-loss (in pounds)
X₁ = Length of time in weight-loss program (in months)
X₂ = 1 if morning session, 0 if not
X₃ = 1 if afternoon session, 0 if not
(Base level = evening session)

Data for 12 clients on a weight-loss program at the clinic were collected and used to fit the interaction model:

$Y=\beta_{0}+\beta_{1} X_{1}+\beta_{2} X_{2}+\beta_{3} X_{3}+\beta_{4} X_{1} X_{2}+\beta_{5} X_{1} X_{3}+\varepsilon$

Partial output from Microsoft Excel follows:

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.73514 \\ \text{R Square} & 0.540438 \\ \text{Adjusted R Square} & 0.157469 \\ \text{Standard Error} & 12.4147 \\ \text{Observations} & 12 \\ \hline \end{array}$

ANOVA

$F = 5.41118 \quad\text { Significance } F = 0.040201$

$\begin{array}{lrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} \\ \hline \text{Intercept} & 0.089744 & 14.127 & 0.0060 & 0.9951 \\ \text{Length (X1)} & 6.22538 & 2.43473 & 2.54956 & 0.0479 \\ \text{Morn Ses (X2)} & 2.217272 & 22.1416 & 0.100141 & 0.9235 \\ \text{Aft Ses (X3)} & 11.8233 & 3.1545 & 3.558901 & 0.0165 \\ \text{Length*Morn Ses} & 0.77058 & 3.562 & 0.216334 & 0.8359 \\ \text{Length*Aft Ses} & -0.54147 & 3.35988 & -0.161158 & 0.8773 \\ \hline \end{array}$

Referring to Table 14-12, which of the following statements is supported by the analysis shown?

(Multiple Choice)

Question 87

The coefficient of multiple determination is calculated by taking the ratio of the regression sum of squares over the total sum of squares (SSR/SST) and subtracting that value from 1.

(True/False)
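
For reference when judging the statement above, the coefficient of multiple determination can be written either as SSR/SST or, equivalently, as 1 - SSE/SST, because SST = SSR + SSE. The sketch below checks both forms against the ANOVA figures printed for Table 14-8; borrowing that table here is purely illustrative.

```python
# Sums of squares from the Table 14-8 ANOVA (Regression, Residual, Total)
ssr = 4609.83164
sse = 77.11836
sst = 4686.95

r2_from_ssr = ssr / sst      # share of total variation explained by the regression
r2_from_sse = 1 - sse / sst  # one minus the unexplained share

print(round(r2_from_ssr, 3), round(r2_from_sse, 3))
```

Both expressions reproduce the R Square value of 0.984 printed in that table's Regression Statistics.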

Question 88

Referring to Table 14-8 (scenario and Excel output given above), the analyst decided to obtain a 99% confidence interval for β₃. The confidence interval is from _____ to _____.

(Short Answer)
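
A hedged sketch of the interval asked for here, using the standard form b₃ ± t·SE(b₃) with n - k - 1 degrees of freedom. The critical value 2.9467 is the upper 0.005 point of the t distribution with 15 degrees of freedom, taken from a standard t table rather than computed; the variable names are chosen for the example.

```python
# Degrees row of the Table 14-8 coefficient output
b3 = 7.311332
se_b3 = 0.80324187

n, k = 20, 4    # observations and explanatory variables
df = n - k - 1  # residual degrees of freedom

# Assumption: t critical value read from a t table (upper-tail area 0.005, 15 df)
t_crit = 2.9467

margin = t_crit * se_b3
lower, upper = b3 - margin, b3 + margin

print("df:", df)
print("99% CI for beta_3:", round(lower, 4), "to", round(upper, 4))
```

The same recipe with the 90% critical value for 15 degrees of freedom reproduces the Lower 90.0% and Upper 90.0% columns printed for Model 1 in Table 14-6, which is a convenient way to check the method.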

Question 89

Referring to Table 14-7 (scenario and Excel output given above), the department head wants to use a t test to test for the significance of the coefficient of X₁. For a level of significance of 0.05, the critical values of the test are _____.

(Short Answer)

Question 90

TABLE 14-14 An econometrician is interested in evaluating the relation of demand for building materials to mortgage rates in Los Angeles and San Francisco. He believes that the appropriate model is Y = 10 + 5X₁ + 8X₂ where X₁ = mortgage rate in % X₂ = 1 if SF, 0 if LA Y = demand in $100 per capita -Referring to Table 14-14, the fitted model for predicting demand in San Francisco is .

(Multiple Choice)

Question 91

TABLE 14-16 The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state. Following is the multiple regression output with Y = % Passing as the dependent variable, X₁ = % Attendance, X₂ = Salaries and X₃ = Spending:

$\begin{array}{ll} \hline \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.7930 \\ \text{R Square} & 0.6288 \\ \text{Adjusted R Square} & 0.6029 \\ \text{Standard Error} & 10.4570 \\ \text{Observations} & 47 \\ \hline \end{array}$

ANOVA

$\begin{array}{lrrrrr} \hline & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 3 & 7965.08 & 2655.03 & 24.2802 & 2.3853\mathrm{E}-09 \\ \text{Residual} & 43 & 4702.02 & 109.35 & & \\ \text{Total} & 46 & 12667.11 & & & \\ \hline \end{array}$

$\begin{array}{lrrrrrr} \hline & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} & \text{Lower 95\%} & \text{Upper 95\%} \\ \hline \text{Intercept} & -753.4225 & 101.1149 & -7.4511 & 2.88\mathrm{E}-09 & -957.3401 & -549.5050 \\ \text{\% Attend} & 8.5014 & 1.0771 & 7.8929 & 6.73\mathrm{E}-10 & 6.3292 & 10.6735 \\ \text{Salary} & 6.85\mathrm{E}-07 & 0.0006 & 0.0011 & 0.9991 & -0.0013 & 0.0013 \\ \text{Spending} & 0.0060 & 0.0046 & 1.2879 & 0.2047 & -0.0034 & 0.0153 \\ \hline \end{array}$

Referring to Table 14-16, there is sufficient evidence that all of the explanatory variables are related to the percentage of students passing the proficiency test.

(True/False)

Question 92

Referring to Table 14-4 (scenario and Excel output given above), the observed value of the F-statistic is missing from the printout. What are the degrees of freedom for this F-statistic?

(Multiple Choice)

Question 93

Referring to Table 14-17 (scenario and Minitab output given above), what is the p-value of the test statistic when testing whether Teenager makes a significant contribution to the model in the presence of the other independent variables?

(Short Answer)
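The Z and P columns of the Minitab output can be reproduced from the Coef and SE Coef columns. The sketch below assumes the reported test is a two-sided Wald test (Z = Coef / SE Coef, with a standard normal reference distribution); it uses only the standard library, and the variable names are illustrative.

```python
import math

# Sketch: Wald Z statistic and two-sided p-value for the Teenager coefficient
# in Table 14-17. Assumes a standard normal reference distribution.
coef = -0.200   # Coef for Teenager, from the printout
se = 1.061      # SE Coef for Teenager, from the printout

z = coef / se

# Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 2), round(p_value, 3))
```

Both values should land close to the -0.19 and 0.850 printed in the Teenager row, up to rounding of the inputs.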


Question 94

TABLE 14-11 A logistic regression model was estimated in order to predict the probability that a randomly chosen university or college would be a private university using information on average total Scholastic Aptitude Test score (SAT) at the university or college, the room and board expense measured in thousands of dollars (Room/Brd), and whether the TOEFL criterion is at least 550 (Toefl550 = 1 if yes, 0 otherwise). The dependent variable, Y, is school type (Type = 1 if private and 0 otherwise).

Logistic Regression Table $\begin{array}{lrrrrrrr} & & & & & \text{Odds} & 95\% \text{ CI} & \\ \text{Predictor} & \text{Coef} & \text{SE Coef} & \text{Z} & \text{P} & \text{Ratio} & \text{Lower} & \text{Upper} \\ \hline \text{Constant} & -27.118 & 6.696 & -4.05 & 0.000 & & & \\ \text{SAT} & 0.015 & 0.004666 & 3.17 & 0.002 & 1.01 & 1.01 & 1.02 \\ \text{Toefl550} & -0.390 & 0.9538 & -0.41 & 0.682 & 0.68 & 0.10 & 4.39 \\ \text{Room/Brd} & 2.078 & 0.5076 & 4.09 & 0.000 & 7.99 & 2.95 & 21.60 \\ \hline \end{array}$

Log-Likelihood = -21.883
Test that all slopes are zero: G = 62.083, DF = 3, P-Value = 0.000

Goodness-of-Fit Tests $\begin{array}{lrrr} \text{Method} & \text{Chi-Square} & \text{DF} & \text{P} \\ \hline \text{Pearson} & 143.551 & 76 & 0.000 \\ \text{Deviance} & 43.767 & 76 & 0.999 \\ \text{Hosmer-Lemeshow} & 15.731 & 8 & 0.046 \\ \hline \end{array}$

-Referring to Table 14-11, which of the following is the correct interpretation for the Toefl550 slope coefficient?

(Multiple Choice)


Question 95

TABLE 14-8 A financial analyst wanted to examine the relationship between salary (in $1,000) and 4 variables: age (X₁ = Age), experience in the field (X₂ = Exper), number of degrees (X₃ = Degrees), and number of previous jobs in the field (X₄ = Prevjobs). He took a sample of 20 employees and obtained the following Microsoft Excel output:

$\text {SUMMARY OUTPUT}$

$\begin{array}{lr} \text{Regression Statistics} \\ \hline \text{Multiple R} & 0.992 \\ \text{R Square} & 0.984 \\ \text{Adjusted R Square} & 0.979 \\ \text{Standard Error} & 2.26743 \\ \text{Observations} & 20 \\ \hline \end{array}$

ANOVA $\begin{array}{lrrrrr} & \text{df} & \text{SS} & \text{MS} & \text{F} & \text{Significance F} \\ \hline \text{Regression} & 4 & 4609.83164 & 1152.45791 & 224.160 & 0.0001 \\ \text{Residual} & 15 & 77.11836 & 5.14122 & & \\ \text{Total} & 19 & 4686.95000 & & & \\ \hline \end{array}$

$\begin{array}{lrrrr} & \text{Coefficients} & \text{Standard Error} & \text{t Stat} & \text{p-value} \\ \hline \text{Intercept} & -9.611198 & 2.77988638 & -3.457 & 0.0035 \\ \text{Age} & 1.327695 & 0.11491930 & 11.553 & 0.0001 \\ \text{Exper} & -0.106705 & 0.14265559 & -0.748 & 0.4660 \\ \text{Degrees} & 7.311332 & 0.80324187 & 9.102 & 0.0001 \\ \text{Prevjobs} & -0.504168 & 0.44771573 & -1.126 & 0.2778 \\ \hline \end{array}$

-Referring to Table 14-8, the value of the adjusted coefficient of multiple determination, adj r², is _____.

(Short Answer)
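The adjusted r² in Table 14-8 can be recomputed from the ANOVA sums of squares using adj r² = 1 - (1 - r²)(n - 1)/(n - k - 1). The sketch below assumes n = 20 employees and k = 4 predictors, as given in the table; the variable names are illustrative.

```python
# Sketch: r-squared and adjusted r-squared from the Table 14-8 ANOVA output.
# Assumes n = 20 observations and k = 4 independent variables.
n, k = 20, 4

sse = 77.11836     # residual (error) sum of squares, from the printout
sst = 4686.95      # total sum of squares, from the printout

r2 = 1 - sse / sst

# The adjustment penalizes r-squared for the number of predictors in the model.
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 3), round(adj_r2, 3))
```

The two rounded values agree with the R Square and Adjusted R Square entries of the Regression Statistics block.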


Question 96

TABLE 14-17 The marketing manager for a nationally franchised lawn service company would like to study the characteristics that differentiate home owners who do and do not have a lawn service. A random sample of 30 home owners located in a suburban area near a large city was selected; 15 did not have a lawn service (code 0) and 15 had a lawn service (code 1). Additional information available concerning these 30 home owners includes family income (Income, in thousands of dollars), lawn size (Lawn Size, in thousands of square feet), attitude toward outdoor recreational activities (Attitude 0 = unfavorable, 1 = favorable), number of teenagers in the household (Teenager), and age of the head of the household (Age). The Minitab output is given below:

$\begin{array}{lrrrrrrr} & & & & & \text{Odds} & 95\% \text{ CI} & \\ \text{Predictor} & \text{Coef} & \text{SE Coef} & \text{Z} & \text{P} & \text{Ratio} & \text{Lower} & \text{Upper} \\ \hline \text{Constant} & -70.49 & 47.22 & -1.49 & 0.135 & & & \\ \text{Income} & 0.2868 & 0.1523 & 1.88 & 0.060 & 1.33 & 0.99 & 1.80 \\ \text{Lawn Size} & 1.0647 & 0.7472 & 1.42 & 0.154 & 2.90 & 0.67 & 12.54 \\ \text{Attitude} & -12.744 & 9.455 & -1.35 & 0.178 & 0.00 & 0.00 & 326.06 \\ \text{Teenager} & -0.200 & 1.061 & -0.19 & 0.850 & 0.82 & 0.10 & 6.56 \\ \text{Age} & 1.0792 & 0.8783 & 1.23 & 0.219 & 2.94 & 0.53 & 16.45 \\ \hline \end{array}$

Log-Likelihood = -4.890
Test that all slopes are zero: G = 31.808, DF = 5, P-Value = 0.000

Goodness-of-Fit Tests $\begin{array}{lrrr} \text{Method} & \text{Chi-Square} & \text{DF} & \text{P} \\ \hline \text{Pearson} & 9.313 & 24 & 0.997 \\ \text{Deviance} & 9.780 & 24 & 0.995 \\ \text{Hosmer-Lemeshow} & 0.571 & 8 & 1.000 \\ \hline \end{array}$

-Referring to Table 14-17, the null hypothesis that the model is a good-fitting model cannot be rejected when allowing for a 5% probability of making a type I error.

(True/False)
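Several entries of the Table 14-17 printout are tied together and can be cross-checked. The sketch below assumes ungrouped binary responses, so that the deviance statistic equals -2 times the log-likelihood, and that G is the likelihood-ratio statistic against an intercept-only model fitted to the 15/15 split; it then applies the usual "reject if p < alpha" rule to the printed goodness-of-fit p-values. Names such as `gof_p_values` are illustrative.

```python
import math

# Sketch: cross-checking the Table 14-17 fit statistics.
# Assumes ungrouped 0/1 data, so deviance = -2 * log-likelihood.
log_likelihood = -4.890        # from the printout
n, n_ones = 30, 15             # 15 of the 30 home owners had a lawn service

deviance = -2 * log_likelihood

# Log-likelihood of the intercept-only model: every home owner gets p = 15/30.
p_hat = n_ones / n
null_log_likelihood = (n_ones * math.log(p_hat)
                       + (n - n_ones) * math.log(1 - p_hat))

# Likelihood-ratio statistic for "all slopes are zero".
g_stat = 2 * (log_likelihood - null_log_likelihood)

# Decision rule for the goodness-of-fit tests at a 5% level of significance.
alpha = 0.05
gof_p_values = {"Pearson": 0.997, "Deviance": 0.995, "Hosmer-Lemeshow": 1.000}
rejected = {name: p < alpha for name, p in gof_p_values.items()}

print(round(deviance, 3), round(g_stat, 3), rejected)
```

Under these assumptions the recomputed deviance and G come out close to the printed 9.780 and 31.808, and none of the three printed goodness-of-fit p-values falls below 0.05.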


Question 97

TABLE 14-11 A logistic regression model was estimated in order to predict the probability that a randomly chosen university or college would be a private university using information on average total Scholastic Aptitude Test score (SAT) at the university or college, the room and board expense measured in thousands of dollars (Room/Brd), and whether the TOEFL criterion is at least 550 (Toefl550 = 1 if yes, 0 otherwise). The dependent variable, Y, is school type (Type = 1 if private and 0 otherwise). The Minitab output is given below:

Logistic Regression Table $\begin{array}{lrrrrrrr} & & & & & \text{Odds} & 95\% \text{ CI} & \\ \text{Predictor} & \text{Coef} & \text{SE Coef} & \text{Z} & \text{P} & \text{Ratio} & \text{Lower} & \text{Upper} \\ \hline \text{Constant} & -27.118 & 6.696 & -4.05 & 0.000 & & & \\ \text{SAT} & 0.015 & 0.004666 & 3.17 & 0.002 & 1.01 & 1.01 & 1.02 \\ \text{Toefl550} & -0.390 & 0.9538 & -0.41 & 0.682 & 0.68 & 0.10 & 4.39 \\ \text{Room/Brd} & 2.078 & 0.5076 & 4.09 & 0.000 & 7.99 & 2.95 & 21.60 \\ \hline \end{array}$

Log-Likelihood = -21.883
Test that all slopes are zero: G = 62.083, DF = 3, P-Value = 0.000

Goodness-of-Fit Tests $\begin{array}{lrrr} \text{Method} & \text{Chi-Square} & \text{DF} & \text{P} \\ \hline \text{Pearson} & 143.551 & 76 & 0.000 \\ \text{Deviance} & 43.767 & 76 & 0.999 \\ \text{Hosmer-Lemeshow} & 15.731 & 8 & 0.046 \\ \hline \end{array}$

-Referring to Table 14-11, what is the estimated odds ratio for a school with an average SAT score of 1250, a TOEFL criterion that is at least 550, and the room and board expense of 5 thousand dollars?

(Short Answer)
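For a logistic regression, the estimated log odds for a given school is the linear combination of the coefficients and that school's predictor values, and the estimated odds is its exponential. The sketch below reads the question's "estimated odds ratio" as the estimated odds of being private for this school, which is an interpretation rather than something the printout states; the dictionary names are illustrative.

```python
import math

# Sketch: estimated odds and probability for one school in Table 14-11.
# Coefficients are copied from the Minitab printout.
coefs = {"Constant": -27.118, "SAT": 0.015, "Toefl550": -0.390, "Room/Brd": 2.078}

# Predictor values from the question: SAT 1250, TOEFL criterion >= 550,
# room and board of 5 (thousand dollars).
x = {"SAT": 1250, "Toefl550": 1, "Room/Brd": 5}

# ln(odds) = b0 + b1*SAT + b2*Toefl550 + b3*Room/Brd
log_odds = coefs["Constant"] + sum(coefs[name] * value for name, value in x.items())

odds = math.exp(log_odds)           # estimated odds of being private
probability = odds / (1 + odds)     # corresponding estimated probability

print(round(log_odds, 3), round(odds, 3), round(probability, 3))
```

With these inputs the log odds works out to 1.632, giving estimated odds a little above 5 to 1 in favor of the school being private.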


Question 98

A regression had the following results: SST = 102.55, SSE = 82.04. It can be said that 90.0% of the variation in the dependent variable is explained by the independent variables in the regression.

(True/False)
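The share of variation explained can be checked directly from the two sums of squares, using SSR = SST - SSE and r² = SSR/SST = 1 - SSE/SST. A minimal sketch with the figures from the statement (variable names are illustrative):

```python
# Sketch: coefficient of multiple determination from SST and SSE.
sst = 102.55   # total sum of squares, from the statement
sse = 82.04    # error (residual) sum of squares, from the statement

ssr = sst - sse    # regression sum of squares
r2 = ssr / sst     # equivalently 1 - sse / sst

print(round(ssr, 2), round(r2, 3))
```

The result can then be compared with the 90.0% claimed in the statement.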


Question 99

An interaction term in a multiple regression model may be used when the relationship between X₁ and Y changes for differing values of X₂.

(True/False)
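One common way to write such a model, shown here as an illustration with two predictors and a product term (the β and ε symbols are generic notation, not taken from a specific table in this exam), makes the dependence explicit:

```latex
Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{1i} X_{2i} + \varepsilon_i
\qquad\Longrightarrow\qquad
\frac{\partial E(Y)}{\partial X_1} = \beta_1 + \beta_3 X_2
```

In this form the slope of Y with respect to X₁ is β₁ + β₃X₂, so it stays constant across values of X₂ only when β₃ = 0.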


Question 100

The total sum of squares (SST) in a regression model will never exceed the regression sum of squares (SSR).

If we have taken into account all relevant explanatory factors, the residuals from a multiple regression should be random.

The coefficient of multiple determination is calculated by taking the ratio of the regression sum of squares over the total sum of squares (SSR/SST) and subtracting that value from 1.



Exam 14: Introduction to Multiple Regression
