Exam 14: Introduction to Multiple Regression


TABLE 14-6
One of the most common questions of prospective house buyers pertains to the average cost of heating in dollars (Y). To provide its customers with information on that matter, a large real estate firm used the following 4 variables to predict heating costs: the daily minimum outside temperature in degrees Fahrenheit (X1), the amount of insulation in inches (X2), the number of windows in the house (X3), and the age of the furnace in years (X4). Given below are the Excel outputs of two regression models.

Model 1

Regression Statistics
R Square            0.8080
Adjusted R Square   0.7568
Observations        20

ANOVA
             df   SS            MS         F         Significance F
Regression    4   169503.4241   42375.86   15.7874   2.96869E-05
Residual     15   40262.3259    2684.155
Total        19   209765.75

                   Coefficients   Standard Error   t Stat    p-value    Lower 90.0%   Upper 90.0%
Intercept          421.4277       77.8614           5.4125   7.2E-05    284.9327      557.9227
X1 (Temperature)   -4.5098        0.8129           -5.5476   5.58E-05   -5.9349       -3.0847
X2 (Insulation)    -14.9029       5.0508           -2.9505   0.0099     -23.7573      -6.0485
X3 (Windows)       0.2151         4.8675            0.0442   0.9653     -8.3181       8.7484
X4 (Furnace Age)   6.3780         4.1026            1.5546   0.1408     -0.8140       13.5702

Model 2

Regression Statistics
R Square            0.7768
Adjusted R Square   0.7506
Observations        20

ANOVA
             df   SS            MS         F         Significance F
Regression    2   162958.2277   81479.11   29.5923   2.9036E-06
Residual     17   46807.5222    2753.384
Total        19   209765.75

                   Coefficients   Standard Error   t Stat    p-value    Lower 95%   Upper 95%
Intercept          489.3227       43.9826          11.1253   3.17E-09   396.5273    582.1180
X1 (Temperature)   -5.1103        0.6951           -7.3515   1.13E-06   -6.5769     -3.6437
X2 (Insulation)    -14.7195       4.8864           -3.0123   0.0078     -25.0290    -4.4099

Referring to Table 14-6, what are the degrees of freedom of the partial F test for H0: β3 = β4 = 0 versus H1: At least one βj ≠ 0, j = 3, 4?

(Multiple Choice)
4.8/5
(35)
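
As a rough illustration of how a partial F test is set up from two nested ANOVA tables, the sketch below plugs in the Model 1 (full) and Model 2 (reduced) sums of squares from Table 14-6. The function name `partial_f_test` and its parameters are hypothetical; the formula is the standard nested-model F statistic, not something stated on this page.

```python
def partial_f_test(ssr_full, ssr_reduced, sse_full, n, k_full, k_reduced):
    """Return (numerator df, denominator df, F) for a partial F test.

    Hypothetical helper: assumes the reduced model is nested in the full one.
    """
    df_num = k_full - k_reduced      # number of coefficients set to zero under H0
    df_den = n - k_full - 1          # residual df of the full model
    mse_full = sse_full / df_den
    f_stat = ((ssr_full - ssr_reduced) / df_num) / mse_full
    return df_num, df_den, f_stat


# Values read from Table 14-6 (Model 1 = full, Model 2 = reduced).
df_num, df_den, f_stat = partial_f_test(
    ssr_full=169503.4241, ssr_reduced=162958.2277, sse_full=40262.3259,
    n=20, k_full=4, k_reduced=2,
)
print(df_num, df_den, round(f_stat, 4))
```

The numerator degrees of freedom count the coefficients dropped under H0, and the denominator degrees of freedom are the residual degrees of freedom of the full model.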

TABLE 14-12
A weight-loss clinic wants to use regression analysis to build a model for weight loss of a client (measured in pounds). Two variables thought to affect weight loss are the client's length of time on the weight-loss program and time of session. These variables are described below:

Y  = Weight loss (in pounds)
X1 = Length of time in weight-loss program (in months)
X2 = 1 if morning session, 0 if not
X3 = 1 if afternoon session, 0 if not
(Base level = evening session)

Data for 12 clients on a weight-loss program at the clinic were collected and used to fit the interaction model:

Y = β0 + β1X1 + β2X2 + β3X3 + β4X1X2 + β5X1X3 + ε

Partial output from Microsoft Excel follows:

Regression Statistics
Multiple R          0.73514
R Square            0.540438
Adjusted R Square   0.157469
Standard Error      12.4147
Observations        12

ANOVA
F = 5.41118   Significance F = 0.040201

                    Coefficients   Standard Error   t Stat      p-value
Intercept           0.089744       14.127            0.0060     0.9951
Length (X1)         6.22538        2.43473           2.54956    0.0479
Morn Ses (X2)       2.217272       22.1416           0.100141   0.9235
Aft Ses (X3)        11.8233        3.1545            3.558901   0.0165
Length * Morn Ses   0.77058        3.562             0.216334   0.8359
Length * Aft Ses    -0.54147       3.35988          -0.161158   0.8773

Referring to Table 14-12, what null hypothesis would you test to determine whether the slope of the linear relationship between weight loss (Y) and time in the program (X1) varies according to time of session?

(Multiple Choice)
4.8/5
(36)
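
One way to see which coefficients govern the slope on X1 is to write the Table 14-12 interaction model out separately for each session, substituting the dummy values. The expansion below is a sketch that follows directly from the model as stated in the table; E(Y) denotes the mean response and is notation added here.

```latex
\begin{align*}
\text{Evening } (X_2 = 0,\ X_3 = 0):   &\quad E(Y) = \beta_0 + \beta_1 X_1 \\
\text{Morning } (X_2 = 1,\ X_3 = 0):   &\quad E(Y) = (\beta_0 + \beta_2) + (\beta_1 + \beta_4) X_1 \\
\text{Afternoon } (X_2 = 0,\ X_3 = 1): &\quad E(Y) = (\beta_0 + \beta_3) + (\beta_1 + \beta_5) X_1
\end{align*}
```

The three slopes are β1, β1 + β4, and β1 + β5, so they coincide exactly when β4 = β5 = 0.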

TABLE 14-17
The marketing manager for a nationally franchised lawn service company would like to study the characteristics that differentiate home owners who do and do not have a lawn service. A random sample of 30 home owners located in a suburban area near a large city was selected; 15 did not have a lawn service (code 0) and 15 had a lawn service (code 1). Additional information available concerning these 30 home owners includes family income (Income, in thousands of dollars), lawn size (Lawn Size, in thousands of square feet), attitude toward outdoor recreational activities (Attitude 0 = unfavorable, 1 = favorable), number of teenagers in the household (Teenager), and age of the head of the household (Age). The Minitab output is given below:

                                                    Odds     95% CI
Predictor   Coef      SE Coef   Z       P       Ratio    Lower   Upper
Constant    -70.49    47.22     -1.49   0.135
Income      0.2868    0.1523     1.88   0.060   1.33     0.99    1.80
Lawn Size   1.0647    0.7472     1.42   0.154   2.90     0.67    12.54
Attitude    -12.744   9.455     -1.35   0.178   0.00     0.00    326.06
Teenager    -0.200    1.061     -0.19   0.850   0.82     0.10    6.56
Age         1.0792    0.8783     1.23   0.219   2.94     0.53    16.45

Log-Likelihood = -4.890
Test that all slopes are zero: G = 31.808, DF = 5, P-Value = 0.000

Goodness-of-Fit Tests
Method            Chi-Square   DF   P
Pearson           9.313        24   0.997
Deviance          9.780        24   0.995
Hosmer-Lemeshow   0.571         8   1.000

Referring to Table 14-17, what is the estimated probability that a 48-year-old home owner with a family income of $100,000, a lawn size of 5,000 square feet, a negative attitude toward outdoor recreation, and one teenager in the household will purchase a lawn service?

(Short Answer)
4.9/5
(34)
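
A minimal sketch of how an estimated probability is obtained from logistic regression output: compute the linear predictor (the log-odds) from the Table 14-17 coefficients, then apply the logistic function. The dictionary names are hypothetical, and the unit conversions (income 100 for $100,000, lawn size 5 for 5,000 square feet, attitude 0 for unfavorable) are read from the variable definitions in the table.

```python
import math

# Coefficients from the Minitab output in Table 14-17.
coef = {
    "Constant": -70.49,
    "Income": 0.2868,      # thousands of dollars
    "Lawn Size": 1.0647,   # thousands of square feet
    "Attitude": -12.744,   # 0 = unfavorable, 1 = favorable
    "Teenager": -0.200,
    "Age": 1.0792,
}

# Home owner described in the question, expressed in the table's units.
x = {"Income": 100, "Lawn Size": 5, "Attitude": 0, "Teenager": 1, "Age": 48}

# Linear predictor (estimated log-odds), then the logistic transform.
logit = coef["Constant"] + sum(coef[name] * value for name, value in x.items())
prob = 1.0 / (1.0 + math.exp(-logit))
print(round(logit, 4), prob)
```

Because the log-odds are large and positive here, the estimated probability is very close to 1.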

TABLE 14-14
An econometrician is interested in evaluating the relation of demand for building materials to mortgage rates in Los Angeles and San Francisco. He believes that the appropriate model is

Y = 10 + 5X1 + 8X2

where
X1 = mortgage rate in %
X2 = 1 if SF, 0 if LA
Y  = demand in $100 per capita

Referring to Table 14-14, the predicted demand in San Francisco when the mortgage rate is 10% is _____.

(Short Answer)
4.9/5
(38)
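
The model in Table 14-14 can be evaluated directly once the city is coded as the dummy X2. The short sketch below does that; `predicted_demand` is a hypothetical helper name, and the result is in the table's units of $100 per capita.

```python
def predicted_demand(mortgage_rate_pct, is_sf):
    """Evaluate Y = 10 + 5*X1 + 8*X2 from Table 14-14 (units: $100 per capita)."""
    x2 = 1 if is_sf else 0   # dummy variable: 1 for San Francisco, 0 for Los Angeles
    return 10 + 5 * mortgage_rate_pct + 8 * x2


sf_demand = predicted_demand(10, is_sf=True)
la_demand = predicted_demand(10, is_sf=False)
print(sf_demand, la_demand)
```

The difference between the two cities at any given mortgage rate equals the dummy coefficient, 8.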

If a categorical independent variable contains 4 categories, then _____ dummy variable(s) will be needed to uniquely represent these categories.

(Multiple Choice)
4.8/5
(28)

TABLE 14-4
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. Microsoft Excel output is provided below:

Regression Statistics
Multiple R          0.865
R Square            0.748
Adjusted R Square   0.726
Standard Error      5.195
Observations        50

ANOVA
             df   SS          MS          F   Significance F
Regression        3605.7736   1201.9245       0.0000
Residual          1214.2264   26.3962
Total        49   4820.0000

            Coefficients   Standard Error   t Stat   p-value
Intercept   -1.6335        5.8078           -0.281   0.7798
Income      0.4485         0.1137           3.9545   0.0003
Size        4.2615         0.8062           5.286    0.0001
School      -0.6517        0.4319           -1.509   0.1383

Referring to Table 14-4, one individual in the sample had an annual income of $10,000, a family size of 1, and an education of 8 years. This individual owned a home with an area of 1,000 square feet (House = 10.00). What is the residual (in hundreds of square feet) for this data point?

(Multiple Choice)
4.7/5
(36)
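
A residual is the observed value minus the fitted value. The sketch below applies that to the Table 14-4 coefficients for the individual described in the question; the variable names are hypothetical, and income is entered as 10 because the table measures it in thousands of dollars.

```python
# Coefficients from Table 14-4.
b0, b_income, b_size, b_school = -1.6335, 0.4485, 4.2615, -0.6517

# Data point from the question, in the table's units.
income, size, school = 10, 1, 8     # $10,000 income, family of 1, 8 years of education
observed_house = 10.00              # 1,000 square feet = 10 hundreds of square feet

predicted_house = b0 + b_income * income + b_size * size + b_school * school
residual = observed_house - predicted_house
print(round(predicted_house, 4), round(residual, 4))
```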

TABLE 14-10
You worked as an intern at We Always Win Car Insurance Company last summer. You notice that individual car insurance premiums depend very much on the age of the individual, the number of traffic tickets received by the individual, and the population density of the city in which the individual lives. You performed a regression analysis in Excel and obtained the following information:

Regression Analysis

Regression Statistics
Multiple R          0.63
R Square            0.40
Adjusted R Square   0.23
Standard Error      50.00
Observations        15.00

ANOVA
             df   SS         MS        F      Significance F
Regression    3              5994.24   2.40   0.12
Residual     11   27496.82
Total             45479.54

            Coefficients   Standard Error   t Stat   p-value   Lower 99.0%   Upper 99.0%
Intercept   123.80         48.71             2.54    0.03      -27.47        275.07
AGE         -0.82          0.87             -0.95    0.36      -3.51         1.87
TICKETS     21.25          10.66             1.99    0.07      -11.86        54.37
DENSITY     -3.14          6.46             -0.49    0.64      -23.19        16.91

Referring to Table 14-10, the regression sum of squares that is missing in the ANOVA table should be ______.

(Short Answer)
4.7/5
(36)
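
A missing ANOVA entry can usually be recovered from the identities SS = MS × df and SST = SSR + SSE. The sketch below computes the regression sum of squares both ways from the Table 14-10 values; it assumes the figures 5994.24, 27496.82, and 45479.54 sit in the regression MS, residual SS, and total SS cells, which is how the flattened output appears to read.

```python
# Entries read from the ANOVA table in Table 14-10 (cell placement is an assumption).
df_regression = 3
ms_regression = 5994.24
ss_residual = 27496.82
ss_total = 45479.54

# Two routes to the regression sum of squares.
ssr_from_ms = ms_regression * df_regression   # SS = MS * df
ssr_from_totals = ss_total - ss_residual      # SST = SSR + SSE
print(round(ssr_from_ms, 2), round(ssr_from_totals, 2))
```

That both routes agree is a useful check that the cells were read in the right columns.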

TABLE 14-1
A manager of a product sales group believes the number of sales made by an employee (Y) depends on how many years that employee has been with the company (X1) and how he/she scored on a business aptitude test (X2). A random sample of 8 employees provides the following:

Employee   Y     X1   X2
1          100   10   7
2          90    3    10
3          80    8    9
4          70    5    4
5          60    5    8
6          50    7    5
7          40    1    4
8          30    1    1

Referring to Table 14-1, for these data, what is the estimated coefficient for the variable representing years an employee has been with the company, b1?

(Multiple Choice)
4.9/5
(34)
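
For a model with two predictors, the least-squares coefficients can be computed by hand from centered sums of squares and cross-products. The sketch below does this for the Table 14-1 data in plain Python. The function name `fit_two_predictors` is hypothetical, and it assumes the two unlabeled data columns are X1 (years) and X2 (aptitude score) in that order.

```python
def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via centered normal equations."""
    n = len(y)
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - x1_bar) ** 2 for a in x1)
    s22 = sum((b - x2_bar) ** 2 for b in x2)
    s12 = sum((a - x1_bar) * (b - x2_bar) for a, b in zip(x1, x2))
    s1y = sum((a - x1_bar) * (c - y_bar) for a, c in zip(x1, y))
    s2y = sum((b - x2_bar) * (c - y_bar) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


# Data from Table 14-1 (column order Y, X1, X2 is an assumption).
y = [100, 90, 80, 70, 60, 50, 40, 30]
x1 = [10, 3, 8, 5, 5, 7, 1, 1]
x2 = [7, 10, 9, 4, 8, 5, 4, 1]

b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```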

TABLE 14-10 (the scenario and Excel output are the same as in the first Table 14-10 question above)

Referring to Table 14-10, the proportion of the total variability in insurance premiums that can be explained by AGE, TICKETS, and DENSITY is ____.

(Short Answer)
4.8/5
(41)
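
The proportion of variability explained is the coefficient of multiple determination, R² = SSR / SST = 1 − SSE / SST. As a sketch, the snippet below recomputes it, together with the adjusted value, from the Table 14-10 sums of squares; the variable names are hypothetical and the cell placement of the two sums of squares is assumed from the flattened output.

```python
# From Table 14-10: 15 observations, 3 predictors.
n, k = 15, 3
ss_residual = 27496.82   # SSE (assumed to be the residual SS cell)
ss_total = 45479.54      # SST

r_squared = 1 - ss_residual / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
print(round(r_squared, 4), round(adj_r_squared, 4))
```

Both values round to the R Square (0.40) and Adjusted R Square (0.23) figures printed in the table.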

The coefficient of multiple determination $r^2_{Y.12}$ measures the proportion of variation in Y that is explained by X1 and X2.

(True/False)
4.9/5
(38)

TABLE 14-16
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state. Following is the multiple regression output with Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending:

Regression Statistics
Multiple R          0.7930
R Square            0.6288
Adjusted R Square   0.6029
Standard Error      10.4570
Observations        47

ANOVA
             df   SS         MS        F         Significance F
Regression    3   7965.08    2655.03   24.2802   2.3853E-09
Residual     43   4702.02    109.35
Total        46   12667.11

            Coefficients   Standard Error   t Stat    p-value    Lower 95%   Upper 95%
Intercept   -753.4225      101.1149         -7.4511   2.88E-09   -957.3401   -549.5050
% Attend    8.5014         1.0771            7.8929   6.73E-10   6.3292      10.6735
Salary      6.85E-07       0.0006            0.0011   0.9991     -0.0013     0.0013
Spending    0.0060         0.0046            1.2879   0.2047     -0.0034     0.0153

Referring to Table 14-16, we can conclude that average teacher salary has no impact on average percentage of students passing the proficiency test at a 10% level of significance based solely on the 95% confidence interval estimate for β2.

(True/False)
4.7/5
(37)

TABLE 14-11
A logistic regression model was estimated in order to predict the probability that a randomly chosen university or college would be a private university using information on average total Scholastic Aptitude Test score (SAT) at the university or college, the room and board expense measured in thousands of dollars (Room/Brd), and whether the TOEFL criterion is at least 550 (Toefl550 = 1 if yes, 0 otherwise). The dependent variable, Y, is school type (Type = 1 if private and 0 otherwise). The Minitab output is given below:

Logistic Regression Table
                                                   Odds     95% CI
Predictor   Coef      SE Coef    Z       P       Ratio    Lower   Upper
Constant    -27.118   6.696      -4.05   0.000
SAT         0.015     0.004666    3.17   0.002   1.01     1.01    1.02
Toefl550    -0.390    0.9538     -0.41   0.682   0.68     0.10    4.39
Room/Brd    2.078     0.5076      4.09   0.000   7.99     2.95    21.60

Log-Likelihood = -21.883
Test that all slopes are zero: G = 62.083, DF = 3, P-Value = 0.000

Goodness-of-Fit Tests
Method            Chi-Square   DF   P
Pearson           143.551      76   0.000
Deviance          43.767       76   0.999
Hosmer-Lemeshow   15.731        8   0.046

Referring to Table 14-11, what are the degrees of freedom for the chi-square distribution when testing whether the model is a good-fitting model?

(Short Answer)
4.9/5
(36)

TABLE 14-8
A financial analyst wanted to examine the relationship between salary (in $1,000) and 4 variables: age (X1 = Age), experience in the field (X2 = Exper), number of degrees (X3 = Degrees), and number of previous jobs in the field (X4 = Prevjobs). He took a sample of 20 employees and obtained the following Microsoft Excel output:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.992
R Square            0.984
Adjusted R Square   0.979
Standard Error      2.26743
Observations        20

ANOVA
             df   SS           MS           F         Significance F
Regression    4   4609.83164   1152.45791   224.160   0.0001
Residual     15   77.11836     5.14122
Total        19   4686.95000

            Coefficients   Standard Error   t Stat   p-value
Intercept   -9.611198      2.77988638       -3.457   0.0035
Age         1.327695       0.11491930       11.553   0.0001
Exper       -0.106705      0.14265559       -0.748   0.4660
Degrees     7.311332       0.80324187        9.102   0.0001
Prevjobs    -0.504168      0.44771573       -1.126   0.2778

Referring to Table 14-8, the value of the F-statistic for testing the significance of the entire regression is ______.

(Short Answer)
4.8/5
(34)
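
The overall F statistic in an ANOVA table is the ratio of the regression mean square to the residual mean square, each being a sum of squares divided by its degrees of freedom. The sketch below rebuilds it from the Table 14-8 sums of squares as a cross-check on the printed value; the variable names are hypothetical.

```python
# Sums of squares and degrees of freedom from the ANOVA table in Table 14-8.
ss_regression, df_regression = 4609.83164, 4
ss_residual, df_residual = 77.11836, 15

ms_regression = ss_regression / df_regression   # MSR
ms_residual = ss_residual / df_residual         # MSE
f_statistic = ms_regression / ms_residual
print(round(ms_regression, 5), round(ms_residual, 5), round(f_statistic, 3))
```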

TABLE 14-16 (the scenario and regression output are the same as in the first Table 14-16 question above)

Referring to Table 14-16, predict the percentage of students passing the proficiency test for a school which has a daily average of 95% of students attending class, an average teacher salary of 40,000 dollars, and an instructional spending per pupil of 2,000 dollars.

(Short Answer)
5.0/5
(32)
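
A point prediction from a fitted multiple regression is the intercept plus each coefficient times its predictor value. The sketch below evaluates that for the school in the question using the Table 14-16 coefficients; the names are hypothetical, and the Salary coefficient is read as 6.85E-07, which is how the garbled "6.85-07" in the output appears to be intended.

```python
# Coefficients from Table 14-16 (Salary coefficient read as 6.85E-07).
coefficients = {
    "Intercept": -753.4225,
    "% Attend": 8.5014,
    "Salary": 6.85e-07,
    "Spending": 0.0060,
}

# School described in the question.
school = {"% Attend": 95, "Salary": 40000, "Spending": 2000}

predicted_pct_passing = coefficients["Intercept"] + sum(
    coefficients[name] * value for name, value in school.items()
)
print(round(predicted_pct_passing, 4))
```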

TABLE 14-3
An economist is interested to see how consumption for an economy (in $ billions) is influenced by gross domestic product ($ billions) and aggregate price (consumer price index). The Microsoft Excel output of this regression is partially reproduced below.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.991
R Square            0.982
Adjusted R Square   0.976
Standard Error      0.299
Observations        10

ANOVA
             df   SS        MS        F         Significance F
Regression    2   33.4163   16.7082   186.325   0.0001
Residual      7   0.6277    0.0897
Total         9   34.0440

            Coefficients   Standard Error   t Stat   p-value
Intercept   -0.0861        0.5674           -0.152   0.8837
GDP         0.7654         0.0574           13.340   0.0001
Price       -0.0006        0.0028           -0.219   0.8330

Referring to Table 14-3, what is the estimated average consumption level for an economy with GDP equal to $4 billion and an aggregate price index of 150?

(Multiple Choice)
4.8/5
(39)

TABLE 14-17 (the scenario and Minitab output are the same as in the first Table 14-17 question above)

Referring to Table 14-17, what is the p-value of the test statistic when testing whether Age makes a significant contribution to the model in the presence of the other independent variables?

(Short Answer)
4.8/5
(36)

TABLE 14-5
A microeconomist wants to determine how corporate sales are influenced by capital and wage spending by companies. She proceeds to randomly select 26 large corporations and record information in millions of dollars. The Microsoft Excel output below shows results of this multiple regression.

Regression Statistics
Multiple R          0.830
R Square            0.689
Adjusted R Square   0.662
Standard Error      17501.643
Observations        26

ANOVA
             df   SS            MS           F        Significance F
Regression    2   15579777040   7789888520   25.432   0.0001
Residual     23   7045072780    306307512
Total        25   22624849820

            Coefficients   Standard Error   t Stat   p-value
Intercept   15800.0000     6038.2999        2.617    0.0154
Capital     0.1245         0.2045           0.609    0.5485
Wages       7.0762         1.4729           4.804    0.0001

Referring to Table 14-5, which of the following values for α is the smallest for which the regression model as a whole is significant?

(Multiple Choice)
4.8/5
(34)

TABLE 14-8 (the scenario and Excel output are the same as in the first Table 14-8 question above)

Referring to Table 14-8, the p-value of the F test for the significance of the entire regression is ______.

(Short Answer)
4.8/5
(38)

TABLE 14-7
The department head of the accounting department wanted to see if she could predict the GPA of students using the number of course units (credits) and total SAT scores of each. She takes a sample of students and generates the following Microsoft Excel output:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.916
R Square            0.839
Adjusted R Square   0.732
Standard Error      0.24685
Observations        6

ANOVA
             df   SS        MS        F       Significance F
Regression    2   0.95219   0.47610   7.813   0.0646
Residual      3   0.18281   0.06094
Total         5   1.13500

            Coefficients   Standard Error   t Stat   p-value
Intercept   4.593897       1.13374542        4.052   0.0271
Units       -0.247270      0.06268485       -3.945   0.0290
SAT Total   0.001443       0.00101241        1.425   0.2494

Referring to Table 14-7, the department head wants to test H0: β1 = β2 = 0. The value of the F-test statistic is ______.

(Short Answer)
4.7/5
(33)

TABLE 14-16 (the scenario and regression output are the same as in the first Table 14-16 question above)

Referring to Table 14-16, what are the lower and upper limits of the 95% confidence interval estimate for the effect of a one dollar increase in average teacher salary on average percentage of students passing the proficiency test?

(Short Answer)
4.9/5
(39)