Topic: Statistics
Study Set: Business Statistics Abridged
Exam 19: Multiple Regression
Question 1
Essay
An actuary wanted to develop a model to predict how long individuals will live. After consulting a number of physicians, she collected the age at death (\( y \)), the average number of hours of exercise per week (\( x_1 \)), the cholesterol level (\( x_2 \)), and the number of points by which the individual's blood pressure exceeded the recommended value (\( x_3 \)). A random sample of 40 individuals was selected. The computer output of the multiple regression model is shown below:
THE REGRESSION EQUATION IS \( y = 55.8 + 1.79 x_1 - 0.021 x_2 - 0.016 x_3 \)
\[
\begin{array}{|c|ccc|}
\hline
\text{Predictor} & \text{Coef} & \text{StDev} & \mathrm{T} \\
\hline
\text{Constant} & 55.8 & 11.8 & 4.729 \\
x_1 & 1.79 & 0.44 & 4.068 \\
x_2 & -0.021 & 0.011 & -1.909 \\
x_3 & -0.016 & 0.014 & -1.143 \\
\hline
\end{array}
\]
S = 9.47 R-Sq = 22.5%.
ANALYSIS OF VARIANCE
\[
\begin{array}{|l|cccc|}
\hline
\text{Source of Variation} & \text{df} & \text{SS} & \text{MS} & \text{F} \\
\hline
\text{Regression} & 3 & 936 & 312 & 3.477 \\
\text{Error} & 36 & 3230 & 89.722 & \\
\hline
\text{Total} & 39 & 4166 & & \\
\hline
\end{array}
\]
Is there enough evidence at the 1% significance level to infer that the average number of hours of exercise per week and the age at death are linearly related?
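The columns of the output above are linked by simple arithmetic, which can help when reading the table for this question. The following is a minimal Python sketch that recomputes the T column (including the statistic for \( x_1 \)), the mean squares, F, R-Sq and S from the printed values. The variable names are illustrative and not part of the original output.

```python
import math

# Values copied from the regression output above.
coefs = {"Constant": 55.8, "x1": 1.79, "x2": -0.021, "x3": -0.016}
stdevs = {"Constant": 11.8, "x1": 0.44, "x2": 0.011, "x3": 0.014}
ssr, sse, sst = 936, 3230, 4166
df_reg, df_err = 3, 36

# T column: each coefficient divided by its standard deviation.
t_stats = {name: coefs[name] / stdevs[name] for name in coefs}

msr = ssr / df_reg        # mean square for regression
mse = sse / df_err        # mean square for error
f_stat = msr / mse        # F statistic
r_sq = ssr / sst          # coefficient of determination
s = math.sqrt(mse)        # standard error of estimate

for name, t in t_stats.items():
    print(f"{name}: T = {t:.3f}")
print(f"MSR = {msr:.3f}, MSE = {mse:.3f}, F = {f_stat:.3f}")
print(f"R-Sq = {r_sq:.1%}, S = {s:.2f}")
```

The statistic for \( x_1 \) would then be compared with a two-tailed critical value of the t distribution with n - k - 1 = 36 degrees of freedom at the 1% level.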
Question 2
Essay
Given the following statistics of a multiple regression model, can we conclude at the 5% significance level that \( x_1 \) and y are linearly related? n = 42, k = 6, \( b_1 = -5.30 \), \( s_{b_1} = 1.5 \)
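A question of this kind is usually handled with a two-tailed t-test of \( H_0: \beta_1 = 0 \) using n - k - 1 degrees of freedom. The sketch below shows that calculation in Python. The helper `coefficient_t_test` is a hypothetical name, and the critical value 2.030 is an assumption read from a standard t table for 35 degrees of freedom, not a figure given in the question.

```python
def coefficient_t_test(b, s_b, n, k, t_crit):
    """Two-tailed t-test of H0: beta = 0 for one regression coefficient."""
    t_stat = b / s_b          # test statistic
    df = n - k - 1            # error degrees of freedom
    return t_stat, df, abs(t_stat) > t_crit

# Assumption: 2.030 is the two-tailed 5% critical value for 35 degrees
# of freedom, taken from a standard t table.
t_stat, df, reject = coefficient_t_test(b=-5.30, s_b=1.5, n=42, k=6, t_crit=2.030)
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```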
Question 3
Multiple Choice
A multiple regression analysis that includes 20 data points and 4 independent variables results in total variation in y = SSY = 200 and SSR = 160. The multiple standard error of estimate will be:
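As a rough check on the arithmetic, the standard error of estimate can be computed from SSE = SSY - SSR with n - k - 1 degrees of freedom. This is a short sketch with illustrative variable names:

```python
import math

n, k = 20, 4              # data points, independent variables
ssy, ssr = 200, 160       # total variation and regression sum of squares

sse = ssy - ssr                       # unexplained variation
s_e = math.sqrt(sse / (n - k - 1))    # standard error of estimate
print(f"SSE = {sse}, standard error = {s_e:.3f}")
```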
Question 4
Essay
Consider the following statistics of a multiple regression model: n = 30, k = 4, \( SS_y = 1500 \), SSE = 260.
a. Determine the standard error of estimate.
b. Determine the multiple coefficient of determination.
c. Determine the F-statistic.
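The three quantities asked for follow from the usual formulas: the standard error is \( \sqrt{SSE/(n-k-1)} \), the coefficient of determination is \( 1 - SSE/SS_y \), and F is the ratio of the regression and error mean squares. A minimal sketch, where `regression_summary` is a hypothetical helper name:

```python
import math

def regression_summary(n, k, ss_y, sse):
    """Return (standard error, R-squared, F) from sums of squares."""
    df_err = n - k - 1
    ssr = ss_y - sse                      # explained variation
    s_e = math.sqrt(sse / df_err)         # a. standard error of estimate
    r_sq = 1 - sse / ss_y                 # b. coefficient of determination
    f_stat = (ssr / k) / (sse / df_err)   # c. F = MSR / MSE
    return s_e, r_sq, f_stat

s_e, r_sq, f_stat = regression_summary(n=30, k=4, ss_y=1500, sse=260)
print(f"s = {s_e:.3f}, R^2 = {r_sq:.4f}, F = {f_stat:.2f}")
```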
Question 5
Multiple Choice
For a set of 30 data points, Excel has found the estimated multiple regression equation to be \( \hat{y} = -8.61 + 22x_1 + 7x_2 + 28x_3 \), and has listed the t statistic for testing the significance of each regression coefficient. Using the 5% significance level for testing whether \( \beta_3 = 0 \), the critical region will be that the absolute value of the t statistic for \( \beta_3 \) is greater than or equal to:
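The critical value here depends on the error degrees of freedom, n - k - 1 = 30 - 3 - 1 = 26, and on a two-tailed 5% level. The sketch below approximates that critical value using only the Python standard library: it integrates the Student t density with Simpson's rule and inverts the result by bisection. The function names are illustrative, and a statistics library or a printed t table would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area

def t_critical(alpha_two_sided, df):
    """Two-tailed critical value found by bisection on the CDF."""
    target = 1 - alpha_two_sided / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, k = 30, 3
df = n - k - 1
t_crit = t_critical(0.05, df)
print(f"df = {df}, critical |t| is approximately {t_crit:.3f}")
```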
Question 6
Essay
Test the hypotheses:
\( H_0 \): There is no first-order autocorrelation
\( H_1 \): There is positive first-order autocorrelation,
given that the Durbin-Watson statistic d = 0.686, n = 16, k = 1 and \( \alpha = 0.05 \).
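For a one-tailed test for positive autocorrelation, the usual decision rule compares d with a lower bound \( d_L \) and an upper bound \( d_U \) from a Durbin-Watson table. The sketch below encodes that rule. The function name is hypothetical, and the bounds 1.10 and 1.37 are an assumption: they are the values I read from a standard 5% table for n = 16 and k = 1, so check them against your own table.

```python
def dw_positive_test(d, d_l, d_u):
    """One-tailed Durbin-Watson decision rule for positive autocorrelation."""
    if d < d_l:
        return "reject H0: evidence of positive first-order autocorrelation"
    if d > d_u:
        return "do not reject H0"
    return "inconclusive"

# Assumption: table bounds for n = 16, k = 1, alpha = 0.05.
result = dw_positive_test(d=0.686, d_l=1.10, d_u=1.37)
print(result)
```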
Question 7
Essay
Pop-up coffee vendors were popular in the city of Adelaide in 2013. A vendor is interested in knowing how temperature (in degrees Celsius) and the number of different pastries and biscuits offered to customers affect daily hot coffee sales revenue (in $100s). A random sample of 6 days was taken, and the daily hot coffee sales revenue and the corresponding temperature and number of different pastries and biscuits offered on each day were noted. Describe the following scatterplots.
Scatterplot of Daily hot coffee sales revenue vs Temperature
Scatterplot of Daily hot coffee sales revenue vs Pastries/biscuits
Residual scatterplot of Daily hot coffee sales revenue vs fitted values
Question 8
True/False
In a multiple regression, the coefficient of determination is 0.81. The percentage of the variation in \( y \) that is explained by the regression equation is 81%.
Question 9
Essay
Test the hypotheses:
\( H_0 \): There is no first-order autocorrelation
\( H_1 \): There is first-order autocorrelation,
given that the Durbin-Watson statistic d = 1.89, n = 28, k = 3 and \( \alpha = 0.05 \).
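Because the alternative here is two-sided, the decision rule has rejection regions at both ends of the 0 to 4 range. The following sketch shows one common form of that rule. The function name is hypothetical, and the bounds 1.18 and 1.65 are an assumption read from a standard 5% Durbin-Watson table for n = 28 and k = 3. Some textbooks use the \( \alpha/2 \) table for a two-tailed test, so the bounds may differ slightly from the ones in your table.

```python
def dw_two_sided_test(d, d_l, d_u):
    """Two-tailed Durbin-Watson decision rule for first-order autocorrelation."""
    if d < d_l or d > 4 - d_l:
        return "reject H0: evidence of first-order autocorrelation"
    if d_u < d < 4 - d_u:
        return "do not reject H0"
    return "inconclusive"

# Assumption: table bounds for n = 28, k = 3 (check against your own table).
result = dw_two_sided_test(d=1.89, d_l=1.18, d_u=1.65)
print(result)
```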
Question 10
Essay
A statistician wanted to determine whether the demographic variables of age, education and income influence the number of hours of television watched per week. A random sample of 25 adults was selected to estimate the multiple regression model \( y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon \), where:
\( y \) = number of hours of television watched last week.
\( x_1 \) = age.
\( x_2 \) = number of years of education.
\( x_3 \) = income (in $1000s).
The computer output is shown below.
THE REGRESSION EQUATION IS \( y = 22.3 + 0.41 x_1 - 0.29 x_2 - 0.12 x_3 \)
\[
\begin{array}{|c|ccc|}
\hline
\text{Predictor} & \text{Coef} & \text{StDev} & \mathrm{T} \\
\hline
\text{Constant} & 22.3 & 10.7 & 2.084 \\
x_1 & 0.41 & 0.19 & 2.158 \\
x_2 & -0.29 & 0.13 & -2.231 \\
x_3 & -0.12 & 0.03 & -4.00 \\
\hline
\end{array}
\]
S = 4.51 R-Sq = 34.8%.
ANALYSIS OF VARIANCE
\[
\begin{array}{|l|cccc|}
\hline
\text{Source of Variation} & \text{df} & \text{SS} & \text{MS} & \text{F} \\
\hline
\text{Regression} & 3 & 227 & 75.667 & 3.730 \\
\text{Error} & 21 & 426 & 20.286 & \\
\hline
\text{Total} & 24 & 653 & & \\
\hline
\end{array}
\]
What is the coefficient of determination? What does this statistic tell you?
Question 11
True/False
In multiple regression, the standard error of estimate is defined by \( S_{\varepsilon} = \sqrt{SSE / (n - k)} \), where n is the sample size and k is the number of independent variables.
Question 12
Multiple Choice
Excel and Minitab both provide the p-value for testing each coefficient in the multiple regression model. In the case of \( b_2 \), this represents the probability that:
Question 13
Multiple Choice
A multiple regression analysis that includes 4 independent variables results in a sum of squares for regression of 1200 and a sum of squares for error of 800. The multiple coefficient of determination will be:
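The multiple coefficient of determination is the share of total variation accounted for by the regression, where total variation is SSR + SSE. A brief sketch of the arithmetic, with illustrative variable names:

```python
ssr = 1200   # sum of squares for regression
sse = 800    # sum of squares for error

sst = ssr + sse       # total variation in y
r_sq = ssr / sst      # multiple coefficient of determination
print(f"SST = {sst}, R^2 = {r_sq:.2f}")
```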
Question 14
True/False
Given the multiple linear regression equation \( \hat{y} = b_0 + b_1 x_1 + b_2 x_2 \), the value of \( b_2 \) is the estimated average increase in y for a one unit increase in \( x_2 \), whilst holding \( x_1 \) constant.
Question 15
True/False
In a multiple regression, a large value of the test statistic F indicates that most of the variation in y is explained by the regression equation, and that the model is useful; while a small value of F indicates that most of the variation in y is unexplained by the regression equation, and that the model is useless.
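The link between F and the explained share of variation can be made concrete with the identity \( F = \dfrac{R^2 / k}{(1 - R^2)/(n - k - 1)} \). The sketch below checks that identity against the analysis of variance figures from Question 10 (n = 25, k = 3, SSR = 227, SSE = 426). The variable names are illustrative.

```python
n, k = 25, 3
ssr, sse = 227, 426          # sums of squares from Question 10
sst = ssr + sse

r_sq = ssr / sst                                  # share of variation explained
f_from_r2 = (r_sq / k) / ((1 - r_sq) / (n - k - 1))
f_from_anova = (ssr / k) / (sse / (n - k - 1))    # MSR / MSE

print(f"R^2 = {r_sq:.4f}")
print(f"F from R^2 = {f_from_r2:.3f}, F from ANOVA = {f_from_anova:.3f}")
```

Note that the F value of 3.730 in that output goes with an R-Sq of only 34.8%, so how large F needs to be is judged against the F distribution with k and n - k - 1 degrees of freedom rather than against R-Sq alone.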
Question 16
True/False
In multiple regression, the problem of multicollinearity affects the t-tests of the individual coefficients as well as the F-test in the analysis of variance for regression, since the F-test combines these t-tests into a single test.