Deck 18: Multiple Linear Regression

1
You are given the following data, where X1 (Pretest score) and X2 (Hours spent in the program) are used to predict Y (Posttest score):

\begin{array}{ccc}
\hline
Y & X_1 & X_2 \\
\hline
65 & 60 & 7.5 \\
82 & 62 & 9.0 \\
94 & 75 & 8.5 \\
80 & 78 & 7.0 \\
87 & 65 & 10.0 \\
66 & 60 & 8.0 \\
\hline
\end{array}
Determine the following values: intercept, b1, b2, SSres, SSreg, F, s²res, s(b1), s(b2), t1, t2.
Intercept = -70.662, b1 = 1.235, b2 = 8.077.

SSreg = 619.504, SSres = 44.496, F(2,3) = 20.884 (p = .017, reject at .05), s²res = 14.832.

s(b1) = .227, s(b2) = 1.660, t1 = 5.437 (p = .012, reject at .05), t2 = 4.866 (p = .017, reject at .05).

Procedure:
Create a data set with three variables: Posttest (Y), Pretest (X1), and Hour (X2). The data set should have six cases.

1) Go to Analyze → Regression → Linear.

2) Move Posttest to the Dependent list. Move Pretest and Hour to the Independent(s) list.

3) Click OK.
Selected SPSS Output:

Model Summary
\begin{array}{ccccc}
\hline
\text{Model} & R & R \text{ Square} & \text{Adjusted } R \text{ Square} & \text{Std. Error of the Estimate} \\
\hline
1 & .966^{a} & .933 & .888 & 3.851 \\
\hline
\end{array}
a. Predictors: (Constant), Hour, Pretest

ANOVA^{b}
\begin{array}{clccccc}
\hline
\text{Model} & & \text{Sum of Squares} & df & \text{Mean Square} & F & \text{Sig.} \\
\hline
1 & \text{Regression} & 619.504 & 2 & 309.752 & 20.884 & .017^{a} \\
 & \text{Residual} & 44.496 & 3 & 14.832 & & \\
 & \text{Total} & 664.000 & 5 & & & \\
\hline
\end{array}
a. Predictors: (Constant), Hour, Pretest
b. Dependent Variable: Posttest

Results:
The results of the multiple linear regression suggest that a significant proportion of the total variation in posttest scores was predicted by pretest scores and hours spent in the program, F(2,3) = 20.884, p = .017.

For Pretest, the unstandardized partial slope (1.235) and standardized partial slope (.846) are statistically significantly different from 0 (t = 5.437, df = 3, p = .012); with every one-point increase in pretest score, posttest score is expected to increase by 1.235 points when controlling for Hour. For Hour, the unstandardized partial slope (8.077) and standardized partial slope (.757) are statistically significantly different from 0 (t = 4.866, df = 3, p = .017); with every additional hour spent in the program, posttest score is expected to increase by 8.077 points when controlling for Pretest. Thus, Pretest and Hour were shown to be statistically significant predictors of Posttest, both individually and collectively. Multiple R² indicates that 93.3% of the variation in Posttest was predicted by Pretest and Hour. This suggests a large effect size.

The intercept was -70.662, which is not statistically significantly different from 0 at the .05 level (t = -3.042, df = 3, p = .056).
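
The same quantities can be checked without SPSS. The following is a minimal NumPy sketch, not the deck's own procedure: it fits the model by ordinary least squares on the six cases from the card and derives the sums of squares, F, residual variance, standard errors, and t ratios from the usual formulas. The variable names are illustrative choices.

```python
import numpy as np

# Data from the card: Posttest (Y), Pretest (X1), Hour (X2)
y = np.array([65, 82, 94, 80, 87, 66], dtype=float)
x1 = np.array([60, 62, 75, 78, 65, 60], dtype=float)
x2 = np.array([7.5, 9.0, 8.5, 7.0, 10.0, 8.0])

n, m = len(y), 2                            # cases, predictors
X = np.column_stack([np.ones(n), x1, x2])   # design matrix with intercept column

# Least-squares estimates, ordered as [intercept, b1, b2]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ coef
ss_res = float(np.sum((y - fitted) ** 2))
ss_total = float(np.sum((y - y.mean()) ** 2))
ss_reg = ss_total - ss_res

ms_res = ss_res / (n - m - 1)               # residual variance, s^2_res
f_stat = (ss_reg / m) / ms_res

# Standard errors: square roots of the diagonal of s^2_res * (X'X)^-1
se = np.sqrt(np.diag(ms_res * np.linalg.inv(X.T @ X)))
t = coef / se

print("intercept, b1, b2:", np.round(coef, 3))
print("SSreg, SSres:", round(ss_reg, 3), round(ss_res, 3))
print("F, s2res:", round(f_stat, 3), round(ms_res, 3))
print("s(b1), s(b2):", np.round(se[1:], 3))
print("t1, t2:", np.round(t[1:], 3))
```

The printed values should agree with the SPSS output above to three decimals. The p values are not computed here because that would require a t or F distribution function from an additional library.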
2
Complete the missing information for this regression model (df = 25).
t1 = b1/s(b1) = 16/4 = 4; t2 = b2/s(b2) = .4/.05 = 8; t3 = b3/s(b3) = 70/10 = 7.
The critical t value is ±t(α/2; df) = ±t(.025; 25) = ±2.06.
|t1|, |t2|, |t3| > critical t, so X1, X2, and X3 are all significant predictors of Y.
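
The comparison above is simple arithmetic, and a short Python sketch makes each step explicit. The slopes and standard errors are the ones used in the worked answer. The critical value 2.06 is taken from the card rather than computed, and the dictionary layout is only an illustrative choice.

```python
# (slope, standard error) pairs as used in the worked answer
slopes = {"X1": (16.0, 4.0), "X2": (0.4, 0.05), "X3": (70.0, 10.0)}

# Two-tailed critical t for alpha = .05 and df = 25, as stated on the card
critical_t = 2.06

# t ratio for each predictor: slope divided by its standard error
t_ratios = {name: b / se for name, (b, se) in slopes.items()}

# A predictor is significant when |t| exceeds the critical value
significant = {name: abs(t) > critical_t for name, t in t_ratios.items()}

for name in slopes:
    print(f"{name}: t = {t_ratios[name]:.2f}, significant = {significant[name]}")
```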


3
A researcher would like to predict GPA from a set of three predictor variables for a sample of 34 college students. Multiple linear regression analysis was utilized. Complete the following summary table (α = .05) for the test of significance of the overall regression model:
There are three independent variables, so m = 3. There are 34 students, so n = 34.

dfreg = m = 3, dfres = n - m - 1 = 34 - 3 - 1 = 30, dftotal = n - 1 = 34 - 1 = 33.

SSreg = MSreg × dfreg = 6.5 × 3 = 19.5, SSres = SStotal - SSreg = 66 - 19.5 = 46.5.

MSres = SSres/dfres = 46.5/30 = 1.55.

F = MSreg/MSres = 6.5/1.55 = 4.19. The critical value of F with 3 and 30 degrees of freedom at α = .05 is 2.92; because 4.19 > 2.92, reject H0.
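
The table can be filled in mechanically from the two known entries. The sketch below follows the worked answer step by step. It assumes, as the answer does, that MSreg = 6.5 and SStotal = 66 are the values given in the card's table, and it takes the critical F of 2.92 from the answer rather than computing it.

```python
n, m = 34, 3                    # students, predictors
ms_reg, ss_total = 6.5, 66.0    # table entries used in the worked answer

# Degrees of freedom
df_reg = m
df_res = n - m - 1
df_total = n - 1

# Sums of squares and mean squares
ss_reg = ms_reg * df_reg
ss_res = ss_total - ss_reg
ms_res = ss_res / df_res

# Overall F test against the critical value stated in the answer
f_stat = ms_reg / ms_res
critical_f = 2.92               # F(3, 30) at alpha = .05, from the card
reject_h0 = f_stat > critical_f

print(f"df: reg={df_reg}, res={df_res}, total={df_total}")
print(f"SS: reg={ss_reg}, res={ss_res}, total={ss_total}")
print(f"MSres={ms_res:.2f}, F={f_stat:.2f}, reject H0: {reject_h0}")
```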

4
You are given the following data, where X1 (attendance rate) and X2 (average SAT score) are to be used to predict Y (average score in graduation test). Each case represents one school.

\begin{array}{ccc}
\hline
Y & X_1 & X_2 \\
\hline
78.4 & 93.4 & 1010 \\
81.3 & 94.6 & 1020 \\
81.3 & 95.4 & 1024 \\
82.5 & 91.1 & 1136 \\
77.8 & 91.6 & 952 \\
84.5 & 94.2 & 1042 \\
88.2 & 94.5 & 1106 \\
88.7 & 93.4 & 1004 \\
72.5 & 92.1 & 880 \\
85.4 & 94.9 & 1124 \\
82.9 & 94.3 & 1124 \\
81.4 & 94.7 & 996 \\
\hline
\end{array}
Determine the following values: intercept, b1, b2, SSres, SSreg, F, s²res, s(b1), s(b2), t1, t2.
5
For the regression model Yi = b1X1i + b2X2i + a + ei, consider the following two situations:
Situation 1: rY1 = -0.5, rY2 = 0.8, r12 = 0.1
Situation 2: rY1 = 0.2, rY2 = 0.8, r12 = 0.1
In which of the two situations will R² be larger?

A) Situation 1.
B) Situation 2.
C) R² will be the same in both situations.
D) Uncertain.
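
With two predictors, R² can be written directly in terms of the three pairwise correlations. The worked equation below applies this standard identity to the two situations. It is not taken from the deck's answer key, and it assumes the garbled value of rY1 in Situation 1 is -0.5.

```latex
R^2_{Y \cdot 12} = \frac{r_{Y1}^2 + r_{Y2}^2 - 2\, r_{Y1}\, r_{Y2}\, r_{12}}{1 - r_{12}^2}

% Situation 1: r_{Y1} = -0.5 (assumed sign), r_{Y2} = 0.8, r_{12} = 0.1
R^2 = \frac{0.25 + 0.64 - 2(-0.5)(0.8)(0.1)}{1 - 0.01} = \frac{0.97}{0.99} \approx 0.980

% Situation 2: r_{Y1} = 0.2, r_{Y2} = 0.8, r_{12} = 0.1
R^2 = \frac{0.04 + 0.64 - 2(0.2)(0.8)(0.1)}{1 - 0.01} = \frac{0.648}{0.99} \approx 0.655
```

If rY1 in Situation 1 were +0.5 instead, the same formula would give 0.81/0.99 ≈ 0.818, which is still larger than the Situation 2 value, so the comparison does not depend on the assumed sign.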
6
The scatterplot of X and Y is shown as follows.
[Figure: scatterplot of Y against X; image not recovered]
Based on the plot, which model is the most appropriate to use?

A) Yi = b1Xi + a + ei.
B) Yi = b1Xi + b2Xi² + a + ei.
C) Yi = b1Xi² + a + ei.
D) Yi = b1Xi + b2Xi + b3Xi³ + a + ei.
7
Which of the following situations will result in the best prediction in multiple regression analysis?

A) rY1 = 0.1 rY2 = 0.4 r12 = 0.1
B) rY1 = 0.1 rY2 = 0.4 r12 = 0.8
C) rY1 = 0.6 rY2 = 0.4 r12 = 0.1
D) rY1 = 0.6 rY2 = 0.4 r12 = 0.8
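
One way to compare the four options is to compute R² for each from its three correlations. The sketch below uses the standard two-predictor identity R² = (rY1² + rY2² - 2·rY1·rY2·r12) / (1 - r12²). The function name is an illustrative choice, and "best prediction" is read here as "largest R²", which is an interpretation rather than something the card states.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R^2 of a two-predictor regression from the three pairwise correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

# (rY1, rY2, r12) for each answer option on the card
options = {
    "A": (0.1, 0.4, 0.1),
    "B": (0.1, 0.4, 0.8),
    "C": (0.6, 0.4, 0.1),
    "D": (0.6, 0.4, 0.8),
}

r2 = {label: r_squared_two_predictors(*rs) for label, rs in options.items()}
best = max(r2, key=r2.get)  # option with the largest R^2

for label, value in r2.items():
    print(f"{label}: R^2 = {value:.3f}")
print("largest R^2:", best)
```

The pattern in the output reflects the general principle: prediction improves when predictors correlate strongly with Y and weakly with each other.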
8
Which one of the following reflects variables appropriate for a multiple linear regression model?

A) One categorical dependent variable and one continuous independent variable
B) One continuous dependent variable and one continuous or categorical independent variable
C) One continuous dependent variable and two or more continuous independent variables
D) Two or more continuous dependent variables and one continuous or categorical independent variable
9
In a multiple linear regression with three independent variables, X1, X2, and X3, which one of the following reflects an example of a semipartial correlation?

A) The correlation between X1 and X2 and X3 where both X2 and X3 are removed from X1 and X2
B) The correlation between X1 and X2 where X3 is held constant
C) The correlation between X2 and X3 where X1 is partialed out
D) The correlation between X1 and X2 where X3 is removed from X2 only
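
The difference between removing X3 from one variable and removing it from both can be made concrete by residualizing. The sketch below uses simulated, purely hypothetical data (the coefficients and seed are arbitrary). It computes a semipartial correlation (X3 removed from X2 only) and a partial correlation (X3 removed from both X1 and X2), then checks each against its closed-form expression in terms of r12, r13, and r23.

```python
import numpy as np

# Hypothetical data: three correlated variables, generated only for illustration
rng = np.random.default_rng(0)
n = 200
x3 = rng.normal(size=n)
x2 = 0.6 * x3 + rng.normal(size=n)
x1 = 0.5 * x2 + 0.3 * x3 + rng.normal(size=n)

def residualize(target, control):
    """Residuals of `target` after a simple regression on `control`."""
    slope, intercept = np.polyfit(control, target, 1)
    return target - (slope * control + intercept)

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

x1_res = residualize(x1, x3)
x2_res = residualize(x2, x3)

semipartial = corr(x1, x2_res)   # X3 removed from X2 only
partial = corr(x1_res, x2_res)   # X3 removed from both X1 and X2

# Closed-form versions built from the pairwise correlations
r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
semipartial_formula = (r12 - r13 * r23) / np.sqrt(1 - r23**2)
partial_formula = (r12 - r13 * r23) / np.sqrt((1 - r13**2) * (1 - r23**2))

print(f"semipartial: {semipartial:.4f} (formula {semipartial_formula:.4f})")
print(f"partial:     {partial:.4f} (formula {partial_formula:.4f})")
```

Because the partial correlation divides by an extra factor that is at most 1, its absolute value is never smaller than that of the corresponding semipartial correlation.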
10
Partial correlations allow for which one of the following in multiple linear regression?

A) Design control
B) Experiential control
C) Experimental control
D) Statistical control