Deck 15: Multiple Regression Model Building
1
Which of the following is not used to find a "best" model?
A) Mallows' Cp
B) odds ratio
C) adjusted r²
D) all of the above
Answer: B) odds ratio
2
TABLE 15-3
A certain type of rare gem serves as a status symbol for many of its owners. In theory, demand is higher at low prices and decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, demand increases with price because of the status owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
where Y = demand (in thousands) and X = retail price per carat.
This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:
SUMMARY OUTPUT
-Referring to Table 15-3, what is the p-value associated with the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)?
A) 0.3647
B) 0.0006
C) 0.0001
D) none of the above
Answer: A) 0.3647
3
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained data on the percentage of students passing the proficiency test (% Passing), the daily average of the percentage of students attending class (% Attendance), the average teacher salary in dollars (Salaries), and the instructional spending per pupil in dollars (Spending) for 47 schools in the state.
Let Y = % Passing be the dependent variable, X1 = % Attendance, X2 = Salaries, and X3 = Spending.
The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, the "best" model using a 5% level of significance among those chosen by the Cp statistic is
A) X1, X2, X3.
B) X1, X3.
C) either of the above
D) none of the above
Answer: B) X1, X3.
4
Using the hat matrix elements $h_i$ to determine influential points in a multiple regression model with k independent variables and n observations, $X_i$ is an influential point if
A) $h_i < n(k+1)/2$.
B) $h_i > n(k+1)/2$.
C) $h_i < 2(k+1)/n$.
D) $h_i > 2(k+1)/n$.
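The usual textbook rule of thumb flags an observation as high-leverage when its hat-matrix element exceeds 2(k+1)/n. The following is a minimal sketch of that check for the special case of a single predictor (k = 1), where the hat-matrix diagonal has the closed form 1/n + (x_i - mean)²/Sxx. The data values and the function names `leverages` and `high_leverage` are made up for illustration.

```python
def leverages(x):
    """Hat-matrix diagonal h_i for simple linear regression (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def high_leverage(x, k=1):
    """Indices whose leverage exceeds the 2(k+1)/n rule of thumb."""
    n = len(x)
    cutoff = 2 * (k + 1) / n
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]


# Hypothetical predictor values; the last one sits far from the rest.
x = [1, 2, 3, 4, 5, 20]
h = leverages(x)
flagged = high_leverage(x)
print(flagged)
```

The leverages always sum to k + 1 (here 2), which is why the cutoff is stated as twice the average leverage (k+1)/n.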
5
TABLE 15-3
A certain type of rare gem serves as a status symbol for many of its owners. In theory, demand is higher at low prices and decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, demand increases with price because of the status owners believe they gain by obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
where Y = demand (in thousands) and X = retail price per carat.
This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:
SUMMARY OUTPUT
-Referring to Table 15-3, what is the value of the test statistic for testing whether there is an upward curvature in the response curve relating the demand (Y) and the price (X)?
A) 0.95
B) 373
C) -5.14
D) none of the above
6
TABLE 15-4
In Hawaii, condemnation proceedings are under way to enable private citizens to own the property that their homes are built on. Until recently, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following model was fit to data collected for n = 20 properties, 10 of which are located near a cove.
Model 1:
where Y = Sale price of property in thousands of dollars
X1 = Size of property in thousands of square feet
X2 = 1 if property located near cove, 0 if not
Using the data collected for the 20 properties, the following partial output obtained from Microsoft Excel is shown:
SUMMARY OUTPUT
-Referring to Table 15-4, given a quadratic relationship between sale price (Y) and property size (X1), what null hypothesis would you test to determine whether the curves differ between cove and non-cove properties?
A)
B)
C)
D)
7
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination ($R_j^2$) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what is the correct interpretation for the estimated coefficient for PROMOTION?
A) The estimated mean paid attendance on promotion day will be 6927.88 higher than when there is no promotion taking into consideration all the other independent variables included in the model.
B) The paid attendance on promotion day will be 6927.88 higher than when there is no promotion taking into consideration all the other independent variables included in the model.
C) The estimated mean paid attendance on promotion day will be 6927.88 higher than when there is no promotion.
D) The paid attendance on promotion day will be 6927.88 higher than when there is no promotion.
8
A regression diagnostic tool used to study the possible effects of collinearity is
A) the slope.
B) the Y-intercept.
C) the standard error of the estimate.
D) the VIF.
9
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination ($R_j^2$) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, which of the following assumptions is most likely violated based on the normal probability plot?
A) equal variance
B) normality of errors
C) linearity
D) none of the above
10
The logarithm transformation can be used
A) to overcome violations of the autocorrelation assumption.
B) to test for possible violations of the autocorrelation assumption.
C) to change a linear independent variable into a nonlinear independent variable.
D) to change a nonlinear model into a linear model.
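As an illustration of how a logarithm transformation can turn a nonlinear model into a linear one, consider a multiplicative model y = a·x^b. Taking logs of both sides gives ln y = ln a + b·ln x, which is linear in ln x and can be fit by ordinary least squares. The sketch below uses made-up, noise-free data generated with a = 3 and b = 2; the helper `fit_line` is a hypothetical name, not part of any library.

```python
import math


def fit_line(u, v):
    """Ordinary least squares for v = intercept + slope * u (one predictor)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sxx = sum((a - mean_u) ** 2 for a in u)
    slope = sxy / sxx
    return mean_v - slope * mean_u, slope


# Hypothetical data that follow y = 3 * x**2 exactly (nonlinear in x).
x = [1.0, 2.0, 4.0, 8.0]
y = [3.0 * xi ** 2 for xi in x]

# After taking logs of both variables the relationship is a straight line.
intercept, b = fit_line([math.log(xi) for xi in x], [math.log(yi) for yi in y])
a = math.exp(intercept)  # back-transform the intercept to recover a
print(round(a, 6), round(b, 6))
```

With real, noisy data the recovered values would only approximate the underlying parameters, and the error term is assumed to be multiplicative on the original scale.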
11
If a group of independent variables are not significant individually but are significant as a group at a specified level of significance, this is most likely due to
A) the absence of dummy variables.
B) autocorrelation.
C) the presence of dummy variables.
D) collinearity.
12
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained data on the percentage of students passing the proficiency test (% Passing), the daily average of the percentage of students attending class (% Attendance), the average teacher salary in dollars (Salaries), and the instructional spending per pupil in dollars (Spending) for 47 schools in the state.
Let Y = % Passing be the dependent variable, X1 = % Attendance, X2 = Salaries, and X3 = Spending.
The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, the better model using a 5% level of significance derived from the "best" model above is
A) X1, X2, X3.
B) X1, X3.
C) X1.
D) X3.
13
Using the Studentized residuals $t_i$ to determine influential points in a multiple regression model with k independent variables and n observations, and letting $t_{n-k-2}$ denote the upper critical value of a two-tail t test with a 0.10 level of significance, $X_i$ is an influential point if
A) .
B) .
C) .
D) .
14
An independent variable Xj is considered highly correlated with the other independent variables if
A) $VIF_j > VIF_i$ for $i \neq j$.
B) $VIF_j > 5$.
C) $VIF_j < VIF_i$ for $i \neq j$.
D) $VIF_j < 5$.
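The variance inflationary factor for predictor j is computed from the coefficient of multiple determination of that predictor with all the other predictors: VIF_j = 1 / (1 - R_j²). As a small worked sketch, the code below applies this formula to the R_j² values quoted in this deck for Tables 15-8 and 15-9. The cutoff of 5 is the one that appears among the options above and is treated here as an assumed rule of thumb; the variable names are illustrative only.

```python
def vif(r2_j):
    """Variance inflationary factor from the R-squared of X_j on the other X's."""
    return 1.0 / (1.0 - r2_j)


# R_j^2 values as quoted in the deck for the two tables.
r2_table_15_8 = [0.0338, 0.4669, 0.4743]
r2_table_15_9 = [0.2675, 0.3101, 0.1038, 0.7325, 0.7308]

vifs_15_8 = [vif(r) for r in r2_table_15_8]
vifs_15_9 = [vif(r) for r in r2_table_15_9]

# Assumed rule of thumb: a VIF above 5 signals a highly correlated predictor.
exceeds = [v > 5 for v in vifs_15_8 + vifs_15_9]

for label, values in (("15-8", vifs_15_8), ("15-9", vifs_15_9)):
    print(label, [round(v, 4) for v in values])
```

Under that assumed cutoff, none of the eight predictors is flagged: the largest VIF comes from R_j² = 0.7325 and is about 3.74.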
15
TABLE 15-1
To explain personal consumption (CONS) measured in dollars, data is collected for
A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and SEX as the independent variables. The estimated model was
ŷ = 2.28 - 0.29 ln(CRDTLIM) + 5.77 ln(APR) + 2.35 ln(ADVT) + 0.39 SEX
-Referring to Table 15-1, what is the correct interpretation for the estimated coefficient for ADVT?
A) A $1 increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of $2.35 on personal consumption holding other variables constant.
B) A 1% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of 2.35% on personal consumption holding other variables constant.
C) A 100% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of 2.35% on personal consumption holding other variables constant.
D) A 100% increase in per person advertising expenditure by the manufacturer will result in an estimated average increase of $2.35 on personal consumption holding other variables constant.
16
TABLE 15-1
To explain personal consumption (CONS) measured in dollars, data is collected for
A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and SEX as the independent variables. The estimated model was
ŷ = 2.28 - 0.29 ln(CRDTLIM) + 5.77 ln(APR) + 2.35 ln(ADVT) + 0.39 SEX
-Referring to Table 15-1, what is the correct interpretation for the estimated coefficient for APR?
A) A 100% increase in average annualized percentage interest rate will result in an estimated average increase of $5.77 on personal consumption holding other variables constant.
B) A one percentage point increase in average annualized percentage interest rate will result in an estimated average increase of $5.77 on personal consumption holding other variables constant.
C) A 100% increase in average annualized percentage interest rate will result in an estimated average increase of 5.77% on personal consumption holding other variables constant.
D) A 1% increase in average annualized percentage interest rate will result in an estimated average increase of 5.77% on personal consumption holding other variables constant.
17
The Cp statistic is used
A) if the variances of the error terms are all the same in a regression model.
B) to determine if there is a problem of collinearity.
C) to determine if there is an irregular component in a time series.
D) to choose the best model.
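For reference, one common form of the Cp statistic for a candidate model with k predictors is Cp = (1 - R_k²)(n - T) / (1 - R_T²) - (n - 2(k + 1)), where T is the number of parameters (including the intercept) in the full model and R_T² is the full model's coefficient of determination. Candidate models with Cp at or below k + 1 are usually kept for further study. The sketch below assumes that formula; n = 47 and three predictors mirror the sizes in Table 15-8, but the R² values are hypothetical because the table output is not reproduced here.

```python
def mallows_cp(r2_k, k, r2_full, n, t):
    """Cp for a model with k predictors, given the full model with t parameters."""
    return (1 - r2_k) * (n - t) / (1 - r2_full) - (n - 2 * (k + 1))


n, t = 47, 4          # 47 observations; 3 predictors plus the intercept
r2_full = 0.60        # hypothetical R-squared of the full model

# The full model always has Cp equal to its own parameter count.
cp_full = mallows_cp(r2_full, 3, r2_full, n, t)

# Hypothetical two-predictor model that loses very little R-squared.
cp_sub = mallows_cp(0.595, 2, r2_full, n, t)
keep_sub = cp_sub <= 2 + 1

print(round(cp_full, 4), round(cp_sub, 4), keep_sub)
```

Because the full model always returns Cp = T, the statistic is informative only for comparing the smaller candidate models.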
18
TABLE 15-4
In Hawaii, condemnation proceedings are under way to enable private citizens to own the property that their homes are built on. Until recently, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following model was fit to data collected for n = 20 properties, 10 of which are located near a cove.
where Y = Sale price of property in thousands of dollars
X1 = Size of property in thousands of square feet
X2 = 1 if property located near cove, 0 if not
Using the data collected for the 20 properties, the following partial output obtained from Microsoft Excel is shown:
SUMMARY OUTPUT
-Referring to Table 15-4, given a quadratic relationship between sale price (Y) and property size (X1), what test should be used to test whether the curves differ between cove and non-cove properties?
A) t test on each of the subsets of the appropriate coefficients
B) F test for the entire regression model
C) partial F test on the subset of the appropriate coefficients
D) t test on each of the coefficients in the entire regression model
19
TABLE 15-4
In Hawaii, condemnation proceedings are under way to enable private citizens to own the property that their homes are built on. Until recently, only estates were permitted to own land, and homeowners leased the land from the estate. In order to comply with the new law, a large Hawaiian estate wants to use regression analysis to estimate the fair market value of the land. The following model was fit to data collected for n = 20 properties, 10 of which are located near a cove.
Model 1:
where Y = Sale price of property in thousands of dollars
X1 = Size of property in thousands of square feet
X2 = 1 if property located near cove, 0 if not
Using the data collected for the 20 properties, the following partial output obtained from Microsoft Excel is shown:
SUMMARY OUTPUT
Regression Statistics
-Referring to Table 15-4, is the overall model statistically adequate at a 0.05 level of significance for predicting sale price (Y)?
A) Yes, since the p-value for the test is smaller than 0.05.
B) No, since some of the t tests for the individual variables are not significant.
C) No, since the standard deviation of the model is fairly large.
D) Yes, since none of the β-estimates are equal to 0.
20
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
-Referring to Table 15-9, what is the correct interpretation for the estimated coefficient for TEMP?
A) As the high temperature increases by one degree, the paid attendance will increase by 51.70.
B) As the high temperature increases by one degree, the paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
C) As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70.
D) As the high temperature increases by one degree, the estimated mean paid attendance will increase by 51.70 taking into consideration all the other independent variables included in the model.
21
Using the Cook's distance statistic Di to determine influential points in a multiple regression model with k independent variables and n observations, and letting Fv1,v2 denote the critical value of an F distribution with v1 and v2 degrees of freedom at a 0.50 level of significance, Xi is an influential point if
A) Di > Fk+1,n-k-1
B) Di < Fn-k-1,k+1
C) Di < Fk+1,n-k-1
D) Di > Fn-k-1,k+1
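As an illustration of how Cook's distance is typically used, here is a minimal sketch for the simplest case of one predictor (k = 1). It assumes the commonly taught screening rule that flags a point when Di exceeds the 0.50 critical value of the F distribution with k + 1 and n - k - 1 degrees of freedom. The data and function names are hypothetical, and the closed-form F median used here holds only when the numerator degrees of freedom equal 2.

```python
def cooks_distances(x, y):
    """Cook's D_i for a simple linear regression (k = 1 predictor)."""
    n, k = len(x), 1
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - k - 1)
    # Hat matrix diagonal elements h_i for a single predictor.
    leverages = [1 / n + (xi - x_bar) ** 2 / ssx for xi in x]
    return [
        (e ** 2 / ((k + 1) * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverages)
    ]


def f_median_two_numerator_df(df2):
    """Median (0.50 critical value) of F with 2 and df2 degrees of freedom.

    Closed form valid only for numerator df = 2, i.e. k = 1.
    """
    return (df2 / 2) * (2 ** (2 / df2) - 1)


# Hypothetical data: nine points near a line plus one far-out, off-trend point.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 25.0]

distances = cooks_distances(x, y)
f_crit = f_median_two_numerator_df(len(x) - 1 - 1)
influential = [i for i, d in enumerate(distances) if d > f_crit]
print(influential)
```

For models with more than one predictor, the leverages come from the full hat matrix and the critical value from an F table or a statistics library.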
22
The logarithm transformation can be used
A) to overcome violations to the homoscedasticity assumption.
B) to test for possible violations to the homoscedasticity assumption.
C) to overcome violations to the autocorrelation assumption.
D) to test for possible violations to the autocorrelation assumption.
23
In multiple regression, the ____ procedure permits variables to enter and leave the model at different stages of its development.
A) stepwise regression
B) residual analysis
C) backward elimination
D) forward selection
24
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R²j) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, which of the following assumptions is most likely violated based on the residual plot for OPWIN%?
A) equal variance
B) linearity
C) normality of errors
D) none of the above
25
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R²j) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, which of the following models should be taken into consideration using the Mallows' Cp statistic?
A) X1, X2, X3
B) X1, X3
C) both of the above
D) none of the above
26
TABLE 15-3
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
Y = β0 + β1X + β2X² + ε
where Y = demand (in thousands) and X = retail price per carat.
This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:
SUMMARY OUTPUT
-Referring to Table 15-3, what is the correct interpretation of the coefficient of multiple determination?
A) 98.8% of the total variation in demand can be explained by the addition of the square term in price.
B) 98.8% of the total variation in demand can be explained by just the square term in price.
C) 98.8% of the total variation in demand can be explained by the quadratic relationship between demand and price.
D) 98.8% of the total variation in demand can be explained by the linear relationship between demand and price.
27
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R²j) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, which of the following predictors should first be dropped to remove collinearity?
A) X1
B) X3
C) X2
D) none of the above
28
A microeconomist wants to determine how corporate sales are influenced by capital and wage spending by companies. She proceeds to randomly select 26 large corporations and record information in millions of dollars. A statistical analyst discovers that capital spending by corporations has a significant inverse relationship with wage spending. What should the microeconomist who developed this multiple regression model be particularly concerned with?
A) collinearity
B) randomness of error terms
C) normality of residuals
D) missing observations
29
As a project for his business statistics class, a student examined the factors that determined parking meter rates throughout the campus area. Data were collected for the price per hour of parking, blocks to the quadrangle, and one of the three jurisdictions: on campus, in downtown and off campus, or outside of downtown and off campus. The population regression model hypothesized is
Y = β0 + β1X1 + β2X2 + β3X3 + ε
where
Y is the meter price
X1 is the number of blocks to the quad
X2 is a dummy variable that takes the value 1 if the meter is located in downtown and off campus and the value 0 otherwise
X3 is a dummy variable that takes the value 1 if the meter is located outside of downtown and off campus, and the value 0 otherwise
Suppose that whether the meter is located on campus is an important explanatory factor. Why should the variable that depicts this attribute not be included in the model?
A) Its inclusion will introduce autocorrelation.
B) Its inclusion will inflate the standard errors of the estimated coefficients.
C) Its inclusion will introduce collinearity.
D) both B and C
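To see why a third jurisdiction dummy is redundant when the model already has an intercept, the following minimal sketch codes a few hypothetical meters and checks that the on-campus indicator is always equal to 1 - X2 - X3. The meter list and all names are illustrative assumptions, not data from the question.

```python
JURISDICTIONS = (
    "on_campus",
    "downtown_off_campus",
    "outside_downtown_off_campus",
)


def dummies(jurisdiction):
    """Return (X_campus, X2, X3): one 0/1 indicator per jurisdiction."""
    return tuple(1 if jurisdiction == j else 0 for j in JURISDICTIONS)


# Hypothetical sample of meters, one jurisdiction label each.
meters = [
    "on_campus",
    "downtown_off_campus",
    "outside_downtown_off_campus",
    "on_campus",
    "downtown_off_campus",
]

rows = [dummies(m) for m in meters]

# With an intercept in the model, the on-campus dummy carries no new
# information: it is an exact linear function of X2 and X3 in every row.
redundant = all(x_campus == 1 - x2 - x3 for x_campus, x2, x3 in rows)
print(redundant)
```

Because the three indicators always sum to 1, which is the intercept column, including all three makes the predictors perfectly collinear.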
30
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R²j) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, the "best" model chosen using the adjusted R-square statistic is
A) X1, X2, X3.
B) X1, X3.
C) either of the above
D) none of the above
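For reference, the adjusted R-square criterion penalizes extra predictors through the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below applies it to two candidate models. Only n = 47 comes from the table description; the R² values and model labels are hypothetical, because the best-subsets output is not reproduced here.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fit to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 47  # number of schools stated in the table description

# Hypothetical (R^2, number of predictors) pairs for two candidate models.
candidates = {
    "two predictors": (0.595, 2),
    "three predictors": (0.600, 3),
}

scores = {
    label: adjusted_r2(r2, n, k) for label, (r2, k) in candidates.items()
}
best = max(scores, key=scores.get)
print(best)
```

With these made-up numbers the smaller model wins, since the small gain in R² from the third predictor does not offset the lost degree of freedom.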
31
A real estate builder wishes to determine how house size (House) is influenced by family income (Income), family size (Size), and education of the head of household (School). House size is measured in hundreds of square feet, income is measured in thousands of dollars, and education is in years. The builder randomly selected 50 families and ran the multiple regression. The business literature involving human capital shows that education influences an individual's annual income. Combined, these may influence family size. With this in mind, what should the real estate builder be particularly concerned with when analyzing the multiple regression model?
A) missing observations
B) normality of residuals
C) collinearity
D) randomness of error terms
32
Which of the following is used to determine observations that have an influential effect on the fitted model?
A) Cook's distance statistic
B) the Cp statistic
C) variance inflationary factor
D) Durbin Watson statistic
33
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R²j) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, which of the following assumptions is most likely violated based on the residual plot for TEMP?
A) normality of errors
B) equal variance
C) linearity
D) none of the above
34
The Variance Inflationary Factor (VIF) measures the
A) correlation of the X variables with each other.
B) contribution of each X variable with the Y variable after all other X variables are included in the model.
C) standard deviation of the slope.
D) correlation of the X variables with the Y variable.
35
Which of the following is not used to determine observations that have an influential effect on the fitted model?
A) Cook's distance statistic
B) the studentized deleted residuals ti
C) the hat matrix elements hi
D) the Cp statistic
36
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R²j) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, what are, respectively, the values of the variance inflationary factor of the 3 predictors?
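The variance inflationary factor follows directly from each predictor's coefficient of multiple determination through VIFj = 1 / (1 - R²j). A minimal sketch using the three R²j values stated in the table is shown below. The cutoff of 5 is an assumed rule of thumb, not something given in the question.

```python
def vif(r2_j):
    """Variance inflationary factor from R^2 of X_j on the other predictors."""
    return 1.0 / (1.0 - r2_j)


# R^2_j values as stated in Table 15-8.
r2_by_predictor = {
    "X1 (% Attendance)": 0.0338,
    "X2 (Salaries)": 0.4669,
    "X3 (Spending)": 0.4743,
}

vifs = {name: vif(r2) for name, r2 in r2_by_predictor.items()}

THRESHOLD = 5.0  # assumed rule-of-thumb cutoff for troublesome collinearity
flagged = [name for name, value in vifs.items() if value > THRESHOLD]

for name, value in vifs.items():
    print(f"{name}: VIF = {value:.4f}")
print("flagged:", flagged)
```

The three values come out at roughly 1.03, 1.88, and 1.90, all well below the assumed cutoff.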
37
Using the best-subsets approach to model building, models are being considered when their
A) Cp ≤ (k + 1).
B) Cp > (k + 1).
C) Cp ≤ k.
D) Cp > k.
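For context, Mallows' Cp for a candidate model with k predictors is commonly computed as Cp = (1 - R²k)(n - T) / (1 - R²T) - (n - 2(k + 1)), where T is the number of parameters (including the intercept) in the full model. The sketch below evaluates it and applies a "Cp at most k + 1" screen. All R² values are hypothetical and the function names are illustrative.

```python
def mallows_cp(r2_k, r2_full, n, k, t):
    """Cp = (1 - R2_k)(n - T) / (1 - R2_T) - (n - 2(k + 1)).

    k: predictors in the candidate model
    t: parameters in the full model, intercept included
    """
    return (1 - r2_k) * (n - t) / (1 - r2_full) - (n - 2 * (k + 1))


def worth_considering(cp, k):
    """Screening rule: keep models whose Cp is at most k + 1."""
    return cp <= k + 1


n, t = 47, 4       # 47 observations; full model = 3 predictors + intercept
r2_full = 0.600    # hypothetical R^2 of the full model
r2_two = 0.595     # hypothetical R^2 of a two-predictor model
r2_one = 0.400     # hypothetical R^2 of a one-predictor model

cp_full = mallows_cp(r2_full, r2_full, n, 3, t)  # equals k + 1 by construction
cp_two = mallows_cp(r2_two, r2_full, n, 2, t)
cp_one = mallows_cp(r2_one, r2_full, n, 1, t)

print(cp_two, worth_considering(cp_two, 2))
print(cp_one, worth_considering(cp_one, 1))
```

The full model always yields Cp = k + 1, so the statistic is mainly useful for comparing smaller subsets against it.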
38
TABLE 15-3
A certain type of rare gem serves as a status symbol for many of its owners. In theory, for low prices, the demand increases and it decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
Y = β0 + β1X + β2X² + ε
where Y = demand (in thousands) and X = retail price per carat.
This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:
SUMMARY OUTPUT
-Referring to Table 15-3, does there appear to be significant upward curvature in the response curve relating the demand (Y) and the price (X) at the 10% level of significance?
A) No, since the p-value for the test is greater than 0.10.
B) Yes, since the value of β2 is positive.
C) Yes, since the p-value for the test is less than 0.10.
D) No, since the value of β2 is near 0.
39
Which of the following will not change a nonlinear model into a linear model?
A) logarithmic transformation
B) square-root transformation
C) variance inflationary factor
D) quadratic regression model
40
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R²j) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, which of the following assumptions is most likely violated based on the residual plot for WIN%?
A) normality of errors
B) linearity
C) equal variance
D) none of the above
41
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, the prediction of time to relief for a person receiving a dose of the drug 10 units above the average dose is ____.
42
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R²j) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what is the p-value of the test statistic to determine whether TEMP makes a significant contribution to the regression model in the presence of the other independent variables at a 5% level of significance?
43
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, _____ of the variation in ATTENDANCE can be explained by the five independent variables after taking into consideration the number of independent variables and the number of observations.
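The quantity described here is the adjusted r2. A minimal sketch of the standard formula follows, assuming n = 80 games and k = 5 predictors as stated in the table description. The function name and the r2 value of 0.5 are hypothetical placeholders, since the regression output itself is not reproduced in this extract.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r2 for a model with k predictors fit to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    # Penalize r2 by the ratio of total to residual degrees of freedom.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Hypothetical r2; substitute the R Square value from the regression output.
example = adjusted_r2(r2=0.5, n=80, k=5)
print(round(example, 4))
```

The adjustment never raises the value above the unadjusted r2, and the penalty grows as k approaches n.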
44
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what is the value of the test statistic to determine whether PROMOTION makes a significant contribution to the regression model in the presence of the other independent variables at a 5% level of significance?
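For reference, a sketch of the usual t statistic for the contribution of a single predictor, assuming b_j denotes the estimated PROMOTION coefficient and S_{b_j} its standard error (both would be read from the regression output, which is not reproduced in this extract), with n = 80 observations and k = 5 predictors:

```latex
t_{STAT} = \frac{b_j - 0}{S_{b_j}},
\qquad
\text{df} = n - k - 1 = 80 - 5 - 1 = 74
```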
45
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. The p-value of the test is _______.
46
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. The value of the test statistic is ____.
47
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. The p-value of the test statistic for the contribution of the curvilinear term is ______.
48
The _____ (larger/smaller) the value of the Variance Inflationary Factor, the higher is the collinearity of the X variables.
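For context, the standard definition of the Variance Inflationary Factor for predictor j, where R_j^2 is the coefficient of multiple determination from regressing X_j on all the other X variables:

```latex
VIF_j = \frac{1}{1 - R_j^2}
```

With no linear association between X_j and the other predictors, R_j^2 = 0 and VIF_j = 1; as R_j^2 approaches 1, VIF_j grows without bound.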
49
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. The value of the test statistic is ______.
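A sketch of the overall F statistic, assuming the quadratic model has k = 2 terms (linear and squared centered dose) and n = 14 observations; MSR and MSE would be read from the ANOVA portion of the SUMMARY OUTPUT, which is not reproduced in this extract:

```latex
F_{STAT} = \frac{MSR}{MSE} = \frac{SSR / k}{SSE / (n - k - 1)},
\qquad
\text{df} = (k,\; n - k - 1) = (2,\; 11)
```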
50
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. The p-value of the test is _______.
51
The Variance Inflationary Factor (VIF) measures the correlation of the X variables with the Y variable.
52
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what is the value of the test statistic to determine whether TEMP makes a significant contribution to the regression model in the presence of the other independent variables at a 5% level of significance?
53
In multiple regression, the _____ procedure permits variables to enter and leave the model at different stages of its development.
54
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R_j^2) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, what is the p-value of the test statistic to determine whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance?
55
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what is the p-value of the test statistic to determine whether TEMP makes a significant contribution to the regression model in the presence of the other independent variables at a 5% level of significance?
56
A regression diagnostic tool used to study the possible effects of collinearity is ______.
57
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R_j^2) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is not significant at a 5% level of significance.
58
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, _______ of the variation in ATTENDANCE can be explained by the five independent variables.
59
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination (R_j^2) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subset regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, what is the value of the test statistic to determine whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance?
60
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if the game was played on Friday, Saturday, or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination (R_j^2) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, what are, respectively, the values of the variance inflationary factor of the 5 predictors?
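The calculation can be sketched in a few lines, using the standard relation VIF = 1 / (1 - R_j^2) and the five values quoted in the table description. Pairing the values with TEMP, WIN%, OPWIN%, WEEKEND, and PROMOTION in the order the variables are listed is an assumption here, and the function and dictionary names are hypothetical.

```python
def vif(r2_j: float) -> float:
    """Variance inflationary factor from a predictor's R_j^2."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R_j^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


# R_j^2 values from Table 15-9; the predictor order is assumed, not confirmed.
r2_by_predictor = {
    "TEMP": 0.2675,
    "WIN%": 0.3101,
    "OPWIN%": 0.1038,
    "WEEKEND": 0.7325,
    "PROMOTION": 0.7308,
}

vifs = {name: vif(r2) for name, r2 in r2_by_predictor.items()}

for name, value in vifs.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the five values come out to roughly 1.37, 1.45, 1.12, 3.74, and 3.71.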
61
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.01, she would decide that there is a significant curvilinear relationship.
62
TABLE 15-9
Many factors determine the attendance at Major League Baseball games. These factors can include when the game is played, the weather, the opponent, whether or not the team is having a good season, and whether or not a marketing promotion is held. Data from 80 games of the Kansas City Royals for the following variables are collected.
ATTENDANCE = Paid attendance for the game
TEMP = High temperature for the day
WIN% = Team's winning percentage at the time of the game
OPWIN% = Opponent team's winning percentage at the time of the game
WEEKEND = 1 if game played on Friday, Saturday or Sunday; 0 otherwise
PROMOTION = 1 if a promotion was held; 0 if no promotion was held
The regression results using attendance as the dependent variable and the remaining five variables as the independent variables are presented below.
The coefficients of multiple determination ($R_j^2$) of each of the 5 predictors with all the other remaining predictors are, respectively, 0.2675, 0.3101, 0.1038, 0.7325, and 0.7308.
-Referring to Table 15-9, there is reason to suspect collinearity between some pairs of predictors.
63
One of the consequences of collinearity in multiple regression is inflated standard errors in some or all of the estimated slope coefficients.
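The link between collinearity and standard errors can be made explicit. In the standard least-squares result below (stated here as background, in notation assumed to match the rest of the deck), the estimated variance of slope b_j is the variance it would have with uncorrelated predictors multiplied by VIF_j.

```latex
\[
S_{b_j}^2
= \frac{MSE}{(1 - R_j^2)\,\sum_{i=1}^{n} (X_{ij} - \bar{X}_j)^2}
= VIF_j \cdot \frac{MSE}{\sum_{i=1}^{n} (X_{ij} - \bar{X}_j)^2},
\qquad VIF_j = \frac{1}{1 - R_j^2}
\]
```

The standard error therefore grows by a factor of $\sqrt{VIF_j}$ as $R_j^2$ increases.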
64
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subsets regressions is given below:
Following is the residual plot for % Attendance:

Following is the output of several multiple regression models:
-Referring to Table 15-8, the residual plot suggests that a nonlinear model on % attendance may be a better model.
65
One of the consequences of collinearity in multiple regression is biased estimates on the slope coefficients.
66
Only when all three of the hat matrix elements $h_i$, the Studentized deleted residuals $t_i$, and Cook's distance statistic $D_i$ reveal consistent results should an observation be removed from the regression analysis.
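To make the three diagnostics concrete, here is a minimal Python sketch that computes $h_i$, $t_i$ and $D_i$ for a simple linear regression (one predictor, so p = 2 estimated parameters). The data are hypothetical and chosen so that the last observation is unusual; the function name and the 2p/n leverage cutoff are illustrative assumptions rather than anything specified in this deck.

```python
import math


def influence_diagnostics(x, y):
    """Return (h_i, t_i, D_i) per observation for a simple linear
    regression of y on x: leverage, Studentized deleted residual,
    and Cook's distance."""
    n, p = len(x), 2  # p = intercept + one slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ssx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in resid)
    mse = sse / (n - p)
    rows = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - x_bar) ** 2 / ssx  # hat matrix diagonal
        t = e * math.sqrt((n - p - 1) / (sse * (1 - h) - e ** 2))
        d = e ** 2 * h / (p * mse * (1 - h) ** 2)
        rows.append((h, t, d))
    return rows


# Hypothetical data: the last point is far from the others in x
# and does not follow their trend.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 25.0]

diagnostics = influence_diagnostics(x, y)
leverage_cutoff = 2 * 2 / len(x)  # assumed rule of thumb: 2p/n
for i, (h, t, d) in enumerate(diagnostics, start=1):
    flag = "high leverage" if h > leverage_cutoff else ""
    print(f"obs {i:>2}: h={h:.3f}  t={t:8.3f}  D={d:8.3f}  {flag}")
```

The three measures answer different questions (unusual X value, unusual Y value given X, and overall effect on the fitted coefficients), which is why they are usually read together rather than in isolation.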
67
Two simple regression models were used to predict a single dependent variable. Both models were highly significant, but when the two independent variables were placed in the same multiple regression model for the dependent variable, $R^2$ did not increase substantially and the parameter estimates for the model were not significantly different from 0. This is probably an example of collinearity.
68
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use an F test to determine if there is a significant curvilinear relationship between time and dose. If she chooses to use a level of significance of 0.05, she would decide that there is a significant curvilinear relationship.
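One common reason for centering X before fitting a quadratic term is to reduce the correlation between X and its square. The short Python sketch below illustrates the effect with hypothetical, evenly spaced doses; these are not the Table 15-7 data, and `pearson_r` is a helper written for this example.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical doses for 14 patients (not the Table 15-7 data).
dose = [float(d) for d in range(1, 15)]
mean_dose = sum(dose) / len(dose)
centered = [d - mean_dose for d in dose]

r_raw = pearson_r(dose, [d ** 2 for d in dose])
r_centered = pearson_r(centered, [c ** 2 for c in centered])

print(f"corr(X, X^2), raw dose:      {r_raw:.4f}")
print(f"corr(X, X^2), centered dose: {abs(r_centered):.4f}")
```

With doses that are symmetric about their mean, centering removes the linear correlation between the two terms entirely; with real, asymmetric data it typically reduces it without eliminating it.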
69
TABLE 15-3
A certain type of rare gem serves as a status symbol for many of its owners. In theory, demand is higher at low prices and decreases as the price of the gem increases. However, experts hypothesize that when the gem is valued at very high prices, the demand increases with price due to the status owners believe they gain in obtaining the gem. Thus, the model proposed to best explain the demand for the gem by its price is the quadratic model:
$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
where Y = demand (in thousands) and X = retail price per carat.
This model was fit to data collected for a sample of 12 rare gems of this type. A portion of the computer analysis obtained from Microsoft Excel is shown below:
SUMMARY OUTPUT
-Referring to Table 15-3, a more parsimonious simple linear model is likely to be statistically superior to the fitted curvilinear model for predicting demand (Y).
70
Collinearity is present when there is a high degree of correlation between independent variables.
71
The goals of model building are to find a good model with the fewest independent variables that is easier to interpret and has a lower probability of collinearity.
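Two statistics often used to weigh fit against the number of predictors are adjusted $R^2$ and Mallows' $C_p$. The Python sketch below shows one common textbook form of each; the $R^2$ values are hypothetical placeholders, n = 47 is borrowed from the Table 15-8 setting only for scale, and the function and parameter names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def mallows_cp(r2_k, r2_full, n, k, total_params):
    """Mallows' Cp for a k-predictor submodel.

    r2_full is the R^2 of the model with all candidate predictors and
    total_params counts that full model's parameters, intercept included.
    """
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))


n = 47                 # sample size (scale only)
r2_full = 0.65         # hypothetical R^2 with all 3 predictors
r2_two = 0.64          # hypothetical R^2 with 2 of the 3 predictors

print(f"adjusted R^2, 2 predictors: {adjusted_r2(r2_two, n, 2):.4f}")
print(f"adjusted R^2, 3 predictors: {adjusted_r2(r2_full, n, 3):.4f}")
print(f"Cp, 2 predictors: {mallows_cp(r2_two, r2_full, n, 2, 4):.4f}")
print(f"Cp, 3 predictors: {mallows_cp(r2_full, r2_full, n, 3, 4):.4f}")
```

Under this form, the full model always has $C_p$ equal to its number of parameters, and a submodel is usually considered adequate when its $C_p$ is close to or below k + 1.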
72
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a linear model and a curvilinear model that includes a linear term. If she used a level of significance of 0.02, she would decide that the linear model is sufficient.
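For reference, the t test for the quadratic term compares the estimated coefficient of the squared term with its standard error. With n = 14 and two terms in the model, the statistic has 11 degrees of freedom. The critical values are standard two-tail t-table entries, not numbers taken from the Excel output, which is not reproduced here.

```latex
\[
H_0\colon \beta_2 = 0 \quad \text{vs.} \quad H_1\colon \beta_2 \neq 0,
\qquad
t_{STAT} = \frac{b_2}{S_{b_2}}, \qquad df = n - 3 = 11
\]
\[
\text{Reject } H_0 \text{ if } |t_{STAT}| > t_{\alpha/2,\,11},
\qquad t_{0.01,\,11} \approx 2.718 \ (\alpha = 0.02),
\quad t_{0.025,\,11} \approx 2.201 \ (\alpha = 0.05)
\]
```

If $H_0$ is not rejected, the quadratic term does not significantly improve on the linear model at that level of significance.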
73
In stepwise regression, an independent variable is not allowed to be removed from the model once it has entered into the model.
74
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subsets regressions is given below:
Following is the residual plot for % Attendance:
Following is the output of several multiple regression models:
-Referring to Table 15-8, there is reason to suspect collinearity between some pairs of predictors.
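One way to examine a statement like this is to convert each $R_j^2$ into a variance inflationary factor and compare it with a cutoff. The Python sketch below does that for the three values quoted for Table 15-8. The cutoff of 5 is a commonly cited rule of thumb (some authors use 10) and is an assumption here, as are the function and variable names.

```python
def vif(r2_j):
    """Variance inflationary factor, VIF_j = 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r2_j)


# R_j^2 of each predictor regressed on the other two (Table 15-8).
r2_by_predictor = {
    "% Attendance": 0.0338,
    "Salaries": 0.4669,
    "Spending": 0.4743,
}

VIF_CUTOFF = 5.0  # assumed rule-of-thumb threshold, not from the table

vif_by_predictor = {name: vif(r2) for name, r2 in r2_by_predictor.items()}
for name, value in vif_by_predictor.items():
    verdict = "above cutoff" if value > VIF_CUTOFF else "below cutoff"
    print(f"{name:<13} VIF = {value:.4f}  ({verdict})")
```

Changing `VIF_CUTOFF` shows how sensitive the conclusion is to the rule of thumb being applied.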
75
Collinearity is present when there is a high degree of correlation between the dependent variable and any of the independent variables.
76
Collinearity is present if the dependent variable is linearly related to one of the explanatory variables.
77
A high value of $R^2$ significantly above 0 in multiple regression accompanied by insignificant t-values on all parameter estimates very often indicates a high correlation between independent variables in the model.
78
In data mining where huge data sets are being explored to discover relationships among a large number of variables, the best-subsets approach is more practical than the stepwise regression approach.
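A quick way to judge the practicality claim is to count how many models each approach must consider. Best-subsets regression evaluates every non-empty subset of the k candidate predictors, which is 2^k - 1 models. The Python sketch below tabulates that count; the function name is illustrative.

```python
def best_subsets_model_count(k):
    """Number of non-empty predictor subsets among k candidate predictors."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 2 ** k - 1


for k in (3, 5, 10, 20, 30):
    count = best_subsets_model_count(k)
    print(f"{k:>2} candidate predictors -> {count:,} possible models")
```

Because the count doubles with each added predictor, exhaustive evaluation quickly becomes expensive, whereas stepwise regression fits only a small fraction of these models.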
79
TABLE 15-8
The superintendent of a school district wanted to predict the percentage of students passing a sixth-grade proficiency test. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state.
Let Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.
The coefficients of multiple determination ($R_j^2$) of each of the 3 predictors with all the other remaining predictors are, respectively, 0.0338, 0.4669, and 0.4743.
The output from the best-subsets regressions is given below:
Following is the residual plot for % Attendance:
Following is the output of several multiple regression models:
-Referring to Table 15-8, the null hypothesis should be rejected when testing whether the quadratic effect of daily average of the percentage of students attending class on percentage of students passing the proficiency test is significant at a 5% level of significance.
80
TABLE 15-7
A chemist employed by a pharmaceutical firm has developed a muscle relaxant. She took a sample of 14 people suffering from extreme muscle constriction. She gave each a vial containing a dose (X) of the drug and recorded the time to relief (Y) measured in seconds for each. She fit a "centered" curvilinear model to this data. The results obtained by Microsoft Excel follow, where the dose (X) given has been "centered."
SUMMARY OUTPUT
-Referring to Table 15-7, suppose the chemist decides to use a t test to determine if there is a significant difference between a curvilinear model without a linear term and a curvilinear model that includes a linear term. Using a level of significance of 0.05, she would decide that the curvilinear model should include a linear term.