Question 1

Incorporating a categorical variable with 5 possible categories into a multiple regression model requires the use of __________ dummy variables.

Accepted Answer

The answer of Incorporating a categorical variable with 5 possible...

Question 2

Which of the following statements are true?

Accepted Answer

A)  The idea behind the stepwise procedure is that with forward selection, a single variable may be more strongly related to y than either of two or more other variables individually, but the combination of those variables may make the single variable subsequently redundant.
B)  When the predictors $x_1, x_2, \ldots, x_k$ are highly interdependent, the data is said to exhibit multicollinearity.
C)  There is unfortunately no consensus among statisticians as to what remedies are appropriate when severe multicollinearity is present. One possibility involves continuing to use a model that includes all the predictors but estimating parameters by using something other than least squares.
D)  All of the above statements are true.
E)  None of the above statements are true.

Question 3

Which of the following statements are not true?

Accepted Answer

A)  Often theoretical considerations suggest a nonlinear relation between a dependent variable and two or more independent variables, whereas on other occasions, diagnostic plots indicate that some type of nonlinear function should be used.
B)  The logistic regression model is used to relate a dichotomous variable y to a single predictor. Unfortunately, this model cannot be extended to incorporate more than one predictor.
C)  A multiple regression model with k predictors includes k+1 regression parameters $\beta_i$'s, because $\beta_0$ will always be included.
D)  All of the above statements are true.
E)  None of the above statements are true.

Question 4

Which of the following statements are true?

Accepted Answer

A)  The proportion of total variation explained by the multiple regression model is $R^2 = 1 - \frac{\operatorname{SSE}}{\operatorname{SST}}$, the coefficient of multiple determination.
B)  The coefficient of multiple determination $R^2$ is often adjusted for the number of parameters (k+1) in the model by the formula $R_a^2 = \left[ (n-1) R^2 - k \right] / [ n - (k+1) ]$.
C)  With multivariate data, there is no preliminary picture analogous to a scatter plot to indicate whether a particular multiple regression model will be judged useful.
D)  The model utility test in multiple regression involves testing $H_0 : \beta_1 = \beta_2 = \cdots = \beta_k = 0$ versus $H_a$: at least one $\beta_i \neq 0$ (i = 1, 2, ..., k).
E)  All of the above statements are true.
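
As a rough illustration of the formulas in options A and B, the sketch below computes $R^2$ and the adjusted value two ways and checks that they agree. The numbers (SST, SSE, n, k) are hypothetical and are not taken from any question on this page.

```python
# Hypothetical inputs, chosen only for illustration.
sst = 100.0   # total sum of squares
sse = 20.0    # error sum of squares
n = 30        # number of observations
k = 3         # number of predictors (so k + 1 parameters)

# Coefficient of multiple determination.
r2 = 1 - sse / sst

# Adjusted R^2 using the formula quoted in option B.
r2_adj_formula = ((n - 1) * r2 - k) / (n - (k + 1))

# Equivalent form: 1 - [(n - 1) / (n - (k + 1))] * SSE / SST.
r2_adj_alt = 1 - (n - 1) / (n - (k + 1)) * sse / sst

print(f"R^2 = {r2:.4f}, adjusted R^2 = {r2_adj_formula:.4f}")
```

The two adjusted forms are algebraically identical, so any difference between them would only be floating-point noise.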

Question 5

In multiple regression analysis with n observations and k predictors (or equivalently k+1 parameters), inferences concerning a single parameter $\beta_i$ are based on the standardized variable $T = ( \hat{\beta}_i - \beta_i ) / S_{\hat{\beta}_i}$, which has a t-distribution with degrees of freedom equal to

Accepted Answer

A)  n-k+1 
B)  n-k 
C)  n-k-1 
D)  n+k-1 
E)  n+k+1 

Question 6

The following data resulted from an experiment to assess the potential of unburnt colliery spoil as a medium for plant growth. The variables are x = acid extractable cations and y = exchangeable acidity/total cation exchange capacity.
$$\begin{array}{l}
\begin{array}{c c c c c c c c}
\hline x & -23 & -5 & 16 & 26 & 30 & 38 & 52 \\
\hline y & 1.50 & 1.46 & 1.32 & 1.17 & .96 & .78 & .77 \\
\hline
\end{array}\\
\begin{array}{c c c c c c c}
\hline x & 58 & 67 & 81 & 96 & 100 & 113 \\
\hline y & .91 & .78 & .69 & .52 & .48 & .55 \\
\hline
\end{array}
\end{array}$$
Standardizing the independent variable x to obtain $x' = (x - \bar{x}) / s_x$ and fitting the regression function $y = \beta_0 + \beta_1 x' + \beta_2 (x')^2$ yielded the accompanying computer output.
a. Estimate $\mu_{Y \cdot 50}$.
b. Compute the value of the coefficient of multiple determination.
c. What is the estimated regression function $\hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2$ using the unstandardized variable x?
d. What is the estimated standard deviation of $\hat{\beta}_2$ computed in part (c)?
e. Carry out a test using the standardized estimates to decide whether the quadratic term should be retained in the model. Repeat using the unstandardized estimates. Do your conclusions differ?

Accepted Answer

a. [two expressions missing from the source] b. SST = 1.456923 and SSE = .11

Question 7

In general, with $SSE_k$ the error sum of squares from a kth degree polynomial, $SSE_{k'}$ ____________ $SSE_k$, and $R_{k'}^2$ ____________ $R_k^2$ whenever $k' > k$.

Accepted Answer

The answer of In general, with \(S S E _...

Question 8

For the quadratic model with regression function $\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2$, the parameters $\beta_0$, $\beta_1$, and $\beta_2$ characterize the behavior of the function near

Accepted Answer

A)  x = 2.0 
B)  x = 1.5 
C)  x = 1.0 
D)  x = .05 
E)  x = 0.0 

Question 9

Suppose the variables x = commuting distance and y = commuting time are related according to the simple linear regression model with $\sigma = 10$.
a. If n = 5 observations are made at the x values $x_1 = 4$, $x_2 = 9$, $x_3 = 14$, $x_4 = 19$, and $x_5 = 24$, calculate the standard deviations of the five corresponding residuals.
b. Repeat part (a) for $x_1 = 4$, $x_2 = 9$, $x_3 = 14$, $x_4 = 19$, and $x_5 = 49$.
c. What do the results of parts (a) and (b) imply about the deviation of the estimated line from the observation made at the largest sampled x value?

Accepted Answer

a. 6.32, 8.37, 8.94, 8.37, and 6.32 for
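
The values quoted for part (a) are consistent with the standard result for simple linear regression, $V(Y_i - \hat{Y}_i) = \sigma^2 \left[ 1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{S_{xx}} \right]$. The sketch below applies that formula to both sets of x values from the question; the function name `residual_sds` is an illustrative choice, not something from the source.

```python
import math

def residual_sds(xs, sigma):
    """Standard deviation of each residual in simple linear regression,
    using V(e_i) = sigma^2 * (1 - 1/n - (x_i - xbar)^2 / Sxx)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return [sigma * math.sqrt(1 - 1 / n - (x - xbar) ** 2 / sxx) for x in xs]

sigma = 10
sd_a = residual_sds([4, 9, 14, 19, 24], sigma)   # part (a)
sd_b = residual_sds([4, 9, 14, 19, 49], sigma)   # part (b)

print([round(s, 2) for s in sd_a])
print([round(s, 2) for s in sd_b])
```

Comparing the last entry of each list bears on part (c): moving the largest x value far from the others shrinks the standard deviation of its residual, so the fitted line is pulled close to that observation.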

Question 10

A function relating y to x is ___________ if by means of a transformation on x and/or y, the function can be expressed as $y' = \beta_0 + \beta_1 x'$, where $x'$ is the transformed independent variable and $y'$ is the transformed dependent variable.

Accepted Answer

intrinsica

Question 11

When the number of predictors is too large to allow for an explicit or implicit examination of all possible subsets, several alternative selection procedures generally will identify good models. The simplest such procedure is the __________, known as the BE method.

Accepted Answer

backward e

Question 12

For a multiple regression model with $\sum ( y_i - \bar{y} )^2 = 250$ and $\sum ( y_i - \hat{y}_i )^2 = 60$, the proportion of the total variation in the observed $y_i$'s that is not explained by the model is

Accepted Answer

A)  .76 
B)  .24 
C)  310 
D)  190 
E)  .52 
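
A short worked check, assuming the usual identification of $\sum (y_i - \bar{y})^2$ with SST and $\sum (y_i - \hat{y}_i)^2$ with SSE; the variable names are illustrative.

```python
sst = 250.0   # sum of (y_i - ybar)^2, total variation
sse = 60.0    # sum of (y_i - yhat_i)^2, unexplained variation

unexplained = sse / sst        # proportion NOT explained by the model
explained = 1 - unexplained    # this is R^2

print(f"unexplained = {unexplained:.2f}, explained = {explained:.2f}")
```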

Question 13

In many multiple regression data sets, the predictors $x_1, x_2, \ldots, x_k$ are highly interdependent. When the sample $x_i$ values can be predicted very well from the other predictor values, for at least one predictor, the data is said to exhibit __________.

Accepted Answer

The answer of In many multiple regression data sets, the...

Question 14

Which of the following statements are not true?

Accepted Answer

A)  In analyzing transformed data, one should keep in mind that if a transformation on y has been made and one wishes to use the standardized formulas to test hypotheses or construct confidence intervals, the transformed error term $\varepsilon'$ should be at least approximately normally distributed.
B)  When y is transformed, the $R^2$ coefficient of determination value from the resulting regression refers to variation in the $y_i$'s explained by the original (non-transformed) regression model.
C)  The additive exponential and power models, $Y = \alpha e^{\beta x} + \varepsilon$ and $Y = \alpha x^{\beta} + \varepsilon$, respectively, are not intrinsically linear.
D)  When the transformed model satisfies all required assumptions, the method of least squares yields best estimates of the transformed parameters. However, estimates of the original parameters may not be best in any sense, though they will be reasonable.
E)  All of the above statements are true.

Question 15

The regression coefficient $\beta_2$ in the multiple regression model $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$ is interpreted as the expected change in ___________ associated with a 1-unit increase in ___________, while ___________ are held fixed.

Accepted Answer

The answer of The regression coefficient \(\beta _ { 2...

Question 16

Wear resistance of certain nuclear reactor components made of Zircaloy-2 is partly determined by properties of the oxide layer. The following data appear in a study that proposed a new nondestructive testing method to monitor thickness of the layer. The variables are x = oxide-layer thickness ($\mu$m) and y = eddy-current response (arbitrary units).
$$\begin{array}{c c c c c c c c c c c}
\hline x & 0 & 7 & 17 & 114 & 133 & 142 & 190 & 218 & 237 & 285 \\
\hline y & 20.3 & 19.8 & 19.5 & 15.9 & 15.1 & 14.7 & 11.9 & 11.5 & 8.3 & 6.6 \\
\hline
\end{array}$$
The equation of the least squares line is $\hat{y} = 20.6 - .047x$. Calculate and plot the residuals against x and then comment on the appropriateness of the simple linear regression model.

Accepted Answer

The (x, residual) pairs for the plot are
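
Since the list of pairs is cut off above, the sketch below shows one way to recompute them from the data and the least squares line given in the question, with residual $= y - \hat{y}$. It only prints the pairs; plotting is left out to keep the example dependency-free.

```python
# Data from Question 16.
x = [0, 7, 17, 114, 133, 142, 190, 218, 237, 285]
y = [20.3, 19.8, 19.5, 15.9, 15.1, 14.7, 11.9, 11.5, 8.3, 6.6]

def fitted(xi):
    """Least squares line stated in the question: y-hat = 20.6 - .047x."""
    return 20.6 - 0.047 * xi

# Residual = observed y minus fitted y.
residuals = [yi - fitted(xi) for xi, yi in zip(x, y)]

for xi, ei in zip(x, residuals):
    print(f"x = {xi:3d}   residual = {ei:6.3f}")
```

When judging the fit, it helps to look at the run of signs in the printed residuals as x increases, not just their magnitudes.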

Question 17

Cardiorespiratory fitness is widely recognized as a major component of overall physical well-being. Direct measurement of maximal oxygen uptake ($\mathrm{VO}_2 \max$) is the single best measure of such fitness, but direct measurement is time-consuming and expensive. It is therefore desirable to have a prediction equation for $\mathrm{VO}_2 \max$ in terms of easily obtained quantities. Consider the variables $y = \mathrm{VO}_2 \max$ (L/min), $x_1$ = weight (kg), $x_2$ = age (yr), $x_3$ = time necessary to walk 1 mile (min), and $x_4$ = heart rate at the end of the walk (beats/min). Here is one possible model, for male students: $Y = 5.0 + .015 x_1 - .053 x_2 - .134 x_3 - .011 x_4 + \varepsilon$, with $\sigma = .4$.
a. Interpret $\beta_1$ and $\beta_3$.
b. What is the expected value of $\mathrm{VO}_2 \max$ when weight is 75 kg, age is 20 yr, walk time is 15 min, and heart rate is 140 beats/min?
c. What is the probability that $\mathrm{VO}_2 \max$ will be between 1.00 and 2.60 for a single observation made when the values of the predictors are as stated in part (b)?

Accepted Answer

a. Holding age, time, and heart rate con
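
Parts (b) and (c) are plain arithmetic once the model is written down. The sketch below evaluates the stated regression function at the given predictor values and then computes the interval probability, assuming (as is usual for this model, though not stated on this page) that $\varepsilon$ is normally distributed with mean 0 and standard deviation $\sigma = .4$. The helper names are illustrative.

```python
import math

def expected_vo2max(weight, age, walk_time, heart_rate):
    """Mean VO2max (L/min) under the model stated in the question."""
    return (5.0 + 0.015 * weight - 0.053 * age
            - 0.134 * walk_time - 0.011 * heart_rate)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sigma = 0.4  # from the question

# Part (b): expected value at weight 75 kg, age 20 yr, 15 min, 140 beats/min.
mean = expected_vo2max(75, 20, 15, 140)

# Part (c): P(1.00 <= Y <= 2.60), ASSUMING normally distributed errors.
prob = normal_cdf((2.60 - mean) / sigma) - normal_cdf((1.00 - mean) / sigma)

print(f"expected VO2max = {mean:.3f} L/min")
print(f"P(1.00 <= Y <= 2.60) = {prob:.4f}")
```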

Question 18

It is important to find characteristics of the production process that produce tortilla chips with an appealing texture. The following data on x = frying time (sec) and y = moisture content (%) are obtained:
$$\begin{array}{c c c c c c c c c}
\hline x & 5 & 10 & 15 & 20 & 25 & 30 & 45 & 60 \\
\hline y & 16.3 & 11.4 & 8.1 & 4.5 & 3.4 & 2.9 & 1.9 & 1.3 \\
\hline
\end{array}$$
a. Construct a scatter plot of y versus x and comment.
b. Construct a scatter plot of the (ln(x), ln(y)) pairs and comment.
c. What probabilistic relationship between x and y is suggested by the linear pattern in the plot of part (b)?
d. Predict the value of moisture content when frying time is 20 in a way that conveys information about reliability and precision.

Accepted Answer

a.
The points have a definite curved pat
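
For parts (b) through (d), one possible approach is to fit a straight line to the (ln(x), ln(y)) pairs by ordinary least squares, which corresponds to a power relationship between x and y. The sketch below does this in plain Python. It gives only a point prediction at x = 20, not the prediction interval that part (d) asks for, and all variable names are illustrative.

```python
import math

# Data from Question 18.
x = [5, 10, 15, 20, 25, 30, 45, 60]
y = [16.3, 11.4, 8.1, 4.5, 3.4, 2.9, 1.9, 1.3]

# Transform both variables with the natural log.
lx = [math.log(v) for v in x]
ly = [math.log(v) for v in y]

n = len(lx)
mx = sum(lx) / n
my = sum(ly) / n
sxx = sum((u - mx) ** 2 for u in lx)
syy = sum((v - my) ** 2 for v in ly)
sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))

# Least squares line for ln(y) on ln(x).
slope = sxy / sxx
intercept = my - slope * mx
r2 = sxy ** 2 / (sxx * syy)

# Point prediction of y at x = 20, back-transformed from the log scale.
pred20 = math.exp(intercept + slope * math.log(20))

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r^2 = {r2:.3f}")
print(f"predicted moisture content at x = 20: {pred20:.2f}")
```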

Question 19

If the regression parameters $\beta_0$ and $\beta_1$ are estimated by minimizing the expression $f_w ( b_0, b_1 ) = \sum w_i \left[ y_i - ( b_0 + b_1 x_i ) \right]^2$, where the $w_i$'s are weights that decrease with increasing $x_i$, this yields ____________ estimates.

Accepted Answer

weighted l
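
Minimizing $f_w$ has a closed form that mirrors the ordinary least squares formulas with weighted means in place of plain means. The sketch below is a minimal illustration under that standard derivation; the function name, the toy data, and the choice of weights $w_i = 1/x_i$ are assumptions made for the example only.

```python
def weighted_line_fit(x, y, w):
    """Minimize sum(w_i * (y_i - (b0 + b1 * x_i))**2) over b0, b1.

    Setting both partial derivatives to zero gives
    b1 = Sxy_w / Sxx_w and b0 = ybar_w - b1 * xbar_w,
    where the sums and means are weighted by w.
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Toy data lying exactly on y = 2 + 3x (hypothetical).
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
w = [1.0 / xi for xi in x]   # weights that decrease as x increases

b0, b1 = weighted_line_fit(x, y, w)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Because the toy points lie exactly on a line, any positive weights recover the same intercept and slope; with noisy data the weights would change the fit by down-weighting observations at large x.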

Question 20

In each of the following cases, decide whether the given function is intrinsically linear. If so, identify $x'$ and $y'$ and then explain how a random error term $\varepsilon$ can be introduced to yield an intrinsically linear probabilistic model.
a. $y = 1 / ( \alpha + \beta x )$
b. $y = 1 / \left( 1 + e^{\alpha + \beta x} \right)$
c. (a Gompertz curve)
d. $y = \alpha + \beta e^{\lambda x}$

Accepted Answer

a. The corresponding probabilistic mode

Exam 13: Nonlinear and Multiple Regression
