Exam 4: Linear Regression With One Regressor

In which of the following relationships does the intercept have a real-world interpretation?

(Multiple Choice)

Correct Answer:

A

Prove that the regression R² is identical to the square of the correlation coefficient between the two variables Y and X. Regression functions are written in a form that suggests causation running from X to Y. Given your proof, does a high regression R² present supportive evidence of a causal relationship? Can you think of some regression examples where the direction of causality is not clear? Of some where it is without a doubt?

(Essay)

Correct Answer:

The regression $R^2 = \frac{ESS}{TSS}$, where $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$. But $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$ and $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$. Hence $(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 (X_i - \bar{X})^2$, and therefore $ESS = \hat{\beta}_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$. Using small letters to indicate deviations from the mean, i.e., $z_i = Z_i - \bar{Z}$, and recalling that $\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2}$, we get that the regression

$$R^2 = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}.$$

The square of the correlation coefficient is

$$r^2 = \frac{\left( \sum_{i=1}^{n} x_i y_i \right)^2}{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i^2} = \frac{\left( \sum_{i=1}^{n} x_i y_i \right)^2}{\left( \sum_{i=1}^{n} x_i^2 \right)^2} \cdot \frac{\sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2} = \frac{\hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} y_i^2}.$$

Hence the two are the same. Correlation does not imply causation, so a high regression R² is not by itself evidence of a causal relationship. Income is a regressor in the consumption function, yet consumption enters on the right-hand side of the GDP identity. Regressing the weight of individuals on their height is a situation where the direction of causality is without doubt, since otherwise the author of this test bank would have to be seven feet tall. The authors of the textbook use weather data to forecast orange juice prices later in the text.
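The algebra above can also be checked numerically. A minimal sketch on made-up data (the seed, sample size, and coefficients are arbitrary):

```python
import numpy as np

# Numerical check of the derivation: the regression R^2 (= ESS/TSS)
# should equal the squared sample correlation of Y and X.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2.0 + 1.5 * x + rng.normal(size=50)

xd, yd = x - x.mean(), y - y.mean()        # deviations from means
beta1 = (xd @ yd) / (xd @ xd)              # OLS slope
beta0 = y.mean() - beta1 * x.mean()        # OLS intercept
yhat = beta0 + beta1 * x

r2_regression = ((yhat - y.mean()) ** 2).sum() / (yd ** 2).sum()  # ESS/TSS
r = (xd @ yd) / np.sqrt((xd @ xd) * (yd @ yd))                    # corr(X, Y)
print(np.isclose(r2_regression, r ** 2))   # True
```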

Assume that there is a change in the units of measurement on X. The new variable is X* = bX. Prove that this change in the units of measurement of the explanatory variable has no effect on the intercept in the resulting regression.

(Essay)

Correct Answer:

Consider the sample regression function $\hat{Y} = \hat{\beta}_0^* + \hat{\beta}_1^* X^*$. The formula for the intercept is $\hat{\beta}_0^* = \bar{Y} - \hat{\beta}_1^* \bar{X}^* = \bar{Y} - \hat{\beta}_1^* b \bar{X}$. But

$$\hat{\beta}_1^* = \frac{\sum_{i=1}^{n} x_i^* y_i}{\sum_{i=1}^{n} x_i^{*2}} = \frac{\sum_{i=1}^{n} (b x_i) y_i}{\sum_{i=1}^{n} (b x_i)^2} = \frac{b \sum_{i=1}^{n} x_i y_i}{b^2 \sum_{i=1}^{n} x_i^2} = \frac{1}{b} \hat{\beta}_1.$$

Hence $\hat{\beta}_0^* = \bar{Y} - \frac{1}{b} \hat{\beta}_1 \, b \bar{X} = \bar{Y} - \hat{\beta}_1 \bar{X} = \hat{\beta}_0$.
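The result can be verified on made-up data. In this sketch, the data-generating process and the scale factor b = 12 (e.g., feet to inches) are illustrative:

```python
import numpy as np

# Rescaling X* = bX should divide the OLS slope by b and leave the
# intercept unchanged. Data and b are made up for illustration.
rng = np.random.default_rng(1)
x = rng.normal(10, 2, size=100)
y = 3.0 + 0.5 * x + rng.normal(size=100)
b = 12.0

def ols(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    xd, yd = x - x.mean(), y - y.mean()
    slope = (xd @ yd) / (xd @ xd)
    return y.mean() - slope * x.mean(), slope

b0, b1 = ols(x, y)
b0_star, b1_star = ols(b * x, y)
print(np.isclose(b1_star, b1 / b))  # slope scales by 1/b
print(np.isclose(b0_star, b0))      # intercept is unaffected
```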

In deriving the OLS estimator, you minimize the sum of squared residuals with respect to the two parameters $\hat{\beta}_0$ and $\hat{\beta}_1$. The resulting two equations imply two restrictions that OLS places on the data, namely that $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that you get the same formula for the regression slope and the intercept if you impose these two conditions on the sample regression function.

(Essay)

The normal approximation to the sampling distribution of $\hat{\beta}_1$ is powerful because

(Multiple Choice)

To obtain the slope estimator using the least squares principle, you divide the

(Multiple Choice)

Your textbook presented you with the following regression output:

$\widehat{TestScore}$ = 698.9 − 2.28 × STR, n = 420, R² = 0.051, SER = 18.6

(a) How would the slope coefficient change if you decided one day to measure test scores in 100s, i.e., a test score of 650 became 6.5? Would this have an effect on your interpretation?
(b) Do you think the regression R² will change? Why or why not?
(c) Although Chapter 4 in your textbook did not deal with hypothesis testing, it presented you with the large-sample distribution for the slope and the intercept estimator. Given the change in the units of measurement in (a), do you think that the variance of the slope estimator will change numerically? Why or why not?

(Essay)
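For parts (a) and (b) of the question above, a quick simulation may help build intuition. The data below are made up to mimic the reported output (n = 420, SER = 18.6); they are not the actual California data:

```python
import numpy as np

# Dividing the dependent variable by 100 should scale both OLS
# coefficients by 1/100 and leave R^2 (a unit-free ratio) unchanged.
rng = np.random.default_rng(3)
str_ = rng.uniform(15, 25, size=420)                  # student-teacher ratios
score = 698.9 - 2.28 * str_ + rng.normal(0, 18.6, 420)

def ols_r2(x, y):
    """Return (intercept, slope, R^2) from a regression of y on x."""
    xd, yd = x - x.mean(), y - y.mean()
    b1 = (xd @ yd) / (xd @ xd)
    b0 = y.mean() - b1 * x.mean()
    r2 = ((b0 + b1 * x - y.mean()) ** 2).sum() / (yd ** 2).sum()
    return b0, b1, r2

b0, b1, r2 = ols_r2(str_, score)          # original units
c0, c1, s2 = ols_r2(str_, score / 100)    # test scores measured in 100s
print(np.isclose(c1, b1 / 100))  # slope scales by 1/100
print(np.isclose(c0, b0 / 100))  # so does the intercept
print(np.isclose(r2, s2))        # R^2 is unchanged
```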

Assume that you have collected a sample of observations from over 100 households and their consumption and income patterns. Using these observations, you estimate the following regression: $C_i = \beta_0 + \beta_1 Y_i + u_i$, where C is consumption and Y is disposable income. The estimate of $\beta_1$ will tell you

(Multiple Choice)

A necessary and sufficient condition to derive the OLS estimator is that the following two conditions hold: $\sum_{i=1}^{n} \hat{u}_i = 0$ and $\sum_{i=1}^{n} \hat{u}_i X_i = 0$. Show that these conditions imply that $\sum_{i=1}^{n} \hat{u}_i \hat{Y}_i = 0$.

(Essay)
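The two conditions, and the implication the question above asks for, can be confirmed numerically. A sketch on made-up data:

```python
import numpy as np

# Run OLS, then check the two normal-equation restrictions and the
# implied orthogonality of residuals and fitted values.
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=60)
y = 1.0 - 0.8 * x + rng.normal(size=60)

xd = x - x.mean()
beta1 = (xd @ (y - y.mean())) / (xd @ xd)
beta0 = y.mean() - beta1 * x.mean()
yhat = beta0 + beta1 * x
u = y - yhat                            # OLS residuals

print(np.isclose(u.sum(), 0))           # sum of residuals is zero
print(np.isclose(u @ x, 0))             # residuals orthogonal to X
print(np.isclose(u @ yhat, 0))          # hence orthogonal to fitted values
```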

In order to calculate the regression R2 you need the TSS and either the SSR or the ESS. The TSS is fairly straightforward to calculate, being just the variation of Y. However, if you had to calculate the SSR or ESS by hand (or in a spreadsheet), you would need all fitted values from the regression function and their deviations from the sample mean, or the residuals. Can you think of a quicker way to calculate the ESS simply using terms you have already used to calculate the slope coefficient?

(Essay)
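One candidate shortcut, sketched below on made-up data, uses only the deviation sums already computed for the slope; this is one possible answer, not the only route:

```python
import numpy as np

# In deviations from means, Yhat_i - Ybar = beta1_hat * x_i, so
# ESS = beta1_hat^2 * sum(x_i^2): no fitted values are needed.
rng = np.random.default_rng(4)
x = rng.normal(size=80)
y = 0.3 + 2.0 * x + rng.normal(size=80)

xd, yd = x - x.mean(), y - y.mean()
beta1 = (xd @ yd) / (xd @ xd)
yhat = (y.mean() - beta1 * x.mean()) + beta1 * x

ess_long = ((yhat - y.mean()) ** 2).sum()   # textbook definition
ess_quick = beta1 ** 2 * (xd ** 2).sum()    # shortcut
print(np.isclose(ess_long, ess_quick))      # True
```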

A peer of yours, who is a major in another social science, says he is not interested in the regression slope and/or intercept. Instead he only cares about correlations. For example, in the test score/student-teacher ratio regression, he claims to get all the information he needs from the negative correlation coefficient corr(X,Y) = −0.226. What response might you have for your peer?

(Essay)

You have obtained a sample of 14,925 individuals from the Current Population Survey (CPS) and are interested in the relationship between average hourly earnings and years of education. The regression yields the following result:

$\widehat{ahe}$ = −4.58 + 1.71 × educ, R² = 0.182, SER = 9.30

where ahe and educ are measured in dollars and years respectively.
a. Interpret the coefficients and the regression R².
b. Is the effect of education on earnings large?
c. Why should education matter in the determination of earnings? Do the results suggest that there is a guarantee for average hourly earnings to rise for everyone as they receive an additional year of education? Do you think that the relationship between education and average hourly earnings is linear?
d. The average years of education in this sample is 13.5 years. What is the mean of average hourly earnings in the sample?
e. Interpret the measure SER. What is its unit of measurement?

(Essay)
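The arithmetic behind part (d) of the question above is a one-liner, since the OLS regression line passes through the point of sample means:

```python
# mean(ahe) can be read off the fitted line at mean(educ) = 13.5,
# because the OLS line passes through (mean(educ), mean(ahe)).
ahe_mean = -4.58 + 1.71 * 13.5
print(ahe_mean)  # about 18.51 dollars per hour
```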

Indicate in a scatterplot what the data for your dependent variable and your explanatory variable would look like in a regression with an R2 equal to zero. How would this change if the regression R2 was equal to one?

(Essay)

In the simple linear regression model Yi = β0 + β1Xi + ui,

(Multiple Choice)

In the simple linear regression model, the regression slope

(Multiple Choice)

Sir Francis Galton, a cousin of Charles Darwin, examined the relationship between the height of children and their parents towards the end of the 19th century. It is from this study that the name "regression" originated. You decide to update his findings by collecting data from 110 college students, and estimate the following relationship:

$\widehat{Studenth}$ = 19.6 + 0.73 × Midparh, R² = 0.45, SER = 2.0

where Studenth is the height of students in inches, and Midparh is the average of the parental heights. (Following Galton's methodology, both variables were adjusted so that the average female height was equal to the average male height.)
(a) Interpret the estimated coefficients.
(b) What is the meaning of the regression R²?
(c) What is the prediction for the height of a child whose parents have an average height of 70.06 inches?
(d) What is the interpretation of the SER here?
(e) Given the positive intercept and the fact that the slope lies between zero and one, what can you say about the height of students who have quite tall parents? Those who have quite short parents?
(f) Galton was concerned about the height of the English aristocracy and referred to the above result as "regression towards mediocrity." Can you figure out what his concern was? Why do you think that we refer to this result today as "Galton's Fallacy"?

(Essay)
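The prediction in part (c) of the question above is direct substitution into the fitted line:

```python
# Part (c): plug the parents' average height of 70.06 inches into the
# fitted line reported in the regression output.
pred = 19.6 + 0.73 * 70.06
print(pred)  # about 70.74 inches
```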

The slope estimator, $\hat{\beta}_1$, has a smaller standard error, other things equal, if

(Multiple Choice)

In a simple regression with an intercept and a single explanatory variable, the variation in Y, $TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$, can be decomposed into the explained sum of squares, $ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$, and the sum of squared residuals, $SSR = \sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ (see, for example, equation (4.35) in the textbook). Consider any regression line, positively or negatively sloped, in {X,Y} space. Draw a horizontal line where, hypothetically, you consider the sample mean of Y ($= \bar{Y}$) to be. Next add a single actual observation of Y. In this graph, indicate where you find the following distances: (i) the residual, (ii) actual minus the mean of Y, (iii) fitted value minus the mean of Y.

(Essay)

At the Stock and Watson (http://www.pearsonhighered.com/stock_watson) website, go to Student Resources and select the option "Datasets for Replicating Empirical Results." Then select the "California Test Score Data Used in Chapters 4-9" and read the data either into Excel or STATA (or another statistical program). Run a regression of the average reading score (read_scr) on the average math score (math_scr). What values for the slope and the intercept would you expect? Interpret the coefficients in the resulting regression output and the regression R².

(Essay)

Given the amount of money and effort that you have spent on your education, you wonder if it was (is) all worth it. You therefore collect data from the Current Population Survey (CPS) and estimate a linear relationship between earnings and the years of education of individuals. What would be the effect on your regression slope and intercept if you measured earnings in thousands of dollars rather than in dollars? Would the regression R² be affected? Should statistical inference be dependent on the scale of variables? Discuss.

(Essay)