Exam 8: Nonlinear Regression Functions

arrow
  • Select Tags
search iconSearch Question
  • Select Tags

Including an interaction term between two independent variables, X1 and X2, allows for the following except:

Free
(Multiple Choice)
4.9/5
(39)
Correct Answer:
Verified

C

An extension of the Solow growth model that includes human capital in addition to physical capital, suggests that investment in human capital (education)will increase the wealth of a nation (per capita income). To test this hypothesis, you collect data for 104 countries and perform the following regression: = 0.046-5.869\times gpop +0.738\times+0.055\times Educ, R2=0.775, SER =0.1377 (0.079)(2.238)(0.294)(0.010) where RelPersInc is GDP per worker relative to the United States, gpop is the average population growth rate, 1980 to 1990, sK is the average investment share of GDP from 1960 to 1990, and Educ is the average educational attainment in years for 1985. Numbers in parentheses are for heteroskedasticity-robust standard errors. (a)Interpret the results and indicate whether or not the coefficients are significantly different from zero. Do the coefficients have the expected sign? (b)To test for equality of the coefficients between the OECD and other countries, you introduce a binary variable (DOECD), which takes on the value of one for the OECD countries and is zero otherwise. To conduct the test for equality of the coefficients, you estimate the following regression: =-0.068-0.063\times gpop +0.719\times+0.044\timesEduc , (0.072)(2.271)(0.365)(0.012) 0.381\timesDOECD-8.038\times(DOECD\timesgpop)-0.430\times DOECD\times (0.184)(5.366)(0.768) +0.003\times(DOECD\timesEduc),=0.845,SER=0.116 (0.018) Write down the two regression functions, one for the OECD countries, the other for the non-OECD countries. The F- statistic that all coefficients involving DOECD are zero, is 6.76. Find the corresponding critical value from the F table and decide whether or not the coefficients are equal across the two sets of countries. (c)Given your answer in the previous question, you want to investigate further. You first force the same slopes across all countries, but allow the intercept to differ. That is, you reestimate the above regression but set ?DOECD × gpop = ?DOECD × SKS _ { K } = ?DOECD × Educ = 0. The t-statistic for DOECD is 4.39. Is the coefficient, which was 0.241, statistically significant? (d)Your final regression allows the slopes to differ in addition to the intercept. The F-statistic for ?DOECD × gpop = ?DOECD × SKS _ { K } = ?DOECD × Educ = 0 is 1.05. What is your decision? Each one of the t-statistics is also smaller than the critical value from the standard normal table. Which test should you use? (e)Looking at the tests in the two previous questions, what is your conclusion?

Free
(Essay)
4.9/5
(33)
Correct Answer:
Verified

(a) A one percentage point decrease in the population growth rate increases GDP per worker relative to the United States by roughly 0.06. An increase in the investment share of 0.1 results in an increase of GDP per worker relative to the United States by approximately 0.07. For every additional year of average educational attainment, the increase is 0.055. The intercept should not be interpreted. The regression explains 77.5 percent of the variation in relative productivity. All coefficients are significantly different from zero at conventional levels. All coefficients carry the expected sign.
(b)  RelPersInc ^=0.0680.063× gpop +0.719×SK+0.044× Educ .\widehat { \text { RelPersInc } } = - 0.068 - 0.063 \times \text { gpop } + 0.719 \times S _ { K } + 0.044 \times \text { Educ } .
For the OECD countries we get

 RelPersInc ^=0.3138.101× gpop +0.289×SK+0.047× Educ .\widehat { \text { RelPersInc } } = 0.313 - 8.101 \times \text { gpop } + 0.289 \times S _ { K } + 0.047 \times \text { Educ } .
(c)Given the critical value, the coefficient is statistically significant, that is, you can reject ?DOECD = 0.
(d)Given the critical value of 3.78 at the 1% level, you cannot reject the null hypothesis that the additional coefficients are all zero. The F-test is the proper procedure to use when testing for simultaneous restrictions.
(e)There is evidence that the slopes can be set equal. However, there seems to be a level difference between the two groups of countries.

(Requires Calculus)Show that for the log-log model the slope coefficient is the elasticity.

Free
(Essay)
4.7/5
(30)
Correct Answer:
Verified

Consider the deterministic part Y = AXβ1. Then ln(Y)= β0+ β1 ln(X), where β0 = ln(A). Now ln(Y)ln(X)\frac { \partial \ln ( Y ) } { \partial \ln ( X ) } = β1 = 1YY1YX=YXXY\frac { \frac { 1 } { Y } \partial Y } { \frac { 1 } { Y } \partial X } = \frac { \partial Y } { \partial X } \frac { X } { Y } Alternatively you can derive the same result by taking the derivative YX\frac { \partial Y } { \partial X } from Y = AXβ1.

Sketch for the log-log model what the relationship between Y and X looks like for various parameter values of the slope, i.e., β1 > 1; 0 < β1 < 1; β1 = (-1).

(Essay)
5.0/5
(38)

Sports economics typically looks at winning percentages of sports teams as one of various outputs, and estimates production functions by analyzing the relationship between the winning percentage and inputs. In Major League Baseball (MLB), the determinants of winning are quality pitching and batting. All 30 MLB teams for the 1999 season. Pitching quality is approximated by "Team Earned Run Average" (ERA), and hitting quality by "On Base Plus Slugging Percentage" (OPS). Summary of the Distribution of Winning Percentage, On Base Plus Slugging Percentage, and Team Earned Run Average for MLB in 1999  Sports economics typically looks at winning percentages of sports teams as one of various outputs, and estimates production functions by analyzing the relationship between the winning percentage and inputs. In Major League Baseball (MLB), the determinants of winning are quality pitching and batting. All 30 MLB teams for the 1999 season. Pitching quality is approximated by Team Earned Run Average (ERA), and hitting quality by On Base Plus Slugging Percentage (OPS). Summary of the Distribution of Winning Percentage, On Base Plus Slugging Percentage, and Team Earned Run Average for MLB in 1999   Your regression output is:  \begin{array} { l }  \widehat { \text { Winpct } } = - 0.19 - 0.099 \times \text { teamera } + 1.490 \times \text { ops, } R 2 = 0.92 , \text { SER } = 0.02 \text {. } \\ \quad\quad\quad\quad\quad\text { (0.08) (0.008) }\quad\quad\quad\quad\quad\quad(0.126) \\ \end{array}  (a)Interpret the regression. Are the results statistically significant and important? (b)There are two leagues in MLB, the American League (AL)and the National League (NL). One major difference is that the pitcher in the AL does not have to bat. Instead there is a designated hitter in the hitting line-up. You are concerned that, as a result, there is a different effect of pitching and hitting in the AL from the NL. To test this hypothesis, you allow the AL regression to have a different intercept and different slopes from the NL regression. You therefore create a binary variable for the American League (DAL)and estimate the following specification:  \begin{aligned} \widehat { \text { Winpct } = - } & 0.29 + 0.10 \times D A L - 0.100 \times \text { teamera } + 0.008 \times ( \text { DAL } \times \text { teamera } ) \\ & ( 0.12 ) ( 0.24 ) \quad\quad\quad\quad ( 0.008 )\quad\quad\quad\quad(0.018) \\\\ & + 1.622 ^ { * } \text { ops } - 0.187 ^ { * } ( D A L \times \text { ops } ) , R 2 = 0.92 , S E R = 0.02 . \\ & ( 0.163 ) \quad\quad\quad\quad ( 0.160 ) \end{aligned}  What is the regression for winning percentage in the AL and NL? Next, calculate the t-statistics and say something about the statistical significance of the AL variables. Since you have allowed all slopes and the intercept to vary between the two leagues, what would the results imply if all coefficients involving DAL were statistically significant? (c)You remember that sequentially testing the significance of slope coefficients is not the same as testing for their significance simultaneously. Hence you ask your regression package to calculate the F-statistic that all three coefficients involving the binary variable for the AL are zero. Your regression package gives a value of 0.35. Looking at the critical value from you F-table, can you reject the null hypothesis at the 1% level? Should you worry about the small sample size? Your regression output is: =-0.19-0.099\times teamera +1.490\times ops, R2=0.92, SER =0.02 . (0.08) (0.008) (0.126) (a)Interpret the regression. Are the results statistically significant and important? (b)There are two leagues in MLB, the American League (AL)and the National League (NL). One major difference is that the pitcher in the AL does not have to bat. Instead there is a "designated hitter" in the hitting line-up. You are concerned that, as a result, there is a different effect of pitching and hitting in the AL from the NL. To test this hypothesis, you allow the AL regression to have a different intercept and different slopes from the NL regression. You therefore create a binary variable for the American League (DAL)and estimate the following specification: 0.29+0.10\timesDAL-0.100\times teamera +0.008\times( DAL \times teamera ) (0.12)(0.24)(0.008)(0.018) +1.62 ops -0.18(DAL\times ops ),R2=0.92,SER=0.02. (0.163)(0.160) What is the regression for winning percentage in the AL and NL? Next, calculate the t-statistics and say something about the statistical significance of the AL variables. Since you have allowed all slopes and the intercept to vary between the two leagues, what would the results imply if all coefficients involving DAL were statistically significant? (c)You remember that sequentially testing the significance of slope coefficients is not the same as testing for their significance simultaneously. Hence you ask your regression package to calculate the F-statistic that all three coefficients involving the binary variable for the AL are zero. Your regression package gives a value of 0.35. Looking at the critical value from you F-table, can you reject the null hypothesis at the 1% level? Should you worry about the small sample size?

(Essay)
4.9/5
(31)

Using a spreadsheet program such as Excel, plot the following logistic regression function with a single X, Y^\hat { Y } i = 11+e(β^0+β^1Xi)\frac { 1 } { 1 + e ^ { - \left( \hat\beta _ { 0 } + \hat\beta _ { 1 } X _ { i } \right) } } , where β^\hat { \beta } 0 = - 4.13 and β^\hat { \beta } 1 = 5.37. Enter values of X in the first column starting from 0 and then incrementing these by 0.1 until you reach 2.0. Then enter the logistic function formula in the next column. Finally produce a scatter plot, connecting the predicted values with a line.

(Essay)
4.9/5
(42)

You have collected data for a cross-section of countries in two time periods, 1960 and 1997, say. Your task is to find the determinants for the Wealth of a Nation (per capita income)and you believe that there are three major determinants: investment in physical capital in both time periods (X1,T and X1,0), investment in human capital or education (X2,T and X2,0), and per capita income in the initial period (Y0). You run the following regression: ln(YT)= β0 + β1X1,T + β2X1,0 + β3X2,T + β4X1,0 + ln(Y0)+ uT One of your peers suggests that instead, you should run the growth rate in per capita income over the two periods on the change in physical and human capital. For those results to be a parsimonious presentation of your initial regression, what three restrictions would have to hold? How would you test for these? The same person also points out to you that the intercept vanishes in equations where the data is differenced. Is that correct?

(Essay)
4.8/5
(32)

Pages 283-284 in your textbook contain an analysis of the "Return to Education and the Gender Gap." Column (4)in Table 8.1 displays regression results using the 2009 Current Population Survey. The equation below shows the regression result for the same specification, but using the 2005 Current Population Survey. Interpret the major results. =1.215+0.0899\times educ -0.521\times DFemme +0.0180\times( DFemme \times educ ) (0.018) (0.0011) (0.022) (0.0016) +0.0232\times exper -0.000368\times exper 2-0.058\times Midwest -0.0098\times South -0.030\times West (0.0008) (0.000018) (0.006) (0.0078) (0.0030)

(Essay)
4.7/5
(35)

The best way to interpret polynomial regressions is to

(Multiple Choice)
4.8/5
(36)

You have learned that earnings functions are one of the most investigated relationships in economics. These typically relate the logarithm of earnings to a series of explanatory variables such as education, work experience, gender, race, etc. (a)Why do you think that researchers have preferred a log-linear specification over a linear specification? In addition to the interpretation of the slope coefficients, also think about the distribution of the error term. (b)To establish age-earnings profiles, you regress ln(Earn)on Age, where Earn is weekly earnings in dollars, and Age is in years. Plotting the residuals of the regression against age for 1,744 individuals looks as shown in the figure:  You have learned that earnings functions are one of the most investigated relationships in economics. These typically relate the logarithm of earnings to a series of explanatory variables such as education, work experience, gender, race, etc. (a)Why do you think that researchers have preferred a log-linear specification over a linear specification? In addition to the interpretation of the slope coefficients, also think about the distribution of the error term. (b)To establish age-earnings profiles, you regress ln(Earn)on Age, where Earn is weekly earnings in dollars, and Age is in years. Plotting the residuals of the regression against age for 1,744 individuals looks as shown in the figure:   Do you sense a problem? (c)You decide, given your knowledge of age-earning profiles, to allow the regression line to differ for the below and above 40 years age category. Accordingly you create a binary variable, Dage, that takes the value one for age 39 and below, and is zero otherwise. Estimating the earnings equation results in the following output (using heteroskedasticity-robust standard errors):  \begin{array} { l }  \widehat { \text { LnEarn } = } 6.92 - 3.13 \times \text { Dage } - 0.019 \times \text { Age } + 0.085 \times ( \text { Dage } \times \text { Age } ) , R 2 = 0.20 , \text { SER } = 0.721 . \\ \quad\quad\quad\quad( 38.33 ) ( 0.22 )\quad\quad\quad\quad(0.004)\quad\quad\quad\quad(0.005) \\ \end{array}  Sketch both regression lines: one for the age category 39 years and under, and one for 40 and above. Does it make sense to have a negative sign on the Age coefficient? Predict the ln(earnings)for a 30 year old and a 50 year old. What is the percentage difference between these two? (d)The F-statistic for the hypothesis that both slopes and intercepts are the same is 124.43. Can you reject the null hypothesis? (e)What other functional forms should you consider? Do you sense a problem? (c)You decide, given your knowledge of age-earning profiles, to allow the regression line to differ for the below and above 40 years age category. Accordingly you create a binary variable, Dage, that takes the value one for age 39 and below, and is zero otherwise. Estimating the earnings equation results in the following output (using heteroskedasticity-robust standard errors): 6.92-3.13\times Dage -0.019\times Age +0.085\times( Dage \times Age ),R2=0.20, SER =0.721. (38.33)(0.22)(0.004)(0.005) Sketch both regression lines: one for the age category 39 years and under, and one for 40 and above. Does it make sense to have a negative sign on the Age coefficient? Predict the ln(earnings)for a 30 year old and a 50 year old. What is the percentage difference between these two? (d)The F-statistic for the hypothesis that both slopes and intercepts are the same is 124.43. Can you reject the null hypothesis? (e)What other functional forms should you consider?

(Essay)
4.9/5
(36)

In nonlinear models, the expected change in the dependent variable for a change in one of the explanatory variables is given by

(Multiple Choice)
4.7/5
(36)

Consider the population regression of log earnings [Yi, where Yi = ln(Earningsi)] against two binary variables: whether a worker is married (D1i, where D1i=1 if the ith person is married)and the worker's gender (D2i, where D2i=1 if the ith person is female), and the product of the two binary variables Yi = β0 + β1D1i + β2D2i + β3(D1i×D2i)+ ui. The interaction term

(Multiple Choice)
4.7/5
(26)

To investigate whether or not there is discrimination against a sub-group of individuals, you regress the log of earnings on determining variables, such as education, work experience, etc., and a binary variable which takes on the value of one for individuals in that sub-group and is zero otherwise. You consider two possible specifications. First you run two separate regressions, one for the observations that include the sub-group and one for the others. Second, you run a single regression, but allow for a binary variable to appear in the regression. Your professor suggests that the second equation is better for the task at hand, as long as you allow for a shift in both the intercept and the slopes. Explain her reasoning.

(Essay)
4.8/5
(37)

Show that for the following regression model Yt = eβ0+β1×t+u{ } _ { e } \beta _ { 0 } + \beta _ { 1 \times t + u } where t is a time trend, which takes on the values 1, 2, …,T, β1 represents the instantaneous ("continuous compounding")growth rate. Show how this rate is related to the proportionate rate of growth, which is calculated from the relationship Yt = Y0 × (1 + g)t when time is measured in discrete intervals.

(Essay)
4.9/5
(32)

Females, it is said, make 70 cents to the dollar in the United States. To investigate this phenomenon, you collect data on weekly earnings from 1,744 individuals, 850 females and 894 males. Next, you calculate their average weekly earnings and find that the females in your sample earned $346.98, while the males made $517.70. (a)Calculate the female earnings in percent of the male earnings. How would you test whether or not this difference is statistically significant? Give two approaches. (b)A peer suggests that this is consistent with the idea that there is discrimination against females in the labor market. What is your response? (c)You recall from your textbook that additional years of experience are supposed to result in higher earnings. You reason that this is because experience is related to "on the job training." One frequently used measure for (potential)experience is "Age-Education-6." Explain the underlying rationale. Assuming, heroically, that education is constant across the 1,744 individuals, you consider regressing earnings on age and a binary variable for gender. You estimate two specifications initially: =323.70+5.15\times Age -169.78\times Female ,R2=0.13,SER=274.75 (21.18)(0.55)(13.06) =5.44+0.015\times Age -0.421\times Female, =0.17, SER =0.75 (0.08)(0.002) (0.036) where Earn are weekly earnings in dollars, Age is measured in years, and Female is a binary variable, which takes on the value of one if the individual is a female and is zero otherwise. Interpret each regression carefully. For a given age, how much less do females earn on average? Should you choose the second specification on grounds of the higher regression R2? (d)Your peer points out to you that age-earning profiles typically take on an inverted U-shape. To test this idea, you add the square of age to your log-linear regression. 3.04+0.147\times Age -0.421\times Female -0.0016, (0.18)(0.009)(0.033)(0.0001) R2=0.28,SER=0.68R ^ { 2 } = 0.28 , S E R = 0.68 Interpret the results again. Are there strong reasons to assume that this specification is superior to the previous one? Why is the increase of the Age coefficient so large relative to its value in (c)? (e)What other factors may play a role in earnings determination?

(Essay)
4.8/5
(29)

Indicate whether or not you can linearize the regression functions below so that OLS estimation methods can be applied: (a)Yi = eβ0+β1Xi+ui{ } _ { e } \beta _ { 0 } + \beta _ { 1 } X _ { i } + u _ { i } (b)Yi = β1\beta _ { 1 } X1iβ1X _ { 1 i } ^ { \beta _ { 1 } } X2iβ2X _ { 2 i } ^ { \beta _ { 2 } } + ui

(Essay)
4.8/5
(27)

The binary variable interaction regression

(Multiple Choice)
4.9/5
(34)

The following are properties of the logarithm function with the exception of

(Multiple Choice)
4.8/5
(27)

Consider a typical beta convergence regression function from macroeconomics, where the growth of a country's per capita income is regressed on the initial level of per capita income and various other economic and socio-economic variables. Assume that two of these variables are the average number of years of education in the specific country and a binary variable which indicates whether or not the country experienced a significant number of years of civil war/unrest. Explain why it would make sense to have these two variables enter separately and also why you should use an interaction term. What signs would you expect on the three coefficients?

(Essay)
4.8/5
(38)

You have been asked by your younger sister to help her with a science fair project. During the previous years she already studied why objects float and there also was the inevitable volcano project. Having learned regression techniques recently, you suggest that she investigate the weight-height relationship of 4th to 6th graders. Her presentation topic will be to explain how people at carnivals predict weight. You collect data for roughly 100 boys and girls between the ages of nine and twelve and estimate for her the following relationship: =45.59+4.32\times,=0.55, SER =15.69 (3.81)(0.46) where Weight is in pounds, and Height4 is inches above 4 feet. (a)Interpret the results. (b)You remember from the medical literature that females in the adult population are, on average, shorter than males and weigh less. You also seem to have heard that females, controlling for height, are supposed to weigh less than males. To see if this relationship holds for children, you add a binary variable (DFY)that takes on the value one for girls and is zero otherwise. You estimate the following regression function: =36.27+17.33\times DFY +5.32\times Height 4-1.83\times( DFY \times Height 4) (5.99)(7.36)(0.80)(0.90) R2=0.58,SER=15.41R ^ { 2 } =0.58 , S E R = 15.41 Are the signs on the new coefficients as expected? Are the new coefficients individually statistically significant? Write down and sketch the regression function for boys and girls separately. (c)The medical literature provides you with the following information for median height and weight of nine- to twelve-year-olds: Median Height and Weight for Children, Age 9-12 Boys' Weight Boys' Height Girls' Weight Girls' Height 9-year-old 60 52 60 49 10-year-old 70 54 70 52 11-year-old 77 56 80 57 12-year-old 87 58.5 92 60 Insert two height/weight measures each for boys and girls and see how accurate your predictions are. (d)The F-statistic for testing that the intercept and slope for boys and girls are identical is 2.92. Find the critical values at the 5% and 1% level, and make a decision. Allowing for a different intercept with an identical slope results in a t-statistic for DFY of (-0.35). Having identical intercepts but different slopes gives a t-statistic on (DFYHeight4)of (-0.35)also. Does this affect your previous conclusion? (e)Assume that you also wanted to test if the relationship changes by age. Briefly outline how you would specify the regression including the gender binary variable and an age binary variable (Older)that takes on a value of one for eleven to twelve year olds and is zero otherwise. Indicate in a table of two rows and two columns how the estimated relationship would vary between younger girls, older girls, younger boys, and older boys.

(Essay)
4.9/5
(34)
Showing 1 - 20 of 63
close modal

Filters

  • Essay(0)
  • Multiple Choice(0)
  • Short Answer(0)
  • True False(0)
  • Matching(0)