Deck 7: Basic Methods for Establishing Causal Inference

Question
A control variable is a variable included in a regression equation whose purpose is to:

A) alleviate an endogeneity problem.
B) improve the fit of the model.
C) provide a placebo test.
D) ensure the sample is representative.
Question
It is often necessary to refer to a base group in regression analysis. What does the base group denote?

A) The preferred selection of control variables to have in the regression equation.
B) The set of homoscedastic errors.
C) The excluded dummy variable among a set of dummy variables representing a categorical, ordinal or interval variable.
D) The control group in a randomized control trial.
Question
What are the two primary criteria for identifying "good" controls?

A) They are likely correlated with the treatment and they do not influence the outcome per se.
B) They are likely uncorrelated with the treatment and they influence the outcome.
C) They are likely correlated with the treatment and do not influence the outcome.
D) They are likely correlated with treatment and they influence the outcome.
Question
What condition best describes the endogeneity problem?

A) The variance of the errors (Ui) depends on Xi.
B) Some variables within Xi are perfectly correlated with other variables in Xi.
C) The distribution of the errors (Ui) is non-normal.
D) One of the Xi variables is correlated with the error term (Ui).
Question
In determining the causal effect of Price on Sales, suppose advertising spend is a good control variable. How will the error terms from the two regressions, Salesi = β0 + β1Pricei + Ui and Salesi = α0 + α1Pricei + α2Advertisingi + Vi, be correlated with price?

A) Cor(Ui, Vi) = 0
B) Cor(Ui, Pricei) = 0, Cor(Pricei, Vi) = 0
C) Cor(Ui, Pricei) ≠ 0, Cor(Pricei, Vi) = 0
D) Cor(Ui, Pricei) = 0, Cor(Pricei, Vi) ≠ 0
Question
Sometimes including an independent variable in a regression serves as a "data sanity check," insofar as it facilitates a:

A) more efficient estimate of the treatment effect in question.
B) less biased estimate of the treatment effect.
C) comparison between the estimated coefficient for that variable and the value for that coefficient as predicted by theory.
D) secondary measure of the standard error of the treatment effect of interest.
Question
A selected sample is one that is:

A) non-normal.
B) nonrandom.
C) distributed according to the Student t distribution.
D) efficient.
Question
Suppose you are deciding whether to include average tenure at the company as a control variable when estimating whether West Coast store locations are outperforming store locations in the rest of the country (i.e., the regression StoreProfitsi = β0 + β1West Coasti + β2Avg.Tenurei + Ui). Which of the following conditions needs to be true for the estimate of β1 to be the same regardless of whether Avg. Tenure is used as a control?

A) The p-value on average tenure is low.
B) The p-value on average tenure is above 0.10.
C) Average tenure and West Coast are uncorrelated.
D) The semi-partial correlation of Store Profits on West Coast holding Avg. Tenure fixed is zero.
Question
A dummy variable is:

A) a variable with a low t-stat and high p-value.
B) a placeholder for a variable to be gathered later.
C) a dichotomous variable that is used to indicate the presence or absence of a characteristic.
D) an unbiased estimate of the true variable.
Question
In attempting to estimate the causal effect of employee hours worked on total units produced, you include the age of the production equipment as a control variable. Which of the following conditions would not be a good reason to include this control variable?

A) Older production equipment often requires more employees to work on it.
B) Older production equipment is often less productive (i.e., produces fewer units on average).
C) The age distribution of the plants is highly non-normal.
D) The semi-partial correlation of total units produced on age of the machine holding number of employees fixed is positive and statistically significant.
Question
Suppose you are estimating the following model: Yi = β0 + β1Xi + Ui. Suppose also that you only observe values of Y that are above 50. What is the consequence of this selection on the values of Y?

A) Your estimate for β1 will be biased.
B) Your model will suffer from the heteroscedasticity problem.
C) Your estimate for β1 will be biased and your model will suffer from the heteroscedasticity problem.
D) None of the answers is correct.
Question
A representative sample is one whose:

A) distribution is approximately normal as it gets larger.
B) sample is drawn randomly.
C) distribution approximately matches that of the population, for a subset of observed, independent variables.
D) distribution exactly matches the population distribution for the outcome variable.
Question
An ordinal variable is one that contains categories that:

A) have an obvious order, but the difference in values is not meaningful.
B) do not have an obvious ordering.
C) have an obvious ordering, and the difference in values is meaningful.
D) None of the answers is correct.
Question
Suppose you know that the determining function for restaurant sales has the following form: WeeklySalesi = β0 + β1WeeklyPromoSpendi + β2HolidayWeeki + Ui. Suppose further that your client tells you she spends $1,000 per week on promotions, except during the summer, when she doubles it. Which statement best describes the role of "summer weeks" in estimating the causal effect of weekly promotion spend using the determining function above?

A) If summer weeks are contained in Ui (i.e., they affect Sales), then there is an endogeneity problem.
B) If summer weeks lead to more promotion and don't affect sales, then there is an endogeneity problem.
C) There is no potential endogeneity problem with the determining function above.
D) Because weekly promotions are unrelated to whether or not it's a holiday week, there is no endogeneity problem.
Question
Irrelevant variables are ones that:

A) do not affect the outcome.
B) affect the outcome but only in the population.
C) affect the outcome in the sample.
D) have variables with low p-values.
Question
Which step is not involved with constructing a representative sample?

A) Choose the independent variables you want to match.
B) Collect the sample by randomly sampling from each stratum (defined by your pre-chosen independent variables).
C) Use information about the population to stratify (categorize) each of the chosen variables.
D) Run a t-test to check if the outcomes are different across strata.
Question
The presence of a confounding factor will lead to failure of which of the critical assumptions used to justify claiming your estimates consistently estimate a causal effect?

A) The determining function is linear in parameters.
B) The sample is a random sample from the population.
C) The errors will be homoscedastic.
D) The model's error terms and the treatments will all be uncorrelated.
Question
Assume you are trying to determine the true effect of years of education on career earnings (Earningsi = β0 + β1Educationi + Ui), and that this effect is positive. What correlation between Education and Ui would likely be induced by a sample that contains only individuals who earned over $85,000?

A) Negative
B) Positive
C) There is not enough information.
D) None of the answers is correct.
Question
Suppose you are estimating the following model: Yi = β0 + β1Xi + Ui. You believe the variance of the unobserved factors (U) varies with X. If this is true, what is the consequence?

A) Your estimate for β1 will be biased.
B) Your estimate of β0 will be biased.
C) Your estimate for β0 and β1 will be biased.
D) None of the above answers is correct.
Question
Suppose you have estimated the following regression: Refrigerator Salesi = β0 + β1Pricei + Ui. However, when presenting your results, someone in the audience claims your estimate is biased because your sample only contains sales figures for refrigerators that are priced over $300. Their point is:

A) correct. You have a selected sample on the Y variable, so your estimate is biased.
B) correct. You have a selected sample on the X variable, so your estimate is biased.
C) incorrect. You have a selected sample on Y variable, but this does not bias your estimate.
D) incorrect. You have a selected sample on X variable, but this does not bias your estimate.
Question
Does the interpretation on the estimated coefficient on the treatment change if you use a proxy variable to control for a confounding factor?

A) No, the estimated coefficient is a consistent estimate of the causal effect.
B) No, but instead of getting causal estimate you're only estimating partial correlations.
C) Yes, the coefficient on the treatment is biased.
D) Yes, the coefficient on the treatment suffers from the simultaneity bias.
Question
The successful use of a proxy variable to control for a confounding factor will allow you to accomplish all of the following except:

A) consistently estimate the treatment effect of interest.
B) limit the endogeneity problem associated with the confounding factor.
C) uncover the size of the semi-partial correlation of the confounding factor and the outcome.
D) conduct appropriate hypothesis testing for the treatment effect of interest.
Question
In estimating the effect of price on sales, what is likely to be a confounding factor that one would at best have only a proxy variable for?

A) Price
B) Sales
C) Month
D) Brand awareness
Question
Dropping irrelevant variables from a regression equation might provide a better regression in what sense?

A) Only when the irrelevant variables are uncorrelated with the treatment(s).
B) Might facilitate more efficient/precise estimates of the treatment effect (smaller standard errors).
C) Only if the irrelevant variables have low p-values.
D) Only if the irrelevant variables have many outliers.
Question
Which of the following variables might be a proxy variable for the confounding factor of cognitive ability in a sample of workers and their ability to generate sales?

A) Test scores
B) Past sales records
C) Tenure at the company
D) Age
Question
The determining function that drives movie ticket sales is given by the following equation: Salesit = α0 + α1 HolidaySeasonit + α2CastAwarenessit + Uit, where the unit of observation is a particular movie (i) in month (t). Suppose one wanted to use cumulative past movie appearances of the entire cast of a movie as a proxy for CastAwareness. Which of the following conditions would not be one required for this to be an adequate proxy variable?

A) CumulativePastMovieAppearances to be correlated with CastAwareness.
B) HolidaySeason, CastAwareness, and CumulativePastMovieAppearances to be uncorrelated with "other factors" (Uit).
C) CastAwareness to be uncorrelated with HolidaySeason.
D) HolidaySeason and CumulativePastMovieAppearances to be uncorrelated with "other factors" affecting CastAwareness.
Question
The use of a proxy variable changes how you must interpret which of the following statistics?

A) R-squared
B) P-value of coefficient on treatment
C) Standard errors on treatment
D) None of the answers is correct.
Question
Suppose that, in an attempt to estimate how listing with a real estate agent affects a house's selling price, you estimate the following regression: Pricei = 10.2 (3.2) + 2.3 (0.8) Agenti − 2.9 (0.3) Distance to Downtowni, where Distance to Downtown is a proxy variable used to control for the desirability of the location, and the standard error of each coefficient is reported in parentheses. How should we interpret the regression results for the coefficient on Agent?

A) The unconditional correlation between housing prices and house sales with a listing agent is positive.
B) The unconditional correlation between whether a listing is listed with an agent and how far it is from downtown is negative.
C) Listings with an agent sell for higher prices holding fixed the desirability of the location of the listing.
D) None of the answers is correct.
Question
Using a proxy variable can be appropriate in all of the following settings except estimating a:

A) multiple linear regression.
B) regression with both treatment and control variables.
C) regression where the outcome is measured with error.
D) regression where the proxy is uncorrelated with confounding factor causing the endogeneity problem.
Question
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array}{|l|r|r|} \hline
 & \text{Coefficients} & \text{Standard Error} \\ \hline
\text{Intercept} & 16310.0 & 4114.5 \\ \hline
\text{Number of Bedrooms} & 7295.3 & 1399. \\ \hline
\text{Number of Bathrooms} & 23473.0 & 4032.0 \\ \hline
\end{array}
r-squared = 0.302, adjusted r-squared = 0.299
If we assume that the proper model to predict the market value of houses is given by this regression, and we also happen to know that number of bathrooms and number of bedrooms are uncorrelated both in the sample and in the target population of house sales, why might we still want to include number of bathrooms in a regression to identify the causal effect of number of bedrooms on home prices?

A) It lowers the adjusted r-squared.
B) It increases the r-squared value.
C) It provides a sanity check on our regression.
D) It also leads to more consistent estimates of the treatment effect.
Question
The determining function that drives share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is particular month (t). Suppose one wanted to use the national unemployment rate (unemploymentt) as a proxy for EconomicClimate. Which of the following describes a condition required to hold for this to be an adequate proxy variable?

A) α2 < 0
B) StartingSalary, EconomicClimate, and Unemployment rate to be uncorrelated with "other factors" (Ut)
C) α2 > 0
D) EconomicClimate is correlated with StartingSalary
Question
Your first candidate for a regression to identify the effect of X1 on Y is: Yi = β0 + β1X1i + β2X2i + Ui, where X2 is a control variable. Suppose a member of your consulting team suggests to you that another variable Ri should be a control that is also included in your regression, in order for you not to be worried about the endogeneity problem. What condition could you credibly test using data on Y, X1, X2, and R which could justify the inclusion of Ri as a control in your regression?

A) Ri has no effect per se on Yi.
B) Ri has a strong correlation with X2.
C) Ri is exogenous.
D) spcorr(Ri, Yi (Xi1, Xi2)) ≠ 0
Question
If you are using number of competitors as a control variable and the local unemployment rate as a proxy for economic climate in a regression to estimate the effect of price on sales, all the following variables will show up in the regression except for:

A) the number of competitors.
B) price.
C) economic climate.
D) sales.
Question
The determining function that drives share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is a particular month (t). Suppose one wanted to use the national unemployment rate (unemploymentt) as a proxy for EconomicClimate (i.e., ran the above regression replacing EconomicClimate with the UnemploymentRate). How should we interpret the estimated coefficient on the UnemploymentRate (i.e., the proxy variable)?

A) A good estimate for α2 in the above determining function.
B) A good estimate for α1 in the above determining function.
C) A good estimate for α1 + α2 in the above determining function.
D) None of the answers is correct.
Question
A proxy variable is a variable used in a regression equation in order to:

A) report the r-squared value appropriately.
B) make the adjustment for the adjusted r-squared.
C) proxy for a confounding factor in an attempt to alleviate the endogeneity problem.
D) improve the standard errors on the treatment effect.
Question
In estimating the difference in earnings among Ivy League graduates, non-Ivy League college graduates, and non-college graduates, the following regression is run: Earningsi = β0 + β1IvyLeaguei + β2NonIvyCollegeGraduatei + Ui. What does the coefficient β1 represent?

A) The additional increase (decrease) in earnings from going from graduating college to graduating from an Ivy League school.
B) The average earnings of an Ivy League graduate.
C) The additional increase (decrease) in earnings from going from no college degree to an Ivy League degree.
D) The average earnings of an Ivy League graduate conditional on graduating from any college.
Question
Which of the following correlations would be an acceptable way to show that the variable (Pi) is a suitable proxy for Ai in the regression Yi = β0 + β1Ti + β2Ai + Ui?

A) Corr(Yi, Ti) ≠ 0
B) spCorr(Yi, Ti (Pi)) ≠ 0
C) pCorr(Yi, Pi; Ti) ≠ 0
D) None of the answers is correct.
Question
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array}{|l|r|r|} \hline
 & \text{Coefficients} & \text{Standard Error} \\ \hline
\text{Intercept} & 16310.0 & 4114.5 \\ \hline
\text{Number of Bedrooms} & 7295.3 & 1399. \\ \hline
\text{Number of Bathrooms} & 23473.0 & 4032.0 \\ \hline
\end{array}
r-squared = 0.302, adjusted r-squared = 0.299
If we assume that the proper model to predict the market value of houses is given by this regression, and thus we are getting unbiased estimates of the true relationships between number of bedrooms/bathrooms and sales price, what is the effect on sales price of increasing the number of bathrooms in a house by one, holding number of bedrooms fixed?

A) Approximately $16,310.
B) Approximately $7,295 + $23,473 = $30,768.
C) Approximately, $23,473.
D) Approximately, an increase of 30%.
Question
The determining function that drives share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is particular month (t). Suppose one wanted to use the national unemployment rate (unemploymentt) as a proxy for EconomicClimate. Which of the following would be a condition that would rule out the unemployment rate being a good proxy for economic climate?

A) α2 < 0
B) Unemployment rate to be correlated with "other factors" (Ut)
C) α2 > 0
D) EconomicClimate is correlated with StartingSalary
Question
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array}{|l|r|r|} \hline
 & \text{Coefficients} & \text{Standard Error} \\ \hline
\text{Intercept} & 16310.0 & 4114.5 \\ \hline
\text{Number of Bedrooms} & 7295.3 & 1399. \\ \hline
\text{Number of Bathrooms} & 23473.0 & 4032.0 \\ \hline
\end{array}
Given these results, which additional condition would be sufficient to ensure number of bathrooms satisfies the "primary criteria" for a good control variable in attempting to identify the causal effect of number of bedrooms on house prices?

A) The number of bathrooms is correlated with house prices.
B) The number of bedrooms is correlated with house prices.
C) The number of bathrooms is correlated with the number of bedrooms.
D) The p-value for the number of bedrooms.
Question
Can we estimate the following equation using standard linear regression techniques: Yi = β0 + β1Xi + β2Xi^2 + Ui?

A) No, this is not linear in X.
B) No, we cannot isolate the treatment.
C) Yes, this is linear in X.
D) Yes, this is linear in our parameters.
Question
Suppose you've run a regression relating log(Output) to log(Worker Hours) in Excel. You are willing to make the necessary assumptions to deduce causality and run hypothesis tests. Your results are as follows:
\begin{array}{lcccc}
 & \text{Coefficients} & \text{Standard Error} & t \text{ Stat} & P\text{-value} \\ \hline
\text{Intercept} & 17.4583044 & 28.47075584 & 0.613201297 & 0.541240031 \\
\text{Log(Worker Hours)} & 2.699483287 & 0.711729096 & 3.792852225 & 0.000264681
\end{array}
How should you interpret the coefficient on Log(Worker Hours) of 2.69?

A) A 1% increase in Worker Hours leads to a 2.69% increase in Output.
B) A 1 unit increase in Worker Hours leads to a 2.69% increase in Output.
C) A 1% increase in Worker Hours leads to a .0269% increase in Output.
D) A 2.69% increase in Worker Hours leads to a 1% increase in Output.
Question
Suppose you've regressed profits on price, assuming a quadratic functional form. Your regression equation is: Profitsi = β0 + β1Pricei + β2Pricei^2 + Ui. What is the marginal effect of price in this equation?

A) β0 + β1 + β2
B) β1 + β2 × Pricei
C) β0 + β1 + 2β2
D) β1 + 2β2 × Pricei
Question
Suppose the fitted (assumed causal) regression line for your data is as follows: Log(Output) = 6.2 + 0.4 × Log(Labor) + 0.5 × Log(Capital)
Interpret the coefficient on Log(Labor).

A) Holding capital fixed, when labor goes up by 1, output goes up by 0.4
B) Holding capital fixed, when labor goes up by 1, output goes up by 6.6
C) Holding capital fixed, when labor goes up by 1%, output goes up by 0.4%
D) Holding capital fixed, when labor goes up by 1%, output goes up by 40 units
Question
Which of the following regressions would yield a coefficient estimate that would be directly interpreted as the price elasticity of demand?

A) Quantityi = α0 + α1log(Pricei)
B) log(Quantityi) = α0 + α1log(Pricei)
C) Quantityi = α0 + α1Pricei
D) Quantityi = α0 - α1log(Pricei)
Question
The percentage change in one variable with a percentage change in another is known as a(n):

A) percentage point change.
B) log-level change.
C) elasticity.
D) linear effect.
Question
Suppose the relationship between temperature and demand for electricity is non-linear, due to the high demand for electricity when it is both very cold and very hot outside. What is the risk of estimating the effect of price on demand for electricity by the regression equation: Electricity Consumptiont = α0 + α1Pricet + α2Temperaturet + Ut?

A) The estimate of α1 will be biased.
B) The sample will not be representative.
C) Electricity consumption will be censored from above.
D) None of the answers is correct.
Question
Which of the following cannot be estimated using traditional linear regression techniques?

A) log(Y) = β0 + β1log(X) + Ui
B) log(Y) = β0 + β1X + Ui
C) Y = β0 + β1log(X) + Ui
D) None of the answers is correct.
Question
Why is the use of polynomial functional forms typical in trying to estimate non-linear functional forms?

A) They yield more efficient estimates than other non-linear functional forms.
B) They offer a high degree of flexibility.
C) They provide coefficients that have elasticities as interpretations.
D) They have standard errors that are easier to interpret.
1
A control variable is a variable included in a regression equation whose purpose is to:

A) alleviate an endogeneity problem.
B) improve the fit of the model.
C) provide a placebo test.
D) ensure the sample is representative.
A
2
It is often necessary to refer to a base group in regression analysis. What does the base group denote?

A) The preferred selection of control variables to have in the regression equation.
B) The set of homoscedastic errors.
C) The excluded dummy variable among a set of dummy variables representing a categorical, ordinal or interval variable.
D) The control group in a randomized control trial.
C
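To illustrate the base-group idea, here is a minimal sketch using only the Python standard library. The earnings figures and the three education groups are made-up illustration data, and `ols` is a hypothetical helper written for this example, not something from the deck. The "none" category is the excluded dummy, so the intercept picks up its mean and each remaining coefficient is a difference relative to it.

```python
def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry into the pivot row.
        pivot_row = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Hypothetical data: earnings (in thousands) for three education groups.
groups = ["none", "none", "college", "college", "ivy", "ivy"]
earnings = [30, 34, 50, 54, 70, 74]

# "none" is the base group: it gets no dummy of its own.
X = [[1, int(g == "college"), int(g == "ivy")] for g in groups]
coef = ols(X, earnings)

print("intercept (base-group mean):", round(coef[0], 6))
print("college vs. base group:", round(coef[1], 6))
print("ivy vs. base group:", round(coef[2], 6))
```

Including a dummy for every category alongside the intercept would make the columns perfectly collinear, which is why one category has to be left out.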
3
What are the two primary criteria for identifying "good" controls?

A) They are likely correlated with the treatment and they do not influence the outcome per se.
B) They are likely uncorrelated with the treatment and they influence the outcome.
C) They are likely correlated with the treatment and do not influence the outcome.
D) They are likely correlated with treatment and they influence the outcome.
D
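The two criteria can be seen in a small simulation. This is a sketch under assumed numbers: the data-generating process below (a true price effect of -2, an advertising effect of 3, and price rising with advertising) is invented for illustration, and `ols` is a hypothetical helper, not part of the deck. Because advertising is correlated with the treatment and also affects the outcome, leaving it out pushes the price coefficient away from -2, while including it brings the estimate back.

```python
import random

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, solved by Gauss-Jordan."""
    k = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot_row = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

random.seed(0)
n = 5000

# Assumed data-generating process (illustrative only).
advertising = [random.gauss(5, 2) for _ in range(n)]
price = [10 + 0.5 * a + random.gauss(0, 1) for a in advertising]  # correlated with treatment
sales = [100 - 2 * p + 3 * a + random.gauss(0, 1)                 # and affects the outcome
         for p, a in zip(price, advertising)]

# Short regression omits the control; long regression includes it.
short_fit = ols([[1, p] for p in price], sales)
long_fit = ols([[1, p, a] for p, a in zip(price, advertising)], sales)

print("price coefficient without control:", round(short_fit[1], 3))
print("price coefficient with control:   ", round(long_fit[1], 3))
```

If the `0.5 * a` term is dropped from the price equation, so that advertising no longer moves with price, the two price coefficients should roughly agree, which is the situation the tenure question in this deck describes.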
4
What condition best describes the endogeneity problem?

A) The variance of the errors (Ui) depends on Xi.
B) Some variables within Xi are perfectly correlated with other variables in Xi.
C) The distribution of the errors (Ui) is non-normal.
D) One of the Xi variables is correlated with the error term (Ui).
5
In determining the causal effect of Price on Sales, suppose advertising spend is a good control variable. How will the error terms from the two regressions, Salesi = β0 + β1Pricei + Ui and Salesi = α0 + α1Pricei + α2Advertisingi + Vi, be correlated with price?

A) Cor(Ui, Vi) = 0
B) Cor(Ui, Pricei) = 0, Cor(Pricei, Vi) = 0
C) Cor(Ui, Pricei) ≠ 0, Cor(Pricei, Vi) = 0
D) Cor(Ui, Pricei) = 0, Cor(Pricei, Vi) ≠ 0
6
Sometimes including an independent variable in a regression serves as a "data sanity check," insofar as it facilitates a:

A) more efficient estimate of the treatment effect in question.
B) less biased estimate of the treatment effect.
C) comparison between the estimated coefficient for that variable and the value for that coefficient as predicted by theory.
D) secondary measure of the standard error of the treatment effect of interest.
7
A selected sample is one that is:

A) non-normal.
B) nonrandom.
C) distributed according to the Student t distribution.
D) efficient.
8
Suppose you are deciding whether to include average tenure at the company as a control variable when estimating whether West Coast store locations are outperforming store locations in the rest of the country (i.e., the regression StoreProfitsi = β0 + β1West Coasti + β2Avg.Tenurei + Ui). Which of the following conditions needs to be true for the estimate of β1 to be the same regardless of whether Avg. Tenure is used as a control?

A) The p-value on average tenure is low.
B) The p-value on average tenure is above 0.10.
C) Average tenure and West Coast are uncorrelated.
D) The semi-partial correlation of Store Profits on West Coast holding Avg. Tenure fixed is zero.
9
A dummy variable is:

A) a variable with a low t-stat and high p-value.
B) a placeholder for a variable to be gathered later.
C) a dichotomous variable that is used to indicate the presence or absence of a characteristic.
D) an unbiased estimate of the true variable.
10
In attempting to estimate the causal effect of employee hours worked on total units produced, you include the age of the production equipment as a control variable. Which of the following conditions would not be a good reason to include this control variable?

A) Older production equipment often requires more employees to work on it.
B) Older production equipment is often less productive (i.e., produces fewer units on average).
C) The age distribution of the plants is highly non-normal.
D) The semi-partial correlation of total units produced on age of the machine holding number of employees fixed is positive and statistically significant.
11
Suppose you are estimating the following model: Yi = β0 + β1Xi + Ui. Suppose also that you only observe values of Y that are above 50. What is the consequence of this selection on the values of Y?

A) Your estimate for β1 will be biased.
B) Your model will suffer from the heteroscedasticity problem.
C) Your estimate for β1 will be biased and your model will suffer from the heteroscedasticity problem.
D) None of the answers is correct.
12
A representative sample is one whose:

A) distribution is approximately normal as it gets larger.
B) sample is drawn randomly.
C) distribution approximately matches that of the population, for a subset of observed, independent variables.
D) distribution exactly matches the population distribution for the outcome variable.
13
An ordinal variable is one that contains categories that:

A) have an obvious order, but the difference in values is not meaningful.
B) do not have an obvious ordering.
C) have an obvious ordering, and the difference in values is meaningful.
D) None of the answers is correct.
14
Suppose you know that the determining function for restaurant sales has the following form: WeeklySalesi = β0 + β1WeeklyPromoSpendi + β2HolidayWeeki + Ui. Suppose further that your client tells you she spends $1,000 per week on promotions, except during the summer, when she doubles it. Which statement best describes the role of "summer weeks" in estimating the causal effect of weekly promotion spend using the determining function above?

A) If summer weeks are contained in Ui (i.e., they affect Sales), then there is an endogeneity problem.
B) If summer weeks lead to more promotion and don't affect sales, then there is an endogeneity problem.
C) There is no potential endogeneity problem with the determining function above.
D) Because weekly promotions are unrelated to whether or not it's a holiday week, there is no endogeneity problem.
15
Irrelevant variables are ones that:

A) do not affect the outcome.
B) affect the outcome but only in the population.
C) affect the outcome in the sample.
D) have variables with low p-values.
16
Which step is not involved with constructing a representative sample?

A) Choose the independent variables you want to match.
B) Collect the sample by randomly sampling from each stratum (defined by your pre-chosen independent variables).
C) Use information about the population to stratify (categorize) each of the chosen variables.
D) Run a t-test to check if the outcomes are different across strata.
17
The presence of a confounding factor will lead to failure of which of the critical assumptions used to justify claiming your estimates consistently estimate a causal effect?

A) The determining function is linear in parameters.
B) The sample is a random sample from the population.
C) The errors will be homoscedastic.
D) The model's error terms and the treatments will all be uncorrelated.
18
Assume that you are trying to determine the true effect of years of education on career earnings (Earningsi = β0 + β1Educationi + Ui), and that this effect is positive. What correlation between Education and Ui would likely be induced by a sample that contains only individuals who earn over $85,000?

A) Negative
B) Positive
C) There is not enough information.
D) None of the answers is correct.
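A small simulation sketch of this kind of sample selection. All numbers are assumptions (earnings in thousands of dollars, an education effect of 5, normally distributed errors); only the $85,000 cutoff comes from the question. It compares the correlation between Education and the error term in the full sample with the correlation among those earning above the cutoff.

```python
import random

def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
people = []
for _ in range(20000):
    education = rng.randint(10, 20)      # years of education
    u = rng.gauss(0, 15)                 # unobserved factors, independent of education
    earnings = 20 + 5 * education + u    # hypothetical model, $ thousands
    people.append((education, u, earnings))

# Keep only individuals earning more than $85,000 (selection on the outcome).
selected = [p for p in people if p[2] > 85]

corr_full = correlation([p[0] for p in people], [p[1] for p in people])
corr_selected = correlation([p[0] for p in selected], [p[1] for p in selected])
print(round(corr_full, 3), round(corr_selected, 3))
```

In the selected sample, people with little education appear only if their unobserved factors are unusually favorable, which is what pushes the correlation away from zero.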
19
Suppose you are estimating the following model: Yi = β0 + β1Xi + Ui. You believe the variance of the unobserved factors (U) varies with X. If this is true, what is the consequence?

A) Your estimate for β1 will be biased.
B) Your estimate of β0 will be biased.
C) Your estimate for β0 and β1 will be biased.
D) None of the above answers is correct.
20
Suppose you have estimated the following regression: Refrigerator Salesi = β0 + β1Pricei + Ui. However, when presenting your results, someone in the audience claims your estimate is biased because your sample only contains sales figures for refrigerators that are priced over $300. Their point is:

A) correct. You have a selected sample on the Y variable, so your estimate is biased.
B) correct. You have a selected sample on the X variable, so your estimate is biased.
C) incorrect. You have a selected sample on Y variable, but this does not bias your estimate.
D) incorrect. You have a selected sample on X variable, but this does not bias your estimate.
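A simulation sketch of selecting the sample on the X variable. The demand curve, noise level, and price range are made-up assumptions; only the $300 cutoff comes from the question. It estimates the price slope on the full sample and on the subsample priced over $300.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (sample cov / sample var)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

TRUE_SLOPE = -0.5  # hypothetical effect of price on sales

rng = random.Random(1)  # fixed seed so the run is repeatable
prices = [rng.uniform(100, 1000) for _ in range(20000)]
# Errors are drawn independently of price, so price is exogenous here.
sales = [600 + TRUE_SLOPE * p + rng.gauss(0, 50) for p in prices]

slope_full = ols_slope(prices, sales)

# Keep only refrigerators priced over $300 (selection on X).
kept = [(p, s) for p, s in zip(prices, sales) if p > 300]
slope_over_300 = ols_slope([p for p, _ in kept], [s for _, s in kept])
print(round(slope_full, 3), round(slope_over_300, 3))
```

Both slopes land close to the assumed true value, since dropping observations based on price alone leaves the errors unrelated to price within the retained sample.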
21
Does the interpretation of the estimated coefficient on the treatment change if you use a proxy variable to control for a confounding factor?

A) No, the estimated coefficient is a consistent estimate of the causal effect.
B) No, but instead of getting causal estimate you're only estimating partial correlations.
C) Yes, the coefficient on the treatment is biased.
D) Yes, the coefficient on the treatment suffers from the simultaneity bias.
22
The successful use of a proxy variable to control for a confounding factor will allow you to accomplish all of the following except:

A) consistently estimate the treatment effect of interest.
B) limit the endogeneity problem associated with the confounding factor.
C) uncover the size of the semi-partial correlation of the confounding factor and the outcome.
D) conduct appropriate hypothesis testing for the treatment effect of interest.
23
In estimating the effect of price on sales, what is likely to be a confounding factor that one would at best have only a proxy variable for?

A) Price
B) Sales
C) Month
D) Brand awareness
24
Dropping irrelevant variables from a regression equation might provide a better regression in what sense?

A) Only when the irrelevant variables are uncorrelated with the treatment(s).
B) Might facilitate more efficient/precise estimates of the treatment effect (smaller standard errors).
C) Only if the irrelevant variables have low p-values.
D) Only if the irrelevant variables have many outliers.
25
Which of the following variables might be a proxy variable for the confounding factor of cognitive ability in a sample of workers and their ability to generate sales?

A) Test scores
B) Past sales records
C) Tenure at the company
D) Age
26
The determining function that drives movie ticket sales is given by the following equation: Salesit = α0 + α1 HolidaySeasonit + α2CastAwarenessit + Uit, where the unit of observation is a particular movie (i) in month (t). Suppose one wanted to use cumulative past movie appearances of the entire cast of a movie as a proxy for CastAwareness. Which of the following conditions would not be one required for this to be an adequate proxy variable?

A) CumulativePastMovieAppearances to be correlated with CastAwareness.
B) HolidaySeason, CastAwareness, and CumulativePastMovieAppearances to be uncorrelated with "other factors" (Uit).
C) CastAwareness to be uncorrelated with HolidaySeason.
D) HolidaySeason and CumulativePastMovieAppearances to be uncorrelated with "other factors" affecting CastAwareness.
27
The use of a proxy variable changes how you must interpret which of the following statistics?

A) R-squared
B) P-value of coefficient on treatment
C) Standard errors on treatment
D) None of the answers is correct.
28
Suppose that, in an attempt to estimate how listing with a real estate agent affects a house's selling price, you estimate the following regression: Pricei = 10.2 (3.2) + 2.3 (0.8) Agenti - 2.9 (0.3) Distance to Downtowni, where Distance to Downtown is a proxy variable used to control for the desirability of the location, and the standard error of each coefficient is reported in parentheses. How should we interpret the regression results for the coefficient on Agent?

A) The unconditional correlation between housing prices and listing with an agent is positive.
B) The unconditional correlation between whether a listing is listed with an agent and how far it is from downtown is negative.
C) Listings with an agent sell for higher prices holding fixed the desirability of the location of the listing.
D) None of the answers is correct.
29
Using a proxy variable can be appropriate in all of the following settings except estimating a:

A) multiple linear regression.
B) regression with both treatment and control variables.
C) regression where the outcome is measured with error.
D) regression where the proxy is uncorrelated with confounding factor causing the endogeneity problem.
30
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array} { | l | r | r | } \hline & \text { Coefficients } & \text { Standard Error } \\ \hline \text { Intercept } & 16310.0 & 4114.5 \\ \hline \text { Number of Bedrooms } & 7295.3 & 1399. \\ \hline \text { Number of Bathrooms } & 23473.0 & 4032.0 \\ \hline \end{array}
r-squared = 0.302, Adjusted r-squared = 0.299
If we assume that the proper model to predict the market value of houses is given by this regression, and we also happen to know that number of bathrooms and number of bedrooms are uncorrelated both in the sample and in the target population of house sales, why might we still want to include number of bathrooms in a regression to identify the causal effect of number of bedrooms on home prices?

A) It lowers the adjusted r-squared.
B) It increases the r-squared value.
C) It provides a sanity check on our regression.
D) It also leads to more consistent estimates of the treatment effect.
31
The determining function that drives the share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is a particular month (t). Suppose one wanted to use the national unemployment rate (Unemploymentt) as a proxy for EconomicClimate. Which of the following describes a condition required to hold for this to be an adequate proxy variable?

A) α2 < 0
B) StartingSalary, EconomicClimate, and Unemployment rate to be uncorrelated with "other factors" (Ut)
C) α2 > 0
D) EconomicClimate is correlated with StartingSalary
32
Your first candidate for a regression to identify the effect of X1 on Y is: Yi = β0 + β1X1i + β2X2i + Ui, where X2 is a control variable. Suppose a member of your consulting team suggests to you that another variable Ri should be a control that is also included in your regression, in order for you not to be worried about the endogeneity problem. What condition could you credibly test using data on Y, X1, X2, and R which could justify the inclusion of Ri as a control in your regression?

A) Ri has no effect per se on Yi.
B) Ri has a strong correlation with X2.
C) Ri is exogenous.
D) spcorr(Ri, Yi (Xi1, Xi2)) ≠ 0
33
If you are using number of competitors as a control variable and the local unemployment rate as a proxy for economic climate in a regression to estimate the effect of price on sales, all the following variables will show up in the regression except for:

A) the number of competitors.
B) price.
C) economic climate.
D) sales.
34
The determining function that drives the share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is a particular month (t). Suppose one wanted to use the national unemployment rate (Unemploymentt) as a proxy for EconomicClimate (i.e., ran the above regression replacing EconomicClimate with the UnemploymentRate). How should we interpret the estimated coefficient on the UnemploymentRate (i.e., the proxy variable)?

A) A good estimate for α2 in the above determining function.
B) A good estimate for α1 in the above determining function.
C) A good estimate for α1 + α2 in the above determining function.
D) None of the answers is correct.
35
A proxy variable is a variable used in a regression equation in order to:

A) report the r-squared value appropriately.
B) make the adjustment for the adjusted r-squared.
C) proxy for a confounding factor in an attempt to alleviate the endogeneity problem.
D) improve the standard errors on the treatment effect.
36
To estimate the differences in earnings among Ivy League graduates, non-Ivy League college graduates, and non-college graduates, the following regression is run: Earningsi = β0 + β1IvyLeaguei + β2NonIvyCollegeGraduatei + Ui. What does the coefficient β1 represent?

A) The additional increase (decrease) in earnings from going from a non-Ivy League college degree to an Ivy League degree.
B) The average earnings of an Ivy League graduate.
C) The additional increase (decrease) in earnings from going from no college degree to an Ivy League degree.
D) The average earnings of an Ivy League graduate conditional on graduating from any college.
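A minimal sketch of how dummy coefficients relate to the base group, using a tiny made-up data set (earnings in thousands of dollars; the group values are assumptions). The helper `ols` is a hypothetical name for a small normal-equations solver written here so the example needs no external libraries.

```python
def ols(X, y):
    """OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting, then Gauss-Jordan elimination on this column.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical earnings for the three education groups.
groups = {
    "no_college": [30, 40, 50],   # excluded (base) group
    "non_ivy": [55, 60, 65],
    "ivy": [70, 80, 90],
}

X, y = [], []
for name, earnings in groups.items():
    for value in earnings:
        # Columns: intercept, IvyLeague dummy, NonIvyCollegeGraduate dummy.
        X.append([1.0, float(name == "ivy"), float(name == "non_ivy")])
        y.append(float(value))

b0, b1, b2 = ols(X, y)
mean = lambda values: sum(values) / len(values)
print(b0, b1, b2)
```

With these numbers the intercept matches the mean of the excluded group, and each dummy coefficient matches the gap between that group's mean and the excluded group's mean.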
37
Which of the following correlations would be an acceptable way to show that the variable (Pi) is a suitable proxy for Ai in the regression Yi = β0 + β1Ti + β2Ai + Ui?

A) Corr(Yi, Ti) ≠ 0
B) spCorr(Yi, Ti (Pi)) ≠ 0
C) pCorr(Yi, Pi; Ti) ≠ 0
D) None of the answers is correct.
38
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array} { | l | r | r | } \hline & \text { Coefficients } & \text { Standard Error } \\ \hline \text { Intercept } & 16310.0 & 4114.5 \\ \hline \text { Number of Bedrooms } & 7295.3 & 1399. \\ \hline \text { Number of Bathrooms } & 23473.0 & 4032.0 \\ \hline \end{array}
r-squared = 0.302, Adjusted r-squared = 0.299
If we assume that the proper model to predict the market value of houses is given by this regression, and thus we are getting unbiased estimates of the true relationships between number of bedrooms/bathrooms and sales price, what is the effect on sales price of increasing the number of bathrooms in a house by one, holding number of bedrooms fixed?

A) Approximately $16,310.
B) Approximately $7,295 + $23,473 = $30,768.
C) Approximately, $23,473.
D) Approximately, an increase of 30%.
39
The determining function that drives the share of accepted job offers for a company is given by the following equation: AcceptedOfferst = α0 + α1StartingSalaryt + α2EconomicClimatet + Ut, where the unit of observation is a particular month (t). Suppose one wanted to use the national unemployment rate (Unemploymentt) as a proxy for EconomicClimate. Which of the following conditions would rule out the unemployment rate being a good proxy for economic climate?

A) α2 < 0
B) Unemployment rate to be correlated with "other factors" (Ut)
C) α2 > 0
D) EconomicClimate is correlated with StartingSalary
40
Suppose you have the following regression results from a regression of home prices on house attributes for a random sample of house transactions:
\begin{array} { | l | r | r | } \hline & \text { Coefficients } & \text { Standard Error } \\ \hline \text { Intercept } & 16310.0 & 4114.5 \\ \hline \text { Number of Bedrooms } & 7295.3 & 1399. \\ \hline \text { Number of Bathrooms } & 23473.0 & 4032.0 \\ \hline \end{array}
Given these results, which additional condition would be sufficient to ensure number of bathrooms satisfies the "primary criteria" for a good control variable in attempting to identify the causal effect of number of bedrooms on house prices?

A) The number of bathrooms is correlated with house prices.
B) The number of bedrooms is correlated with house prices.
C) The number of bathrooms is correlated with the number of bedrooms.
D) The p-value for the number of bedrooms.
41
Can we estimate the following equation using standard linear regression techniques: Yi = β0 + β1Xi + β2Xi^2 + Ui?

A) No, this is not linear in X.
B) No, we cannot isolate the treatment.
C) Yes, this is linear in X.
D) Yes, this is linear in our parameters.
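A minimal sketch of fitting a quadratic equation like the one above by ordinary least squares, treating X and X^2 as two separate regressors. The data are noise-free and generated from made-up coefficients (1, 2, -0.5), and `ols` is a hypothetical helper that solves the normal equations without external libraries.

```python
def ols(X, y):
    """OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting, then Gauss-Jordan elimination on this column.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical true coefficients used to generate the data.
BETA0, BETA1, BETA2 = 1.0, 2.0, -0.5

xs = [float(i) for i in range(10)]
ys = [BETA0 + BETA1 * x + BETA2 * x ** 2 for x in xs]

# The squared term is just another column of the design matrix.
design = [[1.0, x, x ** 2] for x in xs]
b0_hat, b1_hat, b2_hat = ols(design, ys)
print(b0_hat, b1_hat, b2_hat)
```

The regression is non-linear in X but each coefficient enters additively, which is why the usual least-squares machinery applies unchanged.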
42
Suppose you've run a regression relating log(Output) to log(Worker Hours) in Excel. You are willing to make the necessary assumptions to deduce causality and run hypothesis tests. Your results are as follows:
\begin{array} { l c c c c } & \text { Coefficients } & \text { Standard Error } & t \text { Stat } & P \text {-value } \\ \hline \text { Intercept } & 17.4583044 & 28.47075584 & 0.613201297 & 0.541240031 \\ \text { Log(Worker Hours) } & 2.699483287 & 0.711729096 & 3.792852225 & 0.000264681 \end{array}
How should you interpret the coefficient on Log(Worker Hours) of 2.69?

A) A 1% increase in Worker Hours leads to a 2.69% increase in Output.
B) A 1 unit increase in Worker Hours leads to a 2.69% increase in Output.
C) A 1% increase in Worker Hours leads to a 0.0269% increase in Output.
D) A 2.69% increase in Worker Hours leads to a 1% increase in Output.
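A short arithmetic sketch of what a log-log coefficient implies, using the Log(Worker Hours) coefficient reported above and assuming it is taken at face value as causal. It computes the exact percentage change in Output implied by a 1% increase in Worker Hours.

```python
import math

coef_log_hours = 2.699483287   # coefficient on Log(Worker Hours) from the table
hours_increase = 0.01          # a 1% increase in Worker Hours

# In a log-log model, the change in log(Output) is coef * change in log(Hours).
change_in_log_output = coef_log_hours * math.log(1 + hours_increase)

# Convert the log change back to a percentage change in Output.
pct_change_output = (math.exp(change_in_log_output) - 1) * 100
print(round(pct_change_output, 2))
```

For small changes the result is close to the coefficient itself, which is why a log-log coefficient is usually read directly as a percent-for-percent effect.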
43
Suppose you've regressed profits on price, assuming a quadratic functional form. Your regression equation is: Profitsi = β0 + β1Pricei + β2Pricei^2 + Ui. What is the marginal effect of price in this equation?

A) β0 + β1 + β2
B) β1 + β2 × Pricei
C) β0 + β1 + 2β2
D) β1 + 2β2 × Pricei
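A small numerical check of the marginal effect in a quadratic specification, using made-up coefficients (100, 40, -2 are assumptions, not estimates from any data). It compares the derivative of the fitted profit function with respect to price against a finite-difference approximation.

```python
# Hypothetical coefficients for Profits = b0 + b1*Price + b2*Price^2.
B0, B1, B2 = 100.0, 40.0, -2.0

def profits(price):
    """Fitted profit at a given price under the quadratic form."""
    return B0 + B1 * price + B2 * price ** 2

def marginal_effect(price):
    """Derivative of the quadratic with respect to price."""
    return B1 + 2 * B2 * price

def finite_difference(price, step=1e-4):
    """Central-difference approximation to the same derivative."""
    return (profits(price + step) - profits(price - step)) / (2 * step)

for price in (5.0, 10.0, 15.0):
    print(price, marginal_effect(price), round(finite_difference(price), 6))
```

Unlike a purely linear model, the effect of a price change depends on the current price; with these assumed coefficients it is positive below a price of 10 and negative above it.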
44
Suppose the fitted (assumed causal) regression line for your data is as follows: Log(Output) = 6.2 + 0.4 × Log(Labor) + 0.5 × Log(Capital)
Interpret the coefficient on Log(Labor).

A) Holding capital fixed, when labor goes up by 1, output goes up by 0.4
B) Holding capital fixed, when labor goes up by 1, output goes up by 6.6
C) Holding capital fixed, when labor goes up by 1%, output goes up by 0.4%
D) Holding capital fixed, when labor goes up by 1%, output goes up by 40 units
45
Which of the following regressions would yield a coefficient estimate that would be directly interpreted as the price elasticity of demand?

A) Quantityi = α0 + α1log(Pricei)
B) log(Quantityi) = α0 + α1log(Pricei)
C) Quantityi = α0 + α1Pricei
D) Quantityi = α0 - α1log(Pricei)
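A minimal sketch relating these functional forms to elasticity. It generates noise-free data from a constant-elasticity demand curve (the scale of 1000 and the elasticity of -1.5 are assumptions) and compares the slope from a regression in logs with the slope from a regression in levels.

```python
import math

def ols_slope(x, y):
    """Slope from a simple regression of y on x (sample cov / sample var)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

TRUE_ELASTICITY = -1.5  # hypothetical price elasticity of demand

prices = [float(p) for p in range(1, 21)]
quantities = [1000 * p ** TRUE_ELASTICITY for p in prices]

# Regression of log(Quantity) on log(Price).
log_log_slope = ols_slope([math.log(p) for p in prices],
                          [math.log(q) for q in quantities])

# Regression of Quantity on Price in levels, for comparison.
level_slope = ols_slope(prices, quantities)
print(round(log_log_slope, 6), round(level_slope, 3))
```

The slope in logs reproduces the assumed elasticity, while the slope in levels is measured in units of quantity per dollar and has no percent-for-percent reading.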
46
The percentage change in one variable associated with a percentage change in another is known as a(n):

A) percentage point change.
B) log-level change.
C) elasticity.
D) linear effect.
47
Suppose the relationship between temperature and demand for electricity is non-linear, due to the high demand for electricity when it is both very cold and very hot outside. What is the risk of estimating the effect of price on demand for electricity by the regression equation: Electricity Consumptiont = α0 + α1Pricet + α2Temperaturet + Ut?

A) The estimate of α1 will be biased.
B) The sample will not be representative.
C) Electricity consumption will be censored from above.
D) None of the answers is correct.
48
Which of the following cannot be estimated using traditional linear regression techniques?

A) log(Y) = β0 + β1log(X) + Ui
B) log(Y) = β0 + β1X + Ui
C) Y = β0 + β1log(X) + Ui
D) None of the answers is correct.
49
Why is the use of polynomial functional forms typical in trying to estimate non-linear functional forms?

A) They yield more efficient estimates than other non-linear functional forms.
B) They offer a high degree of flexibility.
C) They provide coefficients that have elasticities as interpretations.
D) They have standard errors that are easier to interpret.