Exam 9: Bivariate Correlation and Regression


To learn about the accuracy of a relationship between two variables (continuous), as well as its strength and direction, the most appropriate statistical test to run is

(Multiple Choice)
Correct Answer:

D

Assume you have computed b = 6. Interpret your result in words.

(Essay)
Correct Answer:

With every unit increase in IV (x), the DV (y) increases by 6 units.

You are interested in the relationship between the visual consumption of crime shows (average per day in hours) and levels of fear of crime (on a scale of 0-30). You have asked a random sample of 14 individuals (random digit dialing) how many hours they watch TV crime shows (per week) and about their level of fear of crime on a scale from 1 to 30. You select an alpha level of 0.05. Because you are uncertain about the direction of the relationship, you choose to utilize two-tailed t values.
a. State your null and alternative hypotheses.
b. Utilizing the data from the table presented below:
i. Compute beta (b). Interpret your result.
ii. Compute the constant (a) and interpret your result.
iii. Determine the strength and direction of the relationship by computing Pearson's r. Interpret your result.
iv. Compute r² and interpret your result.
v. Compute the prediction error you could possibly have knowing nothing about the distribution of the independent variable (hours of crime shows watched).
vi. Compute the unexplained variance.
vii. Compute the explained variance.
c. Make a decision rule to determine significance.
d. Make a decision and interpret your results.
e. Compute the standard error of estimate.
Table: Cases 1-14 (the x and y values for each case are listed in the answer table).

(Essay)
Correct Answer:

a. H0: There is no statistically significant relationship between visual consumption of crime shows (average per day) and levels of fear of crime (scale 0-30; treated as a continuous variable).
H1: There is a statistically significant relationship between visual consumption of crime shows (average per day) and levels of fear of crime (scale 0-30; treated as a continuous variable).
b.
$$
\begin{array}{|r|r|r|r|r|r|}
\hline
\text{Case} & x & y & x^{2} & y^{2} & xy \\ \hline
1 & 0 & 12 & 0 & 144 & 0 \\ \hline
2 & 3 & 11 & 9 & 121 & 33 \\ \hline
3 & 2 & 11 & 4 & 121 & 22 \\ \hline
4 & 9 & 18 & 81 & 324 & 162 \\ \hline
5 & 4 & 19 & 16 & 361 & 76 \\ \hline
6 & 5 & 20 & 25 & 400 & 100 \\ \hline
7 & 7 & 11 & 49 & 121 & 77 \\ \hline
8 & 9 & 16 & 81 & 256 & 144 \\ \hline
9 & 8 & 18 & 64 & 324 & 144 \\ \hline
10 & 9 & 25 & 81 & 625 & 225 \\ \hline
11 & 11 & 24 & 121 & 576 & 264 \\ \hline
12 & 1 & 9 & 1 & 81 & 9 \\ \hline
13 & 3 & 11 & 9 & 121 & 33 \\ \hline
14 & 2 & 11 & 4 & 121 & 22 \\ \hline
\text{Totals} & 73 & 216 & 545 & 3,696 & 1,311 \\ \hline
\end{array}
$$
$$
\text{i. } b = \frac{SS_{yx}}{SS_{x}} = \frac{N \Sigma XY - (\Sigma X)(\Sigma Y)}{N \Sigma X^{2} - (\Sigma X)^{2}} = \frac{14 \times 1,311 - 73 \times 216}{14 \times 545 - 5,329} = \frac{2,586}{2,301} = 1.1239
$$
With every unit increase in visual consumption of TV crime shows (unit = 1 hour), the level of fear of crime increases by 1.1239 units.
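The slope computation above can be checked with a short script. This is a minimal sketch in plain Python; the function name `slope` and the list names `x` and `y` are illustrative choices, not part of the original answer.

```python
# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]


def slope(x, y):
    """Return b = (N*SumXY - SumX*SumY) / (N*SumX^2 - (SumX)^2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    numerator = n * sum_xy - sum_x * sum_y      # 2,586 for these data
    denominator = n * sum_x2 - sum_x ** 2       # 2,301 for these data
    return numerator / denominator


b = slope(x, y)
print(round(b, 4))  # 1.1239
```

The intermediate numerator and denominator match the hand calculation, so any disagreement with a table value can be traced to a specific sum.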
ii.
$$
\begin{array}{|l|r|r|}
\hline
 & x \text{ (hours of crime shows)} & y \text{ (fear of crime)} \\ \hline
 & 0 & 12 \\ \hline
 & 3 & 11 \\ \hline
 & 2 & 11 \\ \hline
 & 9 & 18 \\ \hline
 & 4 & 19 \\ \hline
 & 5 & 20 \\ \hline
 & 7 & 11 \\ \hline
 & 9 & 16 \\ \hline
 & 8 & 18 \\ \hline
 & 9 & 25 \\ \hline
 & 11 & 24 \\ \hline
 & 1 & 9 \\ \hline
 & 3 & 11 \\ \hline
 & 2 & 11 \\ \hline
\text{Totals} & 73 & 216 \\ \hline
\text{Means} & 5.214286 & 15.42857 \\ \hline
\end{array}
$$
$$
\text{Constant } (a) = \bar{y} - b\bar{x} = 15.4286 - 1.124 \times 5.2143 = 9.5677
$$
It can be predicted that an individual who does not watch any crime shows (x = 0) would still have a level of fear of crime of 9.5677 (y). In other words, the regression line intersects the y-axis at 9.5677.
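As a cross-check on the constant, the sketch below computes a at full floating-point precision. The variable names are illustrative. Because the hand calculation rounds b to 1.124 before multiplying, the unrounded result (about 9.568) differs from 9.5677 in the third decimal place.

```python
# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope from deviations about the means (equivalent to the raw-sum formula).
ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
b = ss_xy / ss_x

# Constant (y-intercept): a = mean of y minus b times mean of x.
a = mean_y - b * mean_x

# Predicted fear of crime for someone who watches no crime shows (x = 0).
predicted_at_zero = a + b * 0

print(round(a, 3))
```

The prediction at x = 0 equals a by construction, which is why the constant is read off where the line crosses the y-axis.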
$$
\text{iii. } r = \frac{N \Sigma XY - (\Sigma X)(\Sigma Y)}{\sqrt{\left[N \Sigma X^{2} - (\Sigma X)^{2}\right]\left[N \Sigma Y^{2} - (\Sigma Y)^{2}\right]}} = \frac{2,586}{3,421.6207} = 0.7558
$$
The correlation coefficient of 0.7558 indicates that there is a moderate to strong positive relationship between visual consumption of crime shows (hours/day) and the level of fear of crime (scale 0-30). That is, as the hours of crime shows watched by an individual increase, so does his/her level of fear of crime. Recall: this is a hypothetical example.
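The same raw-sum formula for Pearson's r can be sketched in Python using only the standard library. The function name `pearson_r` is a hypothetical helper introduced for illustration.

```python
import math

# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]


def pearson_r(x, y):
    """Pearson's r from raw sums, mirroring the hand formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(x, y)
print(round(r, 4))  # 0.7558
```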
iv. r² = 0.7558² = 0.5712.
57.12% of the variance in level of fear of crime (scale 0-30) is accounted for by the average number of hours of crime shows watched per week. In other words, when knowledge of the distribution of hours of crime shows watched is taken into account, we reduce our error in predicting values of the dependent variable (level of fear of crime) by 57.12%.
v.
$$
\begin{array}{|r|r|r|r|r|}
\hline
\text{Case} & x & y & y - \bar{y} & (y - \bar{y})^{2} \\ \hline
1 & 0 & 12 & -3.4286 & 11.7553 \\ \hline
2 & 3 & 11 & -4.4286 & 19.6125 \\ \hline
3 & 2 & 11 & -4.4286 & 19.6125 \\ \hline
4 & 9 & 18 & 2.5714 & 6.612098 \\ \hline
5 & 4 & 19 & 3.5714 & 12.7549 \\ \hline
6 & 5 & 20 & 4.5714 & 20.8977 \\ \hline
7 & 7 & 11 & -4.4286 & 19.6125 \\ \hline
8 & 9 & 16 & 0.5714 & 0.326498 \\ \hline
9 & 8 & 18 & 2.5714 & 6.612098 \\ \hline
10 & 9 & 25 & 9.5714 & 91.6117 \\ \hline
11 & 11 & 24 & 8.5714 & 73.4689 \\ \hline
12 & 1 & 9 & -6.4286 & 41.3269 \\ \hline
13 & 3 & 11 & -4.4286 & 19.6125 \\ \hline
14 & 2 & 11 & -4.4286 & 19.6125 \\ \hline
\text{Totals} & 73 & 216 & -0.0004 & 363.4286 \\ \hline
\end{array}
$$
Σ(y - ȳ)² = 363.4286.
This value represents the total amount of (squared) errors we could possibly make if we knew absolutely nothing about the distribution of hours of TV crime shows watched.
vi. First you must compute y′ for every case observed: y′ = a + bx
Σ(y - y′)² = 155.8357.
The amount of (squared) error we could make trying to predict y drops substantially, to 155.8357, once we know the distribution of hours of TV crime shows watched.
$$
\begin{array}{|r|r|r|r|r|r|}
\hline
\text{Case} & x & y & y^{\prime} & y - y^{\prime} & \left(y - y^{\prime}\right)^{2} \\ \hline
1 & 0 & 12 & 9.5677 & 2.4323 & 5.916083 \\ \hline
2 & 3 & 11 & 12.9394 & -1.9394 & 3.761272 \\ \hline
3 & 2 & 11 & 11.8155 & -0.8155 & 0.66504 \\ \hline
4 & 9 & 18 & 19.6828 & -1.6828 & 2.831816 \\ \hline
5 & 4 & 19 & 14.0633 & 4.9367 & 24.37101 \\ \hline
6 & 5 & 20 & 15.1872 & 4.8128 & 23.16304 \\ \hline
7 & 7 & 11 & 17.435 & -6.435 & 41.40923 \\ \hline
8 & 9 & 16 & 19.6828 & -3.6828 & 13.56302 \\ \hline
9 & 8 & 18 & 18.5589 & -0.5589 & 0.312369 \\ \hline
10 & 9 & 25 & 19.6828 & 5.3172 & 28.27262 \\ \hline
11 & 11 & 24 & 21.9306 & 2.0694 & 4.282416 \\ \hline
12 & 1 & 9 & 10.6916 & -1.6916 & 2.861511 \\ \hline
13 & 3 & 11 & 12.9394 & -1.9394 & 3.761272 \\ \hline
14 & 2 & 11 & 11.8155 & -0.8155 & 0.66504 \\ \hline
\text{Totals} & 73 & 216 & & & 155.8357 \\ \hline
\end{array}
$$

vii. SStotal = SSexplained + SSerror
SSexplained = 363.4286 - 155.8357 = 207.5929
With the knowledge of the distribution of hours of TV crime shows (x) watched, we can explain 207.5929 of the 363.4286 units of total (squared) variation within the dependent variable (level of fear of crime).
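The decomposition SStotal = SSexplained + SSerror in parts v through vii can be verified numerically. The following is a minimal sketch with illustrative variable names; it works at full precision, so the last decimal place may differ slightly from the rounded hand values.

```python
# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope (b) and constant (a).
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# v.   Total error when predicting every y with the mean of y.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# vi.  Unexplained error left after predicting with y' = a + b*x.
ss_error = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
# vii. Explained part is the difference.
ss_explained = ss_total - ss_error

# The explained share of the total equals r squared.
r_squared = ss_explained / ss_total

print(round(ss_total, 4), round(ss_error, 4), round(r_squared, 4))
```

Dividing the explained sum of squares by the total reproduces the r² of about 0.5712 from part iv, which ties the three quantities together.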
c. We can reject the null hypothesis if t falls beyond ±2.179 (two-tailed, alpha = 0.05, df = 14 - 2 = 12).
d. We use the t distribution to determine significance.
$$
t = r\sqrt{\frac{n - 2}{1 - r^{2}}} = 0.7558\sqrt{\frac{14 - 2}{1 - 0.5712}} = 0.7558 \times 5.2901 = 3.9983
$$
We can reject the null hypothesis because t falls beyond ±2.179 (t(12) = 3.9983; p < 0.05). This means there is a statistically significant relationship between hours of TV crime shows watched and levels of fear of crime.
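The decision in part d can be sketched as a small script. The critical value is hard-coded from a standard t table (two-tailed, alpha = 0.05, df = 12) as an assumption rather than computed, and the variable names are illustrative.

```python
import math

# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson's r from deviations about the means.
ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
r = ss_xy / math.sqrt(ss_x * ss_y)

# Test statistic: t = r * sqrt((n - 2) / (1 - r^2)), with df = n - 2.
df = n - 2
t = r * math.sqrt(df / (1 - r ** 2))

# Assumed table value for a two-tailed test, alpha = 0.05, df = 12.
T_CRITICAL = 2.179
reject_null = abs(t) > T_CRITICAL

print(df, round(t, 2), reject_null)
```

At full precision t comes out near 3.998, marginally below the hand value of 3.9983 because the hand calculation uses rounded r and r²; the decision is unaffected.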
e.
$$
S_{y.x} = \sqrt{\frac{\Sigma\left(y - y^{\prime}\right)^{2}}{n - 2}} = \sqrt{\frac{155.8357}{12}} = 3.6037
$$
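The standard error of estimate can be reproduced directly from the residuals. This is a sketch with illustrative names; it expresses the typical size of a prediction error in the units of the fear-of-crime scale.

```python
import math

# Hours of crime shows watched (x) and fear of crime (y), cases 1-14.
x = [0, 3, 2, 9, 4, 5, 7, 9, 8, 9, 11, 1, 3, 2]
y = [12, 11, 11, 18, 19, 20, 11, 16, 18, 25, 24, 9, 11, 11]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope (b) and constant (a).
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x

# Residuals y - y' for each case, where y' = a + b*x.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standard error of estimate: sqrt(sum of squared residuals / (n - 2)).
s_yx = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(round(s_yx, 3))
```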

Academic literature found evidence that there is a relationship between temperature and the frequency of the occurrence of violent crimes. Utilizing data derived from the local police department (number of violent crimes/month/100 inhabitants) and meteorological data (average temperature/month), you want to determine the relationship between temperature (IV) and the frequency of violent crime (DV). Find the distribution of both variables in the table below.
a. Use a spreadsheet program to create a scatterplot or draw it by hand. Describe what you see.
b. Compute beta (b) and interpret your result.
c. Compute the constant (a) and interpret your result.
d. Assume you want to predict the frequency of violent crimes in a month with an average temperature of 90.5 degrees Fahrenheit. Compute y.

(Essay)

To predict a certain outcome having detailed knowledge about the independent variable, which is impacting the dependent variable, the most appropriate statistical test to run is

(Multiple Choice)

It is debated whether IQ scores influence criminal behavior (either directly or indirectly). It is argued that the majority of criminals have an IQ that lies 8 points below (92) the average IQ (100). You draw a random sample of 15 individuals of age 18+, assess the IQ of participants, and ask them about their criminal history (number of offenses committed, not including minor traffic violations). You select an alpha of 0.05. The results of your assessment are to be found in the table presented below.
a. State your null and alternative hypotheses.
b. State your decision rule to determine statistical significance.
c. Utilizing the data from the table presented below:
i. Compute beta (b).
ii. Compute the constant (a).
iii. Determine the strength and direction of the relationship by computing Pearson's r.
iv. Compute r².
v. Compute the prediction error you could possibly have if you know nothing about the distribution of the independent variable (IQ score).
vi. Compute the unexplained variance.
vii. Compute the explained variance.
d. Make a decision.
e. Compute the standard error of estimate.
f. INTERPRET ALL YOUR RESULTS.
$$
\begin{array}{|r|r|r|}
\hline
\text{Case} & \text{IQ } (x) & \text{No. of offenses } (y) \\ \hline
1 & 90 & 4 \\ \hline
2 & 110 & 1 \\ \hline
3 & 89 & 4 \\ \hline
4 & 95 & 3 \\ \hline
5 & 92 & 5 \\ \hline
6 & 91 & 4 \\ \hline
7 & 102 & 0 \\ \hline
8 & 99 & 2 \\ \hline
9 & 76 & 3 \\ \hline
10 & 99 & 1 \\ \hline
11 & 92 & 6 \\ \hline
12 & 98 & 1 \\ \hline
13 & 111 & 1 \\ \hline
14 & 100 & 2 \\ \hline
15 & 93 & 6 \\ \hline
\end{array}
$$

(Essay)

Using the example from problem 2: a. State your null and your alternative hypotheses. b. Determine the strength and direction of the relationship by computing Pearson's r. Interpret your results. c. Compute r² and interpret your result.

(Essay)

To warm up, let us start with a problem that has a "perfect" regression line. Assume that the state prison wants to encourage prisoners to get involved in education. Thus, the prison administration offers that for every hour spent on education, inmates receive 5 additional minutes in the prison yard.
a. Compute beta and interpret your finding.
b. Compute the constant (y intercept or a) and interpret your finding.
$$
\begin{array}{|l|r|r|}
\hline
 & x & y \\ \hline
 & 0 & 0 \\ \hline
 & 1 & 5 \\ \hline
 & 2 & 10 \\ \hline
 & 3 & 15 \\ \hline
 & 4 & 20 \\ \hline
 & 5 & 25 \\ \hline
 & 6 & 30 \\ \hline
 & 7 & 35 \\ \hline
 & 8 & 40 \\ \hline
 & 9 & 45 \\ \hline
 & 10 & 50 \\ \hline
\text{Totals} & & \\ \hline
\text{Means} & & \\ \hline
\end{array}
$$
c. Assume that John (inmate) has studied 17 hours in the previous week; how many minutes of additional yard time did he earn? Use the classic algebraic equation (y = a + bx) to calculate the number of minutes earned and interpret your result.
d. Next, you want to predict the total time an inmate is allowed to spend in the yard (weekly allowance + additional time earned). The weekly allowance regarding yard time is (without additional time earned) 630 minutes (10.5 hours).
$$
\begin{array}{|l|r|r|}
\hline
 & x & y \\ \hline
 & 0 & 630 \\ \hline
 & 1 & 635 \\ \hline
 & 2 & 640 \\ \hline
 & 3 & 645 \\ \hline
 & 4 & 650 \\ \hline
 & 5 & 655 \\ \hline
 & 6 & 660 \\ \hline
 & 7 & 665 \\ \hline
 & 8 & 670 \\ \hline
 & 9 & 675 \\ \hline
 & 10 & 680 \\ \hline
\text{Totals} & & \\ \hline
\text{Means} & & \\ \hline
\end{array}
$$
i. Compute beta.
ii. Compute the constant (a or y-intercept).
iii. How many minutes (total) is John allowed to spend in the prison yard if he has studied for 21 hours? Use the classic algebraic equation (y = a + bx).
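One way to check work on this warm-up problem is to fit both lines numerically and plug in John's hours. The sketch below assumes x is hours of education and y is minutes of yard time, as the problem describes; `fit_line` is a hypothetical helper name, not something given in the exam.

```python
def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


hours = list(range(11))                      # x: 0 through 10 hours of education
extra_minutes = [5 * h for h in hours]       # y in parts a-c: 0, 5, ..., 50
total_minutes = [630 + 5 * h for h in hours] # y in part d: 630, 635, ..., 680

a_extra, b_extra = fit_line(hours, extra_minutes)
a_total, b_total = fit_line(hours, total_minutes)

# Apply y = a + b*x to the two prediction questions.
john_extra = a_extra + b_extra * 17   # additional minutes after 17 hours
john_total = a_total + b_total * 21   # total minutes after 21 hours

print(b_extra, a_extra, john_extra)   # 5.0 0.0 85.0
print(b_total, a_total, john_total)   # 5.0 630.0 735.0
```

Because every point lies exactly on the line, the fitted slope is the same in both versions and only the constant shifts by the weekly allowance.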

(Essay)

Many formulas utilized in statistics are complex. However, one basic formula used in regression is a classic algebraic equation: y = a + bx. Explain the variables within this formula.

(Essay)