Exam 7: Relationships in Data

A statistician found data in an advertisement that listed the price of a diamond and the number of carats in the diamond. He entered the data (for 48 diamonds) into a computer package and fit a regression model for predicting price based on carats. He obtained the following computer output:
-What percentage of the variability in prices is explained by the regression line with carats?

(Essay)
Correct Answer:

The percentage of variability in prices explained by the regression line with carats is 97.8% (r^2).

Scientists have studied whether one can predict temperature based on the frequency of a cricket's chirps. Consider the following two scatterplots based on data gathered in one study of 30 crickets, with temperature measured in degrees Fahrenheit and chirp frequency measured in chirps per minute:
a. If the goal is to predict temperature based on a cricket's chirps per minute, which is the appropriate scatterplot to examine, A or B? Explain briefly.
Consider the following computer output:
b. Determine the value of the correlation coefficient between temperature and chirp frequency.
c. What temperature would the regression model predict if the cricket were chirping at 110 chirps per minute? Is this prediction an example of extrapolation? Explain briefly.
d. Identify the value of the test statistic for testing whether the population slope coefficient is zero.
e. Produce and interpret a 90% confidence interval for the population slope coefficient.

(Short Answer)
Correct Answer:

a. The scatterplot on the right (B) is the appropriate one, because the response variable (temperature) is on the vertical axis.
b. The correlation coefficient is the square root of r^2, taken as positive because the association is positive: √.957 ≈ .978.
c. The predicted temperature for a cricket chirping at 110 chirps per minute is 35.8 + 0.251 × 110 = 63.41 °F. This is not an example of extrapolation because 110 chirps per minute is near the center of the data.
d. The test statistic is 24.89. This is extremely large, so the p-value is essentially 0. The sample data provide overwhelming evidence there is a relationship between chirp frequency and temperature.
e. The critical value for 90% confidence and 30 - 2 = 28 degrees of freedom is t* = 1.701. The 90% confidence interval for the population slope coefficient is therefore 0.25116 ± 1.701 × 0.01009 = 0.25116 ± 0.01716, which is the interval (0.23400, 0.26832). You can be 90% confident the predicted temperature increases between 0.234 and 0.268 degrees for each additional chirp per minute of a cricket.
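
The arithmetic in parts b through e can be rechecked with a short sketch. It uses only the figures quoted in the answer above (the computer output itself is not reproduced here), and the variable names are illustrative rather than taken from the original output.

```python
import math

# Figures quoted in the answer above; the regression output itself is not shown.
r_squared = 0.957                      # r^2 used in part b
intercept, slope = 35.8, 0.251         # rounded coefficients used in part c
slope_full, se_slope = 0.25116, 0.01009  # slope and standard error used in part e
t_star = 1.701                         # 90% critical value for 28 degrees of freedom

# b. The association is positive, so take the positive square root of r^2.
r = math.sqrt(r_squared)

# c. Predicted temperature at 110 chirps per minute.
predicted_temp = intercept + slope * 110

# d. Test statistic for a zero population slope: estimate / standard error.
t_stat = slope_full / se_slope

# e. 90% confidence interval: estimate +/- t* x standard error.
margin = t_star * se_slope
ci = (slope_full - margin, slope_full + margin)

print(f"r = {r:.3f}")
print(f"predicted temperature = {predicted_temp:.2f} F")
print(f"t statistic = {t_stat:.2f}")
print(f"90% CI for slope = ({ci[0]:.5f}, {ci[1]:.5f})")
```

The test statistic here is recomputed from the slope and standard error quoted in part e; it should agree with the reported value up to rounding.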

For the homeowner's utility data studied in the previous question, the following regression output pertains to natural gas usage and mean temperature:
a. Report and interpret the value of the slope coefficient in context.
b. Use the least squares line to predict the gas usage per day in a month with a mean temperature of 50 degrees.
c. Determine the value of the correlation coefficient.
d. What proportion of the variability in gas usage is explained by the least squares line with mean monthly temperature?
e. Does the r^2 value reveal what percentage of the data values fall on the least squares line? (Answer yes or no, but you need not bother to explain.)

(Short Answer)
Correct Answer:

a. The slope coefficient is -0.2170. This means that for each one-degree increase in mean monthly temperature, the predicted natural gas usage decreases by 0.217 therms per day.
b. The predicted gas usage in a month with a mean temperature of 50 degrees is 15.37 - 0.2170 × 50 = 4.52 therms per day.
c. The correlation coefficient is the negative square root of the r^2 value, which is -√.864 ≈ -.930.
d. r^2 is 86.4%.
e. No. It's common for none, or a very small percentage, of data values to fall exactly on the line. This is not what the r^2 value measures.
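
Parts b and c can be reproduced with a minimal sketch that takes the coefficients and r^2 from the answer above as given (the regression output is not shown on this page). The names are illustrative only.

```python
import math

# Coefficients and r^2 as quoted in the answer above.
intercept = 15.37      # therms per day at a mean temperature of 0 degrees
slope = -0.2170        # change in therms per day per one-degree increase
r_squared = 0.864

# b. Predicted gas usage at a mean monthly temperature of 50 degrees.
predicted_usage = intercept + slope * 50

# c. The correlation coefficient takes the sign of the slope,
#    so the square root of r^2 is given a negative sign here.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"predicted usage = {predicted_usage:.2f} therms per day")
print(f"r = {r:.3f}")
```

Tying the sign of r to the sign of the slope, rather than hard-coding a minus sign, keeps the sketch valid if the same steps are applied to a positive association.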

The following data are the age (in years) and baby's birth weight (in grams) for a sample of 10 young women who gave birth as teenagers:
Consider the following summary statistics:
Mean age: 17.0 years  SD ages: 1.49 years
Mean birth weight: 3004 grams  SD birth weights: 414 grams
Correlation coefficient between age and birth weight: .884
-Would you use this least squares line to predict the weight of a baby born to a 37-year-old mother? Explain.

(Essay)

A sample of students at a university took a test that diagnosed their learning styles as active or reflective and also as visual or verbal. Each student received a numerical score on the active/reflective style and also a numerical score on the visual/verbal style. The sample size was 39, and the sample correlation coefficient turned out to equal .273. -If the sample size was much larger, and the value of the sample correlation coefficient stayed the same, describe the impact on your test statistic, p-value, and conclusion.

(Short Answer)

The following data are the age (in years) and baby's birth weight (in grams) for a sample of 10 young women who gave birth as teenagers:
Consider the following summary statistics:
Mean age: 17.0 years  SD ages: 1.49 years
Mean birth weight: 3004 grams  SD birth weights: 414 grams
Correlation coefficient between age and birth weight: .884
-Use the least squares line to predict the weight of a baby born to a 17-year-old mother.

(Short Answer)

The following scatterplots display the price vs. age for a sample of Hanoverian-bred dressage horses listed for sale on the Internet. The graph on the left displays these data for female horses, the graph on the right for male horses:
1. Is the oldest horse in this sample male or female?
2. Is the most expensive horse in this sample male or female?
3. Which horse would you predict to cost more: a 10-year-old male horse or a 10-year-old female horse?
4. Which gender has a positive association between price and age?
5. Which gender has the stronger association between price and age?

(Short Answer)

The following data are the age (in years) and baby's birth weight (in grams) for a sample of 10 young women who gave birth as teenagers:
Consider the following summary statistics:
Mean age: 17.0 years  SD ages: 1.49 years
Mean birth weight: 3004 grams  SD birth weights: 414 grams
Correlation coefficient between age and birth weight: .884
-Use these summary statistics to determine the equation of the least squares line for predicting the baby's weight from the mother's age.

(Short Answer)
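
One way to approach this question is through the standard identities slope = r × (SD of y) / (SD of x) and intercept = (mean of y) - slope × (mean of x). The sketch below applies them to the summary statistics quoted in the question; it is an illustration of the method, not the site's verified answer, and the function name is hypothetical.

```python
# Summary statistics quoted in the question.
mean_age, sd_age = 17.0, 1.49            # years
mean_weight, sd_weight = 3004.0, 414.0   # grams
r = 0.884

# Least squares coefficients from summary statistics.
slope = r * sd_weight / sd_age
intercept = mean_weight - slope * mean_age


def predict_weight(age):
    """Predicted birth weight (grams) for a mother of the given age (years)."""
    return intercept + slope * age


print(f"predicted weight = {intercept:.1f} + {slope:.1f} * age")
print(f"at age 17: {predict_weight(17.0):.0f} grams")
```

Because the least squares line always passes through the point of means, the prediction at the mean age of 17 equals the mean birth weight. The line was fit to teenage mothers only, so evaluating it at ages far outside that range would be extrapolation.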

It can be shown that the sum of the residuals from a least squares line always equals zero. a. Does it follow that the mean of the residuals from a least squares line must always equal zero? Explain briefly. b. Does it follow that the median of the residuals from a least squares line must always equal zero? Explain briefly.

(Short Answer)

The following scatterplot displays the rushing yardage per game vs. number of rushes for LaDainian Tomlinson in the 2006 National Football League season. Also displayed is the least squares line.
The equation of the least squares line is predicted rushing yards = -3.69 + 5.37 × number of rushes. The value of r^2 is .332.
-How many rushing yards does the least squares line predict for a game in which Tomlinson has 21 rushes?

(Essay)

The following scatterplot displays the rushing yardage per game vs. number of rushes for LaDainian Tomlinson in the 2006 National Football League season. Also displayed is the least squares line.
The equation of the least squares line is predicted rushing yards = -3.69 + 5.37 × number of rushes. The value of r^2 is .332.
-Which game has the greatest residual (in absolute value)? Identify it by how many rushes Tomlinson had in that game and also by circling it on the graph.

(Essay)

A statistician found data in an advertisement that listed the price of a diamond and the number of carats in the diamond. He entered the data (for 48 diamonds) into a computer package and fit a regression model for predicting price based on carats. He obtained the following computer output:
-Comment on what is revealed by whether or not this confidence interval contains the value zero.

(Short Answer)

The following scatterplot displays the rushing yardage per game vs. number of rushes for LaDainian Tomlinson in the 2006 National Football League season. Also displayed is the least squares line.
The equation of the least squares line is predicted rushing yards = -3.69 + 5.37 × number of rushes. The value of r^2 is .332.
-Identify and interpret the slope coefficient in this context.

(Short Answer)

a. Suppose everyone in your class scores exactly half as many points on the final exam as on the midterm exam. Report the value of the correlation coefficient between midterm exam score and final exam score. (No explanation is necessary.)
b. Suppose everyone in your class scores exactly ten points lower on the final exam than on the midterm exam. Report the value of the correlation coefficient between midterm exam score and final exam score. (No explanation is necessary.)

(Short Answer)

Think of an example, not discussed in class or in your text, of two quantitative variables for which you would expect a fairly strong association but for which the relationship is not cause and effect. a. State the two variables clearly as variables. b. Describe whether you expect the association to be positive or negative. Briefly explain why you expect this. c. Briefly explain why you think the relationship between the variables is not cause and effect.

(Essay)

A statistician found data in an advertisement that listed the price of a diamond and the number of carats in the diamond. He entered the data (for 48 diamonds) into a computer package and fit a regression model for predicting price based on carats. He obtained the following computer output:
-Determine a 99% confidence interval for the value of the slope coefficient in the population.

(Short Answer)

A sample of students at a university took a test that diagnosed their learning styles as active or reflective and also as visual or verbal. Each student received a numerical score on the active/reflective style and also a numerical score on the visual/verbal style. The sample size was 39, and the sample correlation coefficient turned out to equal .273. -State the hypotheses for testing whether there is a positive correlation between these variables in the population of all students at this university.

(Short Answer)

In a large statistics class, students have taken two exams. For each of the following situations, report what you would expect for the value of the correlation coefficient between exam 1 score and exam 2 score. Give a brief justification for your answer.
1. Every student scores ten points lower on exam 2 than on exam 1.
2. Every student scores twice as many points on exam 2 as on exam 1.
3. Every student scores half as many points on exam 2 as on exam 1.
4. Every student guesses randomly on every question on both exams.
5. Every student scores 100 points for his/her combined score on the two exams.

(Short Answer)

A sample of students at a university took a test that diagnosed their learning styles as active or reflective and also as visual or verbal. Each student received a numerical score on the active/reflective style and also a numerical score on the visual/verbal style. The sample size was 39, and the sample correlation coefficient turned out to equal .273. -Determine the p-value as accurately as possible.

(Short Answer)
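
A possible way to work this out, sketched under stated assumptions: the usual test statistic for a correlation is t = r√(n - 2) / √(1 - r^2) with n - 2 degrees of freedom, and the alternative is taken to be one-sided (positive correlation). The upper-tail probability is approximated here by Simpson's rule on the t density, as a standard-library stand-in for a statistics package. The function names are hypothetical, and this is not the site's verified answer.

```python
import math

n, r = 39, 0.273   # sample size and sample correlation from the question
df = n - 2

# Test statistic for H0: population correlation = 0.
t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


p_value = upper_tail(t_stat, df)       # one-sided, by assumption
print(f"t = {t_stat:.3f}, df = {df}, one-sided p-value = {p_value:.4f}")
```

With a two-sided alternative the tail probability would be doubled. For a fixed r, the statistic grows with √(n - 2), so a larger sample with the same correlation gives a larger test statistic and a smaller p-value.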

The following scatterplots display the age (in months) at which a child first speaks and the child's score on a cognitive aptitude test taken later in childhood. Graph A (on the left) displays all of the data in a sample, and Graph B (on the right) excludes the two children who took longer than 24 months to speak.
In one of these graphs the correlation coefficient is -.034, and in the other graph the correlation coefficient is -.640.
a. Identify which correlation coefficient goes with Graph A (on the left). Briefly explain your choice.
b. Write a paragraph, as if to the parent of an infant, summarizing what these scatterplots reveal about whether there is a relationship between the age at which a child first speaks and his/her cognitive aptitude.

(Short Answer)