Deck 4: Describing Bivariate Numerical Data
1

False
2

False
3
The Des Moines Register reported the ratings of high school sportsmanship as compiled
by the Iowa High School Athletic Association. The participants and coaches from each
school were rated by referees. (1 = superior, 5 = unsatisfactory.) A regression analysis
of data on the average scores given to football players and coaches is shown below.
a) Interpret the value of the correlation between the ratings of coaches and
participants.
b) Interpret the value of the coefficient of determination.
c) Interpret the value of the standard deviation about the least squares line.

4
The use of small aircraft with human observers is common in wildlife studies where
the goal is to estimate the abundance of different species. A more economical
strategy might be to use drones. The idea is that a drone would fly over the area of
interest and take photographs for computer analysis. The plot below is from a test run
of drones over 10 areas in South Central Florida. Bird decoys were used to test the
accuracy of the analysis of the photographs. A scatterplot of the actual number of
decoys and the number acquired from applying photographic analysis is shown
below.
5
One of the properties of correlation coefficient, r, is: "The value of r does not depend
on the unit of measurement for either variable." In your own words, what does this
mean?
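One way to explore this property numerically is sketched below. The heights and weights are hypothetical values made up for illustration (they do not come from this deck), and pearson_r is a helper name introduced here. The same measurements are expressed in two unit systems and r is computed for each.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption): heights in inches, weights in pounds.
height_in = [60.0, 62.0, 65.0, 68.0, 70.0, 73.0]
weight_lb = [110.0, 125.0, 130.0, 155.0, 160.0, 190.0]

# The same measurements converted to centimeters and kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_original = pearson_r(height_in, weight_lb)
r_converted = pearson_r(height_cm, weight_kg)
print(round(r_original, 6), round(r_converted, 6))
```

The two printed values agree because multiplying a variable by a positive constant rescales its deviations and its standard deviation by the same factor, which cancels in the ratio.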
6
If on average y increases as x increases, the correlation coefficient is
positive.
7
The coefficient of determination is equal to the positive square root of the
correlation coefficient, r.
8
Early humans were similar in shape to most modern large primates. The data below
are average male hind limb and forelimb lengths for different species of early
hominids (humans and their ancestors).
a) What is the value of the correlation coefficient for these data?
c) Suppose these species are representative of all species of early human ancestors.
If a new hominin species dating from about the same time were to be discovered
with an average hind limb length of 500 mm, what would you predict to be the
average forelimb length of this species?
9
What is it that the correlation coefficient measures?
10
The slope of the least squares line for predicting y from x and the slope of
the least squares line for predicting x from y are equal.
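A statement like this can be checked on a small data set. The sketch below uses hypothetical (x, y) values made up for illustration, and slope is a helper name introduced here. It computes the least squares slope for predicting y from x and the slope for predicting x from y, then multiplies them.

```python
def slope(xs, ys):
    """Least squares slope for predicting ys from xs: Sxy / Sxx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data (assumption), not taken from this deck.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_y_on_x = slope(x, y)  # slope when y is the response
b_x_on_y = slope(y, x)  # slope when x is the response

# The product of the two slopes equals r squared, because
# (Sxy / Sxx) * (Sxy / Syy) = Sxy^2 / (Sxx * Syy).
r_squared = b_y_on_x * b_x_on_y
print(b_y_on_x, b_x_on_y, r_squared)
```

For these values the two slopes differ, which is the typical situation: the two regressions minimize squared deviations in different directions.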
11
What is it that the correlation coefficient measures?
12
A transformation, or re-expression, of a variable is accomplished by
substituting a function of the variable in place of the variable in further
analyses.
13
The value of the correlation coefficient, r, is always between 0 and 1.
14
The breeding success of ground-nesting birds in high altitudes can be affected by the
depth of winter snow. The plot below relates the Spring percentage of White-tailed
Ptarmigan hens hatching at least one egg, to the amount of snowfall in the Sierra
Nevada mountain range the previous Winter.
a) The least squares line is [equation not shown]. Graph this line on the scatterplot above.
b) The least squares line is the line that minimizes the sum of the squared residuals.
Using your line in part (a), graphically represent the residual associated with the
snow depth of 50 cm on the scatterplot.
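The idea in part (b) can be sketched numerically. The snow depths and hatching percentages below are hypothetical values made up for illustration (the deck's plot and equation are not reproduced here), and least_squares and ss_resid are helper names introduced for this sketch. The code fits the least squares line, confirms that nudging the intercept or slope never lowers the sum of squared residuals, and computes the residual at a snow depth of 50 cm.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def ss_resid(a, b, xs, ys):
    """Sum of squared residuals (observed minus predicted) for the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data (assumption): snow depth in cm, percent of hens hatching an egg.
snow_cm = [10.0, 30.0, 50.0, 70.0, 90.0]
pct_hatch = [80.0, 72.0, 55.0, 50.0, 30.0]

a, b = least_squares(snow_cm, pct_hatch)
best = ss_resid(a, b, snow_cm, pct_hatch)

# Sum of squared residuals for lines with a slightly different intercept or slope.
nudged = [
    ss_resid(a + da, b + db, snow_cm, pct_hatch)
    for da in (-1.0, 0.0, 1.0)
    for db in (-0.05, 0.0, 0.05)
    if (da, db) != (0.0, 0.0)
]

# Residual at 50 cm: observed value minus the value predicted by the line.
residual_at_50 = 55.0 - (a + b * 50.0)
print(a, b, best, min(nudged), residual_at_50)
```

On a scatterplot, this residual is the vertical segment from the observed point at 50 cm to the fitted line.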
15
The standard deviation about the least squares line is roughly the typical
amount by which an observation deviates from the least squares line.
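As a rough illustration of this quantity, the sketch below computes the standard deviation about the least squares line as the square root of SSResid / (n - 2). The data are hypothetical values made up for this example, and the variable names are introduced here.

```python
import math

# Hypothetical data (assumption), not taken from this deck.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and intercept.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x

# Residuals are observed y minus predicted y.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
ss_resid = sum(e ** 2 for e in residuals)

# Standard deviation about the line, using n - 2 in the denominator.
s_e = math.sqrt(ss_resid / (n - 2))
print(a, b, ss_resid, s_e)
```

Comparing s_e with the individual residuals gives a sense of why it is read as a typical deviation from the line.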
16
The correlation coefficient, r, does not depend on the units of
measurement of the two variables.
17
The data below were gathered on a random sample of 7 male banded black-footed
albatrosses of known age. In an effort to monitor diseases of these animals, biologists
would like to be able to estimate the lifespan of healthy albatrosses in the larger
population. In males of this species gonad size (the size of the sex gland) is
associated with age.

18
The least squares line passes through the point 

19
The slope of the least squares line is the amount by which y increases, on
average, as x increases by one unit.
20
One of the properties of Pearson's r is: "The value of r does not depend on which of
the two variables is labeled as x." In your own words, what does this mean?
21
Assessing the "goodness" of a regression line involves considering several aspects of
the fit. Consider the characteristics below. How does each contribute to an
assessment of fit? That is, for each characteristic, what about the given characteristic
would indicate that the regression line is "good"?
a) The shape of the scatter plot
b) The correlation coefficient
c) The standard deviation of the residuals
d) The coefficient of determination
22
The study of prehistoric birds depends on imprints of a prehistoric creature's remains in
stone, commonly known as fossils. To study ancient ecosystems effectively it would be
useful to know the actual mass of individual birds, but this information is not preserved in
the fossil record. It seems reasonable that the biomechanics of birds is much the same
today as in the past. For example, today's relationship between the wing length and total
weight of a bird should be very similar to that for birds from the distant past. The wing
lengths of ancient birds are readily obtainable from the fossil record, but the weight is
not. A regression model expressing the relationship between wing length and total
weight of modern birds could be used to estimate the mass of similar prehistoric birds.
Data for some species of modern birds of prey are given below.
Biological theory suggests that the relationship between the weight of these animals and
their wing length could be modeled using an exponential model. Perform the appropriate
transformation of variable(s) and fit an exponential model to the data.
a) What is the resulting best fit line using the transformed data?
b) What is the predicted log of bird weight for a species with wing length L = 56.0?
Show your work below.
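The mechanics of fitting an exponential model by transformation can be sketched as follows. The deck's data table is not reproduced here, so the wing lengths and weights below are synthetic values generated from a known exponential curve, purely as an assumption for illustration. The use of common (base 10) logs is also an assumption, and fit_line is a helper name introduced here. The response is replaced by its logarithm, a least squares line is fitted to (L, log weight), and a prediction is made on the log scale.

```python
import math

def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Synthetic data (assumption): weight = 10 ** (0.5 + 0.03 * L), not the deck's table.
wing_length = [20.0, 30.0, 40.0, 50.0, 60.0]
weight = [10 ** (0.5 + 0.03 * L) for L in wing_length]

# Transform the response only: an exponential model is linear in (L, log(weight)).
log_weight = [math.log10(w) for w in weight]
a, b = fit_line(wing_length, log_weight)

# Prediction on the log scale, then back-transformed to the original scale.
predicted_log = a + b * 56.0
predicted_weight = 10 ** predicted_log
print(a, b, predicted_log, predicted_weight)
```

With real data the points would scatter about the fitted line, and the residual plot for the transformed fit would still need to be examined.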
23
Hemorrhagic disease in white-tailed deer is caused by a virus known as EHD.
Immunity is given to fawns by transfer of EHD antibodies from the mother. In a
study to determine how long the maternal antibodies last, blood samples were taken
from a large sample of fawns of varying ages. The mean levels of EHD antibody
concentration and the associated ages of fawns are given in the table below.
After using the data to fit a straight line model, Ê = a + bW, significant curvature was
detected in the residual plot. Two nonlinear models were chosen for further analysis,
the exponential and the power models. (For these data, common logs were used to
perform the transformations.) The computer output for these models is given below,
and the residual plots are on the next page.
a) For the exponential model, calculate the predicted logarithm of the EHD antibody
concentration for an age of 5 weeks.
b) Generally speaking, which of the two models, power or exponential, is a better
choice for predicting the logarithm of the EHD antibody concentration? Provide
statistical justification for your choice based on both the residual plot and the
numerical summary statistics above.
c) The researchers want to use their model to predict EHD antibody concentrations for
fawns up to 24 weeks of age. Do you think this would be reasonable? Explain
why or why not.
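The difference between the two candidate models lies in which variables are transformed: the exponential model regresses log(E) on W, while the power model regresses log(E) on log(W). The sketch below illustrates this with synthetic data generated from an exact exponential decay. These values are an assumption for illustration and are not the deck's table or computer output, and r_squared is a helper name introduced here. Common logs are used, as the question states.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination for the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

# Synthetic data (assumption): E = 10 ** (2.0 - 0.1 * W) for ages of 1 to 8 weeks.
age_weeks = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
antibody = [10 ** (2.0 - 0.1 * w) for w in age_weeks]

log_antibody = [math.log10(e) for e in antibody]
log_age = [math.log10(w) for w in age_weeks]

# Exponential model: log(E) against W. Power model: log(E) against log(W).
r2_exponential = r_squared(age_weeks, log_antibody)
r2_power = r_squared(log_age, log_antibody)
print(r2_exponential, r2_power)
```

Because the synthetic data were built from an exponential curve, the exponential fit is essentially perfect here. With real data the comparison would rest on the residual plots together with r squared and the standard deviation about the line for each model.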
24
Some delicate fabrics are natural silks, made of protein and biodegradable. It would be
beneficial to be able to assess the delicacy of a fabric before making decisions about
displaying it in a museum. Chemical analysis might give some evidence about the brittle
nature of a fabric. Biochemical data were acquired from the linings of sixteen 19th and
early 20th century Japanese kimonos. Investigators measured the concentration of certain
amino acids ("Amino acid ratio") as well as the breaking stress ("tenacity") of the 16
kimono fabrics.
a) What is the equation of the least-squares line for predicting tenacity using amino
acid ratio?
b) Graph the least squares best fit line on the scatter plot that appears on the next page.
c) Approximately what proportion of the variability in tenacity is explained by the linear
relationship between tenacity and the amino acid ratio?
25
Does the transformed model appear to be no improvement over the linear model, a
slight improvement, or a significant improvement? Justify your response with an
appropriate statistical argument.
26
Assessing the "goodness" of a regression line involves considering several aspects of
the fit. Consider the characteristics below. How does each contribute to an
assessment of fit? That is, for each characteristic, what about the given characteristic
would indicate that the regression line is "good"?
a) The shape of the residual plot
b) The correlation coefficient
c) The existence of outliers
d) The coefficient of determination
27
The Des Moines Register reported the ratings of high school sportsmanship as compiled
by the Iowa High School Athletic Association. The participants and coaches from each
school were rated by referees. (1 = superior, 5 = unsatisfactory.) A regression analysis
of data on the average scores given to wrestling spectators and coaches is shown below.
a) Interpret the value of the correlation between the ratings of spectators and
wrestlers.
b) Interpret the value of the coefficient of determination.
c) Interpret the value of the standard deviation about the least squares line.
28
As early as 3 years of age, children begin to show preferences for playing with
members of their own sex, and report having more same-sex than opposite-sex
friends. Researchers believe that this may be the result of perceived differences in
personality. In a study of 3rd and 4th graders' views on a number of personality traits,
children were asked to rate on a "5-point" scale:
A scatterplot of the data is presented below. A single point represents the (average
girls' rating, average boys' rating) for a given trait.
a) Circle the single point that represents the most influential observation. What
aspect of this point makes it the most influential?
b) Suppose a personality trait similar to those used in the survey was given an
average of 0.0 ("can't tell") by the girls. The predicted boys' average rating
would be closest to which of the 5 categories described above?
c) The traits plotted above are those the researchers believe are "positive" traits, such
as "mature," "honest," and "polite." The researchers thought that on average girls
would rate these positive traits as characteristic of girls to a greater extent than
boys would. What aspects of the plot and/or regression analysis presented above
are consistent with this thinking?
29
What is the best fit line using the
transformed data?
30
The study of prehistoric birds depends on imprints of a prehistoric creature's remains in
stone, commonly known as fossils. To study ancient ecosystems effectively it would be
useful to know the actual mass of individual birds, but this information is not preserved in
the fossil record. It seems reasonable that the biomechanics of birds is much the same
today as in the past. For example, today's relationship between the wing length and total
weight of a bird should be very similar to that for birds from the distant past. The wing
lengths of ancient birds are readily obtainable from the fossil record, but the weight is
not. A regression model expressing the relationship between wing length and total
weight of modern birds could be used to estimate the mass of similar prehistoric birds.
Data for some species of modern birds of prey are given below.
Investigators would like to model the relationship between Wing Length and Weight.
The least squares line for predicting total weight using wing length as a predictor is of
interest.
a) What is the equation of the least-squares line?
b) Graph the least-squares line on the scatter plot below.
c) Approximately what proportion of the
variability in weight is explained by the
wing length?
31
As early as 3 years of age, children begin to show preferences for playing with
members of their own sex, and report having more same-sex than opposite-sex
friends. Researchers believe that this may be the result of perceived differences in
personality. In a study of 3rd and 4th graders' views on a number of personality traits,
children were asked to rate on a "5-point" scale:
A scatterplot of the data is presented below. A single point represents the (average
girls' rating, average boys' rating) for a given trait.
a) Circle the single point which represents the most influential observation. What
aspect of this point makes it the most influential?
b) Suppose a personality trait similar to those used in the survey was given an
average of 0.0 ("can't tell") by the girls. The predicted boys' average rating would
be closest to which of the 5 categories described above?
c) The traits plotted above are those the researchers believe are "positive" traits, such
as "mature," "honest," and "polite." The researchers thought that on average girls
would rate these positive traits as characteristic of girls to a greater extent than
boys would. What aspects of the plot and/or regression analysis presented above
are consistent with this thinking?
32
Remote camera trapping is used to detect and monitor elusive wildlife without their
physical capture and handling. In a recent study of bobcat (Lynx rufus) abundance, camera
traps were placed at varying distances from a road. The data on trapping success from 8
trapping stations are presented in the table at right. The trapping success is defined as
the number of captures per 100 trap-nights.
a) What is the least squares line for predicting trap success using
the distance from the road?
b) Sketch the least squares line on the
scatter plot.
d) One of the trap sites is 1459 meters from the road. Calculate the residual for this
trap site.
e) Suppose that the trap sites you analyzed above are representative of the
population of all trap sites. In general, do you think the linear model is a good
model for these data? Justify your answer by appealing to the scatterplot and
other summary statistics.
33
When children are discharged from the hospital their parents may still need to provide
substantial care, such as using a feeding tube. It is difficult for parents to know how
far to insert the tube, especially with rapidly growing infants. Health care
professionals believe it may be possible for parents to measure their child's height
and use a regression equation to calculate the appropriate insertion length. At a major
children's hospital, children and adolescents' heights (H) and esophagus lengths (E)
were measured and a regression analysis performed. The data from this analysis are
summarized below:
a) For a child with a height one standard deviation above the mean height, what
would be the predicted esophagus length?
b) What proportion of the variability in esophagus length is accounted for by height?
c) From the information presented above, does it appear that the esophagus length
can be accurately predicted from the height of young patients? Provide statistical
evidence for your response.
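The arithmetic behind parts (a) and (b) can be sketched from summary statistics alone. The deck's summary table is not reproduced here, so every number below is a hypothetical value chosen for illustration, and the variable names are introduced for this sketch. It uses the standard relationships slope = r * s_E / s_H and intercept = mean_E - slope * mean_H.

```python
# Hypothetical summary statistics (assumption), not the deck's values.
mean_h = 125.0  # mean height, cm
sd_h = 20.0     # standard deviation of height, cm
mean_e = 35.0   # mean esophagus length, cm
sd_e = 4.0      # standard deviation of esophagus length, cm
r = 0.9         # correlation between height and esophagus length

# Least squares line for predicting E from H, built from the summaries.
slope = r * sd_e / sd_h
intercept = mean_e - slope * mean_h

# Predicted esophagus length for a height one standard deviation above the mean.
height = mean_h + sd_h
predicted_e = intercept + slope * height

# Equivalent shortcut: the prediction sits r standard deviations of E above mean_e.
shortcut = mean_e + r * sd_e

# Proportion of variability in E accounted for by the linear relationship with H.
r_squared = r ** 2
print(predicted_e, shortcut, r_squared)
```

The shortcut shows why a height one standard deviation above the mean leads to a predicted length only r standard deviations above the mean esophagus length.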