Deck 3: Numerical Methods for Describing Data Distributions
1
The variance is the positive square root of the standard deviation.
False
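The statement has the relationship backwards: the standard deviation is the positive square root of the variance. A minimal sketch using Python's standard `statistics` module; the sample values are hypothetical and not taken from the deck.

```python
import math
import statistics

# Hypothetical sample, made up for illustration.
sample = [4.0, 7.0, 9.0, 10.0, 15.0]

variance = statistics.variance(sample)  # sample variance (divisor n - 1)
std_dev = statistics.stdev(sample)      # sample standard deviation

# The standard deviation is the positive square root of the variance.
assert math.isclose(std_dev, math.sqrt(variance))
print(variance, std_dev)
```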
2
The following nitrous oxide emissions (in thousands of tons) were recorded for the 48 states in the continental United States:
Use the above data to compute the values in the five-number summary.
A)Minimum = 0, lower quartile = 17.5, median = 34, upper quartile = 66, maximum = 155
B)Minimum = 0, lower quartile = 11, median = 33, upper quartile = 64, maximum = 150
C)Minimum = 0, lower quartile = 18.5, median = 34, upper quartile = 63, maximum = 151
D)Minimum = 0, lower quartile = 16.5, median = 31, upper quartile = 60, maximum = 130
E)Minimum = 0, lower quartile = 14, median = 40, upper quartile = 63, maximum = 130


Minimum = 0, lower quartile = 17.5, median = 34, upper quartile = 66, maximum = 155
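The emissions data did not survive on this page, so the sketch below uses hypothetical values. It assumes the common textbook convention that the quartiles are the medians of the lower and upper halves of the ordered data (the overall median is left out of both halves when n is odd); software packages often use other quartile rules and may give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the overall median from both halves when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

# Hypothetical data, not the emissions data from the card.
summary = five_number_summary([0, 5, 11, 17, 18, 34, 40, 63, 66, 155])
print(summary)
```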
3
By definition, an outlier is "extreme" if it is more than 3.0 iqr away from the closest quartile.
True
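The rule can be sketched as a small function. The 3.0 iqr threshold comes from the card; the 1.5 iqr threshold for a "mild" outlier is the usual companion convention and is stated here as an assumption. The quartiles in the usage lines are hypothetical.

```python
def classify_outlier(x, lower_quartile, upper_quartile):
    """Classify x by its distance from the closest quartile, in iqr units."""
    iqr = upper_quartile - lower_quartile
    if x < lower_quartile:
        distance = lower_quartile - x
    elif x > upper_quartile:
        distance = x - upper_quartile
    else:
        distance = 0.0  # inside the box, so never an outlier
    if distance > 3.0 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:   # assumed convention for mild outliers
        return "mild"
    return "not an outlier"

# Hypothetical quartiles 20 and 30, so iqr = 10.
for value in (40, 50, 61):
    print(value, classify_outlier(value, 20, 30))
```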
4
The accompanying data are consistent with summary statistics in a paper investigating the effect of the shape of drinking glasses. The data are the actual amount (in ml) poured into a tall, slender glass by individuals asked to pour 1.5 ounces (44.3 ml) into the glass. Compute the values of the mean and standard deviation.
A)46.30 ml; 28.24 ml
B)44.24 ml; 17.18 ml
C)59.23 ml; 16.71 ml
D)42.90 ml; 15.23 ml
E)66.55 ml; 20.23 ml

5
Exhibit 3-2
In 1990 the New York Times reported the average number of square feet per standing passenger in 1987 and 1988 for 19 subway stops. Although the sampling method was not reported, we will presume that these data represent a random sample of days during the morning and evening rush hours. The NYC Transit Authority managers attempted to improve the space problem on subway cars (more space is better--trust us!) by adding cars to trains during the rush hours. They gathered the 1988 data to check on their efforts after one year. The data are in the table below.
Refer to Exhibit 3-2.
(a)Construct a comparative stem & leaf plot of the space per standing passenger for the morning rushes of 1987 vs. the morning rushes of 1988.
(b)Using your plot in part (a), describe the differences and similarities in the distributions of the morning standing room for the two years.
6
The mean is the middle value of an ordered data set.
7
The questions below refer to decisions made by statisticians and referred to in the text. Answer each of the following questions in a few sentences:
(a)Why was [expression missing] rather than [expression missing] chosen to be used in building the formula for the variance?
(b)Why is s more commonly used than s² as a measure of variability?
(c)Why was n − 1 chosen over n when using s² to estimate σ²?
(d)What factor should be considered when choosing between the median and the mean as a description of the "center" of a data set?
8
Costs per serving (in cents) for 16 high-fiber cereals rated very good or good by Consumer Reports are shown below:
Compute the mean and standard deviation for the above data set.
A)50.4 cents; 11.1 cents
B)40.4 cents; 15.1 cents
C)19.4 cents; 12.1 cents
D)30.4 cents; 10.1 cents
E)48.2 cents; 17.2 cents

9
Exhibit 3-2
In 1990 the New York Times reported the average number of square feet per standing passenger in 1987 and 1988 for 19 subway stops. Although the sampling method was not reported, we will presume that these data represent a random sample of days during the morning and evening rush hours. The NYC Transit Authority managers attempted to improve the space problem on subway cars (more space is better--trust us!) by adding cars to trains during the rush hours. They gathered the 1988 data to check on their efforts after one year. The data are in the table below.
Refer to Exhibit 3-2.
The MTA guidelines in 1987 specified a minimum of 3 square feet per standing passenger. The engineers would like to report standardized measures (z-scores) of this target value. That is, for each year and time of day, they will report how far the target value of 3 square feet is from the mean, relative to the spread of each distribution.
(a)Consider the original passenger space data for the morning rush in 1987, used in Exhibit 3-2. What are the mean and standard deviation for the sample?
x̄ =
s =
(b)How many standard deviations above/below the mean is the target value of 3 square feet for the distribution in part (a)?
10
The dot plot below shows the annual salaries, in thousands of dollars, of a sample of non-management staff who work for a small banking company.
Which of the following statistics would be the most reasonable estimate of the typical annual salary?
A)mean
B)mode
C)median
D)interquartile range
E)range

11
The interquartile range is resistant to the effect of outliers.
12
A common topic of study among biologists who study social insects is colony integration, i.e. the ways members of a society interact with and influence each other. Previous research suggests that the queen of a wasp nest seems to encourage worker activity. That research is anecdotal, and quantitative information is lacking. The following data were gathered from wasp nests in California and Iowa. The nests were studied very early in their development, so each wasp could be individually identified using standard marking procedures, and their activity levels monitored reliably.
The table below contains the proportions of time spent on the nest for 12 randomly selected colonies in Iowa. (The colony numbers are for identification in the table only.) There is only one queen per nest, so the proportion is for her only. The workers' proportion is the average proportion for all the worker wasps in the nest.
(a)Construct a comparative (back to back) stem & leaf plot of the proportions of time spent on the nest by Queens, and by Workers for these colonies.
(b)Using your plot in part (a), describe the differences and similarities in the distributions of the proportion of time spent on the nest by workers and queens.
13
The table below summarizes the number of minutes spent exercising each day for a sample of 30 female college students.
Construct a graphical display of the data distribution and then indicate what summary measures you would use to describe center and spread.
A)center = mean; spread = standard deviation
B)center = mean; spread = interquartile range
C)center = median; spread = standard deviation
D)center = median; spread = interquartile range
E)center = mode; spread = range

14
Suppose that the distribution of weekly water usage for single-family homes in a particular city is mound shaped and approximately symmetric. The mean is 1,300 gallons, and the standard deviation is 200 gallons.
What is the approximate value of the 98th percentile?
A)1,700 gallons
B)1,100 gallons
C)1,500 gallons
D)1,900 gallons
E)1,300 gallons
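For a mound-shaped, approximately symmetric distribution, the Empirical Rule places about 95% of values within 2 standard deviations of the mean, which leaves about 2.5% in each tail. The value 2 standard deviations above the mean is therefore near the 97.5th percentile, a reasonable approximation to the 98th. A short sketch of that arithmetic, using the numbers from the card:

```python
mean = 1300.0     # gallons, from the card
std_dev = 200.0   # gallons, from the card

# Empirical Rule: about 95% of values fall within 2 standard deviations,
# so about 2.5% lie above mean + 2 * std_dev.
upper_tail = (1.0 - 0.95) / 2.0
percentile = 100.0 * (1.0 - upper_tail)
value = mean + 2.0 * std_dev

print(f"about the {percentile:.1f}th percentile: {value:.0f} gallons")
```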
15
The mean number of text messages sent per month by customers of a cell phone service provider is 1,850, and the standard deviation is 800. Find the z-score associated with 3,000 messages sent.
A)z = 2.75
B)z = 1.44
C)z = -1.44
D)z = 1.19
E)z = 40.66
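A z-score measures how many standard deviations a value lies from the mean: z = (value − mean) / standard deviation. A minimal sketch with the card's numbers; the function name `z_score` is only illustrative.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Values from the card: mean 1,850 messages, standard deviation 800.
z = z_score(3000, 1850, 800)
print(round(z, 2))
```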
16
If there are no outliers, a skeletal and modified boxplot can differ in the length of the box, but not in the whisker lengths.
17
Which of the five boxplots shown in the figure corresponds to a data set with the 5-number summary below?
Minimum: 60
Lower Quartile: 69
Median: 76
Upper Quartile: 82
Maximum: 89
A)Boxplot 1
B)Boxplot 2
C)Boxplot 3
D)Boxplot 4
E)Boxplot 5
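Before matching a boxplot to this summary, it can help to check whether any value could be flagged as an outlier. The sketch below assumes the usual 1.5 iqr fences; if the minimum and maximum both fall inside the fences, the whiskers of the matching boxplot run all the way to 60 and 89.

```python
# Five-number summary from the card.
minimum, q1, median, q3, maximum = 60, 69, 76, 82, 89

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # assumed 1.5 * iqr outlier convention
upper_fence = q3 + 1.5 * iqr

# An outlier exists only if the extremes fall outside the fences.
has_outliers = minimum < lower_fence or maximum > upper_fence
print(iqr, lower_fence, upper_fence, has_outliers)
```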
18
For any given data set, the median must be greater than or equal to the lower quartile, and less than or equal to the upper quartile.
19
Data on tipping percent for 20 restaurant tables are:
Compute the value of the median for the above data set.
A)The median tipping percentage is 20%.
B)The median tipping percentage is 22%.
C)The median tipping percentage is 22.5%.
D)The median tipping percentage is 23.5%.
E)The median tipping percentage is 28.2%.

20
Data on weekday exercise time (in minutes) for 20 males are shown below. Compute the interquartile range.
Male-Weekday
A)56.6
B)58.6
C)64.75
D)62.25
E)67.5
21
A wide variety of oak trees grow in the United States. In one study a sample of acorns was collected from different locations, and their volumes, in cm³, were recorded. In the table below are summary statistics for these data.
(a)Describe a procedure that uses some or all of these summary statistics to determine whether outliers are present in the data.
(b)Using your procedure from part (a), determine if there are outliers in these data.

22
Spotrac publishes professional football players' salaries. The table below contains information about base salaries for the fifteen players with the highest salaries on the Buffalo Bills team in 2016 (http://www.spotrac.com/nfl/rankings/2016/base/buffalo-bills/).
Choose the most appropriate description for the center of this data.
A)On average, the base salary is 3.5 million dollars, because the mean of the data set is 3.5. This means that the base salary for each player on the Buffalo Bills is 3.5 million dollars.
B)On average, the base salary is 2.6 million dollars, because the median of the data set is 2.6. This means that 2.6 million dollars is the typical value of base salary for players on the Buffalo Bills.
C)On average, the base salary is 3.5 million dollars, because the mean of the data set is 3.5. This means that 2.6 million dollars is the typical value of base salary for players on the Buffalo Bills.
D)On average, the base salary is 2.6 million dollars, because the median of the data set is 2.6. This means that the base salary for each player is 2.6 million dollars.

23
Which summary measures of a numerical data distribution are commonly used to describe its variability?
A)mode and interquartile range
B)mean and standard deviation
C)median and interquartile range
D)mean and median
E)interquartile range and standard deviation
24
Astronomers are interested in the recessional velocity of galaxies--that is, the speed at which they are moving away from the Milky Way. The accompanying table contains the recessional velocities for a sample of galaxies, measured in km/sec. Negative velocity indicates the galaxy is moving towards us.
(a)Calculate these numerical summaries:
The mean _______________
The standard deviation _______________
The median _______________
The interquartile range _______________
(b)Construct a skeletal box plot for these data.
(c)Judging from the data and your responses in parts (a) and (b), would you say this distribution is skewed or approximately symmetric? Justify your response using appropriate statistical terminology.

25
The Territory of Iowa was initially surveyed in the 1830s. The surveyors were very careful to note the trees and vegetation; it was believed at that time that the richness of the soil could be measured by the density of trees encountered. The sample of ash tree diameters from the original survey of what is now Linn County, Iowa, is presented in the stem and leaf plot below. The display uses five lines for each stem. Thus, "1t|" is the stem for diameters of 12 and 13, "1f|" for 14 and 15, "1s|" for 16 and 17, and so on. (The "t" then stands for leaves that are twos and threes, the "f" for leaves of fours and fives, etc.) The mean diameter of ash trees in this sample is 11.500 inches, and the standard deviation is 3.842 inches.
(a)What is the approximate diameter of an ash tree at the 20th percentile in this distribution?
(b)The Empirical Rule would suggest that 68% of ash tree diameters are between what two values?
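For part (b), the Empirical Rule says that roughly 68% of values in a mound-shaped, approximately symmetric distribution fall within 1 standard deviation of the mean. A minimal sketch of that interval using the summary statistics given in the card; the helper name `empirical_interval` is hypothetical.

```python
def empirical_interval(mean, std_dev, k=1):
    """Return (mean - k*std_dev, mean + k*std_dev).

    Under the Empirical Rule, k = 1, 2, 3 correspond to roughly
    68%, 95%, and 99.7% of the values.
    """
    return mean - k * std_dev, mean + k * std_dev

# Summary statistics from the card (inches).
low, high = empirical_interval(11.500, 3.842)
print(f"about 68% of diameters between {low:.3f} and {high:.3f} inches")
```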

26
The % by weight of the element lead was calculated for the coins minted in Rome. Some of the summary statistics are given in the table below.
(a)Describe a procedure that uses some or all of these summary statistics to determine whether outliers are present in the data.
(b)Using your procedure from part (a), determine if there are outliers in these data.

27
The U.S. Bureau of Labor Statistics (https://www.bls.gov/web/laus/laumstrk.htm) published the average unemployment rate in November 2017 for each state (see table below). Find the center of this data set and interpret it.
Choose the most appropriate description for the center of this data.
A)On average, the unemployment rate is 4.1% because the mean of the data set is 4.1. This means that the unemployment rate for each state is 4.1%.
B)On average, the unemployment rate is 4.0% because the median of the data set is 4.0. This means that 4.0% is the typical value of unemployment rate for the states.
C)On average, the unemployment rate is 4.1% because the mean of the data set is 4.1. This means that 4.1% is the typical value of unemployment rate for the states.
D)On average, the unemployment rate is 4.0% because the median of the data set is 4.0. This means that the unemployment rate for each state is 4.0%.

28
Exhibit 3-1
In recent years there has been considerable discussion about the appropriateness of the body shapes and proportions of Ken and Barbie dolls. These dolls are very popular, and there is some concern that the dolls may be viewed as having the "ideal body shape," potentially leading young children to risk anorexia in pursuit of that ideal. Researchers investigating the dolls' body shapes scaled Ken and Barbie up to a common height of 170.18 cm (5' 7") and compared them to body measurements of active adults. Common measures of body shape are the chest (bust), waist, and hip circumferences. These measurements for Ken and Barbie and their reference groups are presented in the table below:
For the following questions, suppose that the researchers' scaled up dolls suddenly found themselves in the human world of actual men and women.
Refer to Exhibit 3-1.
(a)Convert Barbie's chest, waist, and hips measurements to z-scores. Which of those measures appears to be the most different from Barbie's reference group? Justify your response with an appropriate statistical argument.
(b)The z-scores for Ken's chest, waist, and hips when compared to active male adults are approximately −3.4, −2.5, and −3.2 respectively. Do these z-scores provide evidence to justify the claim that the Ken doll is a thin representation of adult men? Justify your response.
(c)If women's waist measurements are approximately normally distributed, based on the sample above what is the approximate percentile of an 80 cm waist?
29
The Territory of Iowa was initially surveyed in the 1830s. The surveyors were very careful to note the trees and vegetation; it was believed at that time that the richness of the soil could be measured by the density of trees encountered. The sample of hickory tree diameters from the original survey of what is now Linn County, Iowa, is presented in the stem and leaf plot below. The display uses five lines for each stem. Thus, "1t|" is the stem for diameters of 12 and 13, "1f|" for 14 and 15, "1s|" for 16 and 17, and so on. (The "t" then stands for leaves that are twos and threes, the "f" for leaves of fours and fives, etc.) The mean diameter of hickory trees in this sample is 11.849 inches, and the standard deviation is 3.995 inches.
(a)What is the approximate diameter of a hickory tree at the 20th percentile in this distribution?
(b)The Empirical Rule would suggest that 68% of hickory tree diameters are between what two values?

30
The data in the table below are from observations taken on Roman coins dating from the first three centuries AD. Historians believe that different mints might reveal themselves in different trace element profiles in the coins; these coins are known to have been minted in Rome. The technique of Atomic Absorption Spectrometry was used to estimate the % by weight of various elements in these coins; the % by weight that is gold is presented here.
(a)Calculate these numerical summaries:
The mean _______________
The standard deviation _______________
The median _______________
The interquartile range _______________
(b)Construct a skeletal box plot for these data.
(c)Judging from the data and your responses in parts (a) and (b), would you say this distribution is skewed or approximately symmetric? Justify your response using appropriate statistical terminology.

31
Exhibit 3-1
In recent years there has been considerable discussion about the appropriateness of the body shapes and proportions of Ken and Barbie dolls. These dolls are very popular, and there is some concern that the dolls may be viewed as having the "ideal body shape," potentially leading young children to risk anorexia in pursuit of that ideal. Researchers investigating the dolls' body shapes scaled Ken and Barbie up to a common height of 170.18 cm (5' 7") and compared them to body measurements of active adults. Common measures of body shape are the chest (bust), waist, and hip circumferences. These measurements for Ken and Barbie and their reference groups are presented in the table below:
For the following questions, suppose that the researchers' scaled up dolls suddenly found themselves in the human world of actual men and women.
Refer to Exhibit 3-1.
(a)Convert Ken's chest, waist, and hips measurements to z-scores. Which of those measures appears to be the most different from Ken's reference group? Justify your response with an appropriate statistical argument.
(b)The z-scores for Barbie's Chest, Waist, and Hips when compared to active female adults are approximately −1.4, −6.2, and −4.7 respectively. Do these z-scores provide evidence to justify the claim that the Barbie doll is a thin representation of adult women? Justify your response with an appropriate statistical argument.
(c)If men's waist measurements are approximately normally distributed, based on the sample above what is the approximate percentile of a 100 cm waist?
32
The data in the table below are the changes in the amount of space available to standing passengers at the 19 stops between 1987 and 1988.
In the table below, summary information is presented for these data.
(a)Using the raw data and summary information presented in the tables above, construct box plots to compare the changes in available space in the morning and afternoon. (Reminder: Don't forget to check for outliers!)
(b)The Transit System wishes to know if their efforts to improve the standing space were successful. (Remember, more space is better!) Their engineers had suggested that the changes in the system would, on average, be more successful at increasing the available space in the morning than in the afternoon. Do the data support this initial belief? What specific aspects of the plot in part (a) support your answer?
(c)Using your box plots in part (a), write a short paragraph for the New York Times describing the success the Transit System had in increasing the available space per passenger. Note any differences in success between the morning rush and the afternoon rush.



33
To compare the average weekly earnings of full-time male and female workers, twenty men and twenty women were selected. The following table contains the average weekly wages of each person in the current month.
Use a comparative boxplot to compare the weekly wages of men and women.
A)From the comparative boxplot, you can see that both data distributions are approximately symmetric, but wages of male workers tend to be higher than wages of female workers. We can conclude that wages of male workers are greater than wages of female workers.
B)From the comparative boxplot, you can see that both data distributions are approximately symmetric, but wages of female workers tend to be higher than wages of male workers. We can conclude that wages of female workers are greater than wages of male workers.
C)From the comparative boxplot, you can see that both data distributions are skewed to the left, but the mean of the data for men lies higher than the mean of the data for women. So, we can conclude that wages of male workers are greater than wages of female workers.
D)From the comparative boxplot, you can see that both data distributions are skewed to the left, but the mean of the data for women lies higher than the mean of the data for men. So, we can conclude that wages of male workers are greater than wages of female workers.
E)It is not a representative sample, and we cannot make any conclusions about the difference between the wages of male and female workers.


34
In order to attribute changes in nest activity to the active influence of the queen, 9 nests were randomly selected for experimental treatment. The normal activity of the nests was measured, and treatments were administered as described below:
Queens and Workers: Experimental Treatments
Normal treatment:
The observations were made of the undisturbed state of the nest as originally located.
Queen removed:
The queen was removed from the nest for about an hour.
Queen cooled/inactive:
The queen was removed to a cool environment, making her inactive, and returned to the nest; thus, she is present in the nest, but not able to interact with the worker wasps.
The data for the control treatment and each of the experimental treatments is given in the table below. The Activity Levels are the average proportion of active nest time for workers who were present in the normal and two experimental treatment periods. For example, 0.50 would mean for that nest the average amount of time the workers were actively working was 0.50 of the available time during that treatment. (The colony numbers are for identification in the table only.)
(a)Construct comparative box plots of the active nest times for (a) normal, (b) Queen removed, and (c) Queen Cooled/Inactive. (Note: since the data are proportions, there are no actual units for the data.)
(b)Two current theories about the queen's interaction with workers are that (A) the queen increases worker activity by her mere presence, or (B) she increases worker activity by interacting with worker wasps. Based on your plot in part (a), which theory--if either--is supported by the data? Justify your conclusion with an appropriate statistical argument.
(c)What are the mean and standard deviation of the proportion of worker activity for undisturbed wasp nests?
x̄ =
s =
(d)One of the nests (Colony #4) has a proportion of worker activity of 0.46. How many standard deviations above/below the mean is the worker activity level in this nest?
35
Consider a study in which the heights of a sample of 1000 high school seniors were recorded. The mean height is 70" and the standard deviation of the heights is 3". It is observed that the height distribution is approximately normal.
(a)Approximately what percent of heights in this sample would exceed 79"?
(b)What is the approximate percentile of a senior who is 73" tall?
(c)When the data were summarized the value of the first quartile was written down but then smudged. There is general agreement that the writer meant to indicate either 66" or 68". Which of these values is most likely the correct one? Justify your answer with appropriate statistical reasoning.
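The three parts above all rest on the stated normal approximation with mean 70" and standard deviation 3". As a rough check of that kind of reasoning, here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative only, and the results describe the idealized normal model, not the actual sample of 1000 heights.

```python
from statistics import NormalDist

# Idealized model of the heights: mean 70", standard deviation 3"
heights = NormalDist(mu=70, sigma=3)

# (a) Percent of the model above 79" (three standard deviations above the mean)
pct_above_79 = (1 - heights.cdf(79)) * 100

# (b) Percentile of a 73" height (one standard deviation above the mean)
percentile_73 = heights.cdf(73) * 100

# (c) First quartile under the model, to compare with the candidates 66" and 68"
q1 = heights.inv_cdf(0.25)

print(f"above 79 in.: {pct_above_79:.2f}%")
print(f"percentile of 73 in.: {percentile_73:.1f}")
print(f"model first quartile: {q1:.1f} in.")
```

Under this model the first quartile lies about two thirds of a standard deviation below the mean, which is the comparison part (c) asks for.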
36
Which statistical measures of a numerical data distribution are commonly used to describe the center of the distribution?
A)mean and mode
B)mean and standard deviation
C)mode and median
D)mean and median
E)interquartile range and standard deviation
37
Grey Kangaroos are large, social marsupials, indigenous to Australia. As part of a study of the development of these creatures, biologists have measured various aspects of their skeletal structure. A sample of 148 palate widths was taken from skeletal remains and the data is presented in the stem-and-leaf plot below. The distributions of skeletal measures are generally thought to be approximately normal, which is consistent with the stem-and-leaf plot for this sample. The display uses five lines for each stem. Thus, "2t|" is the stem for palate widths of 22 and 23, "2f|" for 24 and 25, "2s|" for 26 and 27, and so on. (The "t" then stands for leaves that are twos and threes, the "f" for leaves of fours and fives, etc.) The mean palate width of this sample is 2.6 cm, and the standard deviation is 0.3 cm.
(a)If this sample is a good representation of the population of Grey Kangaroos, approximately what percent of palate widths in this sample would exceed 2.9 cm?
(b)What is the approximate percentile of a palate width that is 2.0 cm?
(c)One measure of variability is the length of an interval that contains the middle 90% of the values of a distribution. For these data, how long is that interval?
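The same normal-model reasoning applies here with the stated mean of 2.6 cm and standard deviation of 0.3 cm. The sketch below, again using the standard-library `statistics.NormalDist` with illustrative variable names, assumes the palate widths follow that normal model exactly; it does not use the stem-and-leaf data themselves.

```python
from statistics import NormalDist

# Idealized model of palate widths: mean 2.6 cm, standard deviation 0.3 cm
widths = NormalDist(mu=2.6, sigma=0.3)

# (a) Percent of the model above 2.9 cm (one standard deviation above the mean)
pct_above_2_9 = (1 - widths.cdf(2.9)) * 100

# (b) Percentile of a 2.0 cm width (two standard deviations below the mean)
percentile_2_0 = widths.cdf(2.0) * 100

# (c) Length of the interval holding the middle 90% of the model:
#     from the 5th percentile to the 95th percentile
lower, upper = widths.inv_cdf(0.05), widths.inv_cdf(0.95)
middle_90_length = upper - lower

print(f"above 2.9 cm: {pct_above_2_9:.1f}%")
print(f"percentile of 2.0 cm: {percentile_2_0:.1f}")
print(f"middle 90% interval: {lower:.2f} to {upper:.2f} cm, length {middle_90_length:.2f} cm")
```

The empirical rule gives nearly the same answers for parts (a) and (b) (about 16% and about the 2.5th percentile); part (c) needs a normal table or a function like `inv_cdf`, because 90% is not one of the rule's benchmark percentages.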
