Deck 3: Finding Relationships Among Variables

Question
Correlation can be affected by the measurement scales applied to X and Y variables.
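One way to check a statement like this is to rescale one variable and recompute both measures. The sketch below is only an illustration: the height and weight values are made up, and the helper names `covariance` and `correlation` are not from the deck.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance of paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired data: heights in metres, weights in kilograms.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weights_kg = [55.0, 68.0, 70.0, 80.0, 88.0]

# The same heights re-expressed in centimetres.
heights_cm = [h * 100 for h in heights_m]

r_m = correlation(heights_m, weights_kg)
r_cm = correlation(heights_cm, weights_kg)
cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)

print("correlation:", r_m, r_cm)    # same value up to rounding
print("covariance: ", cov_m, cov_cm)  # differs by a factor of 100
```

In this sketch the change of units multiplies the covariance by 100 but leaves the correlation unchanged, which is the sense in which correlation is unitless.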
Question
Correlation is a single-number summary of a scatterplot.
Question
Statisticians often refer to the pivot tables that display counts as contingency tables or crosstabs.
Question
To form a scatterplot of X versus Y, X and Y must be paired variables.
Question
A trend line on a scatterplot is a line or a curve that "fits" the scatter as well as possible.
Question
It is possible that the data points are close to a curve and have a correlation close to 0, because correlation is relevant only for measuring linear relationships.
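A small numerical case can make this concrete. The sketch below uses made-up points that lie exactly on the parabola y = x^2, with x values symmetric around 0; the data and the helper name `correlation` are assumptions for illustration only.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation of paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: every point sits exactly on the curve y = x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = correlation(xs, ys)
print("correlation for points on a parabola:", r_curve)
```

Here Y is completely determined by X, yet the positive and negative halves of the curve cancel in the covariance sum, so the linear measure reports essentially no relationship.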
Question
We must specify appropriate bins for side-by-side histograms in order to make fair comparisons of distributions by category.
Question
Relationships between two variables are less evident when counts are expressed as percentages of row totals or column totals.
Question
Problems in data analysis where we want to compare a numerical variable across two or more subpopulations are called comparison problems.
Question
We do not even try to interpret correlations numerically except possibly to check whether they are positive or negative.
Question
Strongly related variables have a relationship close to zero if the relationship is nonlinear.
Question
Side-by-side box plots allow you to quickly see how two or more categories of a numerical variable compare.
Question
If the standard deviation of X is 15, the covariance of X and Y is 94.5, and the coefficient of correlation r = 0.90, then the variance of Y is 7.0.
Question
The cutoff for defining a large correlation is
Question
An example of a joint category of two variables is the count of all non-drinkers who are also nonsmokers.
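As a rough sketch of what a joint-category count looks like in practice, the snippet below tallies pairs of categories from a handful of invented survey records. The records, labels, and variable names are hypothetical and are not taken from the deck.

```python
from collections import Counter

# Hypothetical survey records: (drinking status, smoking status).
records = [
    ("non-drinker", "nonsmoker"),
    ("non-drinker", "nonsmoker"),
    ("non-drinker", "nonsmoker"),
    ("non-drinker", "smoker"),
    ("drinker", "nonsmoker"),
    ("drinker", "nonsmoker"),
    ("drinker", "smoker"),
    ("drinker", "smoker"),
]

# Each key is one joint category; each value is one cell of the crosstab.
counts = Counter(records)

# Count for the joint category "non-drinker and nonsmoker".
joint = counts[("non-drinker", "nonsmoker")]

# The same cell expressed as a percentage of its row total (all non-drinkers).
row_total = sum(c for (drinking, _), c in counts.items() if drinking == "non-drinker")
pct_of_row = 100 * joint / row_total

print(joint, row_total, pct_of_row)
```

The `Counter` plays the role of a contingency table here: one count per combination of the two categorical variables, which can then be restated as a share of a row or column total.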
Question
The advantage that the coefficient of correlation has over the covariance is that the former has a set lower and upper limit.
Question
The Filters field of a pivot table contains the data you want to summarize.
Question
If the standard deviations of X and Y are 15.5 and 10.8, respectively, and the covariance of X and Y is 128.8, then the coefficient of correlation r is approximately 0.77.
Question
The correlation between two variables is unitless and is always between -1 and +1.
Question
If the coefficient of correlation r = 0.80 and the standard deviations of X and Y are 20 and 25, respectively, then Cov(X, Y) must be 400.
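The numerical cards in this deck all rest on one identity, r = Cov(X, Y) / (sd_X * sd_Y), rearranged for whichever quantity is unknown. The sketch below plugs the figures quoted in those cards into that identity; the function and variable names are illustrative assumptions, not part of the deck.

```python
def correlation_from_cov(cov_xy, sd_x, sd_y):
    """r = Cov(X, Y) / (sd_X * sd_Y)."""
    return cov_xy / (sd_x * sd_y)

def cov_from_correlation(r, sd_x, sd_y):
    """Cov(X, Y) = r * sd_X * sd_Y."""
    return r * sd_x * sd_y

def sd_y_from_cov(cov_xy, r, sd_x):
    """sd_Y = Cov(X, Y) / (r * sd_X)."""
    return cov_xy / (r * sd_x)

# Standard deviations 15.5 and 10.8, covariance 128.8.
r_check = correlation_from_cov(128.8, 15.5, 10.8)

# r = 0.80, standard deviations 20 and 25.
cov_check = cov_from_correlation(0.80, 20, 25)

# sd_X = 15, covariance 94.5, r = 0.90: solve for sd_Y, then square it.
sd_y = sd_y_from_cov(94.5, 0.90, 15)
var_y = sd_y ** 2

print(round(r_check, 2), round(cov_check, 1), round(sd_y, 1), round(var_y, 1))
```

Note that the third case yields the standard deviation of Y first; the variance is its square, so the two should not be confused when checking a card that quotes a variance.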
Question
Correlation and covariance can be used to examine relationships between numerical variables as well as for categorical variables that have been coded numerically.
Question
We are usually on the lookout for large correlations near

A) +1
B) -1
C) Either of these options
D) Neither of these options
Question
The tool that provides useful information about a data set by breaking it down into categories is the

A) histogram
B) scatterplot
C) pivot table
D) spreadsheet
Question
To examine relationships between two categorical variables, we can use

A) counts and corresponding charts of the counts
B) scatterplots
C) histograms
D) none of these options
Question
Displaying all correlations between 0.6 and 0.999 on a scatterplot as green and all correlations between -1.0 and -0.6 as red is known as

A) rank-order formatting
B) categorical formatting
C) coded formatting
D) numerical formatting
E) conditional formatting
Question
The limitation of covariance as a descriptive measure of association is that it

A) only captures positive relationships
B) does not capture the units of the variables
C) is very sensitive to the units of the variables
D) is invalid if one of the variables is categorical
E) none of these options
Question
The scatterplot is a graphical technique used to make apparent the relationship between two numerical variables.
Question
Scatterplots are also referred to as

A) crosstabs
B) contingency charts
C) X-Y charts
D) all of these options
E) none of these options
Question
Correlation is useful only for

A) assessing the weakness of a linear relationship
B) conveying the same information in a simpler format than a scatterplot
C) measuring the strength of a linear relationship
D) automatically calculating covariances
E) measuring the strength of a nonlinear relationship
Question
A line or curve superimposed on a scatterplot to quantify an apparent relationship is known as a(n)

A) average
B) trend line
C) data point
D) positive variable
E) slope
Question
We study relationships among numerical variables using

A) correlation
B) covariance
C) scatterplot charts
D) all of these options
E) none of these options
Question
The Excel function that allows you to count using more than one criterion is

A) COUNTIF
B) COUNTIFS
C) SUMPRODUCT
D) VLOOKUP
E) HLOOKUP
Question
A useful way of comparing the distribution of a numerical variable across categories of some categorical variable is

A) side-by-side box plot
B) side-by-side pivot table
C) both of these options
D) neither of these options
Question
The most common data format is

A) long
B) short
C) stacked
D) unstacked
Question
Correlation and covariance measure

A) the strength of a linear relationship between two numerical variables
B) the direction of a linear relationship between two numerical variables
C) the strength and direction of a linear relationship between two numerical variables
D) the strength and direction of a linear relationship between two categorical variables
E) none of these options
Question
Tables used to display counts of a categorical variable are called

A) crosstabs
B) contingency tables
C) both of these options
D) neither of these options
Question
We can infer that there is a strong relationship between two numerical variables when

A) the points on a scatterplot cluster tightly around an upward sloping straight line
B) the points on a scatterplot cluster tightly around a downward sloping straight line
C) either of these options
D) neither of these options
Question
If the correlation of variables is close to 0, then we expect to see

A) an upward sloping cluster of points on the scatterplot
B) a downward sloping cluster of points on the scatterplot
C) a cluster of points around a trendline on the scatterplot
D) a cluster of points with no apparent relationship on the scatterplot
E) no explanation of what the scatterplot should look like based on the correlation
Question
Examples of comparison problems include

A) salary broken down by male and female subpopulations
B) cost of living broken down by region of a country
C) recovery rate for a disease broken down by patients who have taken a drug and patients who have taken a placebo
D) starting salary of recent graduates broken down by academic major
E) all of these options
Question
Counts for categorical variables are often expressed as percentages of the total.
Question
The four areas of a pivot table are

A) Crosstabs, Fields, Rows, Columns
B) Data, Count, Contingency, Percentage
C) Filters, Rows, Columns, Values
D) Sort, Rows, Columns, Count
Question
Changing the location of fields in a pivot table is known as

A) slicing
B) dicing
C) sorting
D) pivoting
Question
Which of the following are considered numerical summary measures?

A) mean and variance
B) variance and correlation
C) correlation and covariance
D) covariance and variance
E) first quartile and third quartile
Question
A scatterplot allows one to see

A) whether there is any relationship between two variables
B) what type of relationship there is between two variables
C) Both options are correct.
D) Neither option is correct.
An economic development researcher wants to understand the relationship between the average monthly expenditure on utilities for households in a particular middle-class neighborhood and each of the following household variables: family size, approximate location of the household within the neighborhood, an indication of whether those surveyed owned or rented their home, gross annual income of the first household wage earner, gross annual income of the second household wage earner (if applicable), size of the monthly home mortgage or rent payment, and the total indebtedness (excluding the value of a home mortgage) of the household.
The correlations for each pairing of variables are shown in the table below:
Table of correlations
\begin{array}{l|rrrrrrrr}
\hline
 & \text{Family Size} & \text{Location} & \text{Ownership} & \text{First Income} & \text{Second Income} & \text{Monthly Payment} & \text{Utilities} & \text{Debt} \\
\hline
\text{Family Size} & 1.000 & & & & & & & \\
\text{Location} & -0.010 & 1.000 & & & & & & \\
\text{Ownership} & -0.025 & -0.386 & 1.000 & & & & & \\
\text{First Income} & -0.063 & -0.537 & 0.445 & 1.000 & & & & \\
\text{Second Income} & -0.058 & -0.508 & 0.424 & 0.884 & 1.000 & & & \\
\text{Monthly Payment} & -0.076 & -0.511 & 0.552 & 0.514 & 0.478 & 1.000 & & \\
\text{Utilities} & 0.256 & -0.346 & 0.935 & 0.388 & 0.366 & 0.489 & 1.000 & \\
\text{Debt} & 0.294 & -0.461 & 0.744 & 0.560 & 0.523 & 0.605 & 0.778 & 1.000 \\
\hline
\end{array}
Question
The tables of counts that result from pivot tables are often called

A) samples
B) sub-tables
C) specimens
D) crosstabs
Question
One characteristic of "paired variables" is

A) one is a negative value and the other is a positive value
B) both are positive values
C) they have the same number of observations
D) they have a variable number of observations
Deck 3: Finding Relationships Among Variables
1
Correlation can be affected by the measurement scales applied to X and Y variables.
False
2
Correlation is a single-number summary of a scatterplot.
True
3
Statisticians often refer to the pivot tables that display counts as contingency tables or crosstabs.
True