Exam 17: Data Mining

Bridget has partitioned data into two subsets.The original file contains 300,000 observations.The subset she is currently working with has 60,000 observations.Which subset is she most likely to be using?

Free

(Multiple Choice)

4.8/5

(38)

Question 1

Correct Answer:

Verified

If the regression coefficient estimate from a logistic regression is positive,the probability of the dependent variable taking on a value of 1

Free

(Multiple Choice)

4.9/5

(30)

Question 2

Correct Answer:

Verified

Which methodology is used to group products that customers purchase together?

Free

(Multiple Choice)

4.8/5

(30)

Question 3

Correct Answer:

Verified

Clustering tried to group entities into _____ clusters,based on the value of their variables.

(Multiple Choice)

4.8/5

(32)

Question 4

Cluster analysis tries to group observations into clusters so that observations within a cluster a different and observations in different clusters are similar.

(True/False)

4.9/5

(34)

Question 5

In K-Means clustering,K refers to the

(Multiple Choice)

4.8/5

(37)

Question 6

Lift is the increase in the number of purchasers over the typical number of purchasers.

(True/False)

4.9/5

(36)

Question 7

When using data partitioning,the first subset,usually with about 70% to 80% of the records,is called the training data set.

(True/False)

4.8/5

(33)

Question 8

A data mart is typically smaller than a data warehouse.

(True/False)

4.7/5

(27)

Question 9

Logistic regression and neural networks use complex nonlinear functions to capture the relationship between explanatory variables and categorical dependent variables.

(True/False)

4.7/5

(34)

Question 10

It is very useful to partition a large data set into all of the following subsets,except the _____ data set.

(Multiple Choice)

4.7/5

(29)

Question 11

Unsupervised methods have no

(Multiple Choice)

4.9/5

(37)

Question 12

Classification analysis attempts to find variables that are related to a quantitative variable.

(True/False)

4.9/5

(41)

Question 13

The predicted value from a logistic regression will be

(Multiple Choice)

4.9/5

(36)

Question 14

A neural network methodology attempts to mimic

(Multiple Choice)

4.9/5

(35)

Question 15

Suppose the odds of Team A winning are 5 to 1.Then,the odds ratio is

(Multiple Choice)

4.8/5

(44)

Question 16

Mya is investigating the factors that impact soda consumption.She examines a host of variables that help explain the amount consumed.Which type of data mining methodology is she most likely to use?

(Multiple Choice)

4.8/5

(44)

Question 17

Megan is examining the likelihood of people riding the subway.The dependent variable takes on the value of 1 if the individual rides the subway and 0 otherwise.Therefore,she could use logistic regression to examine this question.

(True/False)

4.9/5

(29)

Question 18

When using data partitioning,the second subset,which usually contains the records that were not included in the training data,is called the prediction data set.

(True/False)

4.8/5

(29)

Question 19

The testing set in data partitioning is the

(Multiple Choice)

4.9/5

(39)

Question 20

Showing 1 - 20 of 30

Bridget has partitioned data into two subsets.The original file contains 300,000 observations.The subset she is currently working with has 60,000 observations.Which subset is she most likely to be using?

If the regression coefficient estimate from a logistic regression is positive,the probability of the dependent variable taking on a value of 1

Which methodology is used to group products that customers purchase together?

Clustering tried to group entities into _____ clusters,based on the value of their variables.

Cluster analysis tries to group observations into clusters so that observations within a cluster a different and observations in different clusters are similar.

In K-Means clustering,K refers to the

Lift is the increase in the number of purchasers over the typical number of purchasers.

When using data partitioning,the first subset,usually with about 70% to 80% of the records,is called the training data set.

A data mart is typically smaller than a data warehouse.

Logistic regression and neural networks use complex nonlinear functions to capture the relationship between explanatory variables and categorical dependent variables.

It is very useful to partition a large data set into all of the following subsets,except the _____ data set.

Unsupervised methods have no

Classification analysis attempts to find variables that are related to a quantitative variable.

The predicted value from a logistic regression will be

A neural network methodology attempts to mimic

Suppose the odds of Team A winning are 5 to 1.Then,the odds ratio is

Mya is investigating the factors that impact soda consumption.She examines a host of variables that help explain the amount consumed.Which type of data mining methodology is she most likely to use?

Megan is examining the likelihood of people riding the subway.The dependent variable takes on the value of 1 if the individual rides the subway and 0 otherwise.Therefore,she could use logistic regression to examine this question.

When using data partitioning,the second subset,which usually contains the records that were not included in the training data,is called the prediction data set.

The testing set in data partitioning is the

Introduction to Business Analytics

Describing the Distribution of a Variable

Finding Relationships Among Variables

Business Intelligence Bifor Data Analysis

Probability and Probability Distributions

Decision Making Under Uncertainty

Sampling and Sampling Distributions

Confidence Interval Estimation

Hypothesis Testing

Regression Analysis: Estimating Relationships

Regression Analysis: Statistical Inference

Time Series Analysis and Forecasting

Introduction to Optimization Modeling

Optimization Models

Introduction to Simulation Modeling

Simulation Models

Analysis of Variance and Experimental Design

Statistical Process Control

Filters