Exam 17: Data Mining


Bridget has partitioned data into two subsets. The original file contains 300,000 observations. The subset she is currently working with has 60,000 observations. Which subset is she most likely to be using?

(Multiple Choice)
Correct Answer:

C

If the regression coefficient estimate from a logistic regression is positive, the probability of the dependent variable taking on a value of 1

(Multiple Choice)
Correct Answer:

C

Which methodology is used to group products that customers purchase together?

(Multiple Choice)
Correct Answer:

A

Clustering tries to group entities into _____ clusters, based on the value of their variables.

(Multiple Choice)

Cluster analysis tries to group observations into clusters so that observations within a cluster are different and observations in different clusters are similar.

(True/False)

In K-Means clustering, K refers to the

(Multiple Choice)
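
As a study aid for the clustering questions above, here is a minimal sketch of k-means on one-dimensional data. It is an illustration under stated assumptions, not the textbook's procedure: the function name `k_means_1d`, the crude starting centers, and the sample values are all hypothetical. In it, `k` is the number of clusters requested, and each value is assigned to its nearest center so that members of a cluster are close to one another.

```python
def k_means_1d(values, k, iterations=10):
    """Illustrative k-means on plain numbers; k is the number of clusters requested."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    # Assumption: start from k values spread across the sorted data.
    step = len(values) // k
    centers = [float(v) for v in sorted(values)[::step][:k]]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: each value joins the cluster with the nearest center.
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster (unchanged if empty).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical sample data with two obvious groups.
centers, clusters = k_means_1d([1, 2, 3, 10, 11, 12], k=2)
print(centers)
print(clusters)
```

With these sample values the two centers settle at the means of the low group and the high group.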

Lift is the increase in the number of purchasers over the typical number of purchasers.

(True/False)

When using data partitioning, the first subset, usually with about 70% to 80% of the records, is called the training data set.

(True/False)
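
To make the partitioning idea concrete, the sketch below shuffles a list of records and splits it into a training subset and a remainder. The function name `partition`, the 75% fraction (chosen from within the 70% to 80% range mentioned in the statement), and the fixed seed are assumptions for illustration only.

```python
import random


def partition(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into (training, remainder)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 1,000 record IDs.
training, remainder = partition(range(1000), train_fraction=0.75)
print(len(training), len(remainder))  # 750 250
```

The same arithmetic applies to any file size: the size of a subset divided by the size of the original file gives its share of the records.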

A data mart is typically smaller than a data warehouse.

(True/False)

Logistic regression and neural networks use complex nonlinear functions to capture the relationship between explanatory variables and categorical dependent variables.

(True/False)

It is very useful to partition a large data set into all of the following subsets, except the _____ data set.

(Multiple Choice)

Unsupervised methods have no

(Multiple Choice)

Classification analysis attempts to find variables that are related to a quantitative variable.

(True/False)

The predicted value from a logistic regression will be

(Multiple Choice)
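
For the logistic regression questions, a small sketch of the logistic function may help. It assumes a single explanatory variable `x` and hypothetical coefficients `b0` and `b1`; neither the names nor the values come from the exam. The function maps any linear score to a value between 0 and 1, and with a positive `b1` the predicted value rises as `x` increases.

```python
import math


def logistic(z):
    """Map a linear score z to a value between 0 and 1."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(x, b0, b1):
    """Predicted probability that the dependent variable equals 1."""
    return logistic(b0 + b1 * x)


# Hypothetical coefficients: intercept -1.0 and a positive slope 0.5.
for x in (0, 2, 4):
    print(x, round(predict(x, b0=-1.0, b1=0.5), 4))
```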

A neural network methodology attempts to mimic

(Multiple Choice)

Suppose the odds of Team A winning are 5 to 1. Then, the odds ratio is

(Multiple Choice)
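
The arithmetic linking odds and probability can be sketched as follows. This assumes "5 to 1" means 5 chances in favor for every 1 against, and that the quantity of interest is p / (1 - p); some texts reserve "odds ratio" for a ratio of two such odds, so treat the naming here as an assumption. The helper names are hypothetical.

```python
from fractions import Fraction


def odds_to_probability(in_favor, against):
    """Convert odds of in_favor-to-against into a probability."""
    return Fraction(in_favor, in_favor + against)


def probability_to_odds(p):
    """Convert a probability p into the ratio p / (1 - p)."""
    return p / (1 - p)


p = odds_to_probability(5, 1)
print(p)                       # 5/6
print(probability_to_odds(p))  # 5
```

Exact fractions are used instead of floats so the round trip from odds to probability and back shows no rounding error.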

Mya is investigating the factors that impact soda consumption. She examines a host of variables that help explain the amount consumed. Which type of data mining methodology is she most likely to use?

(Multiple Choice)

Megan is examining the likelihood of people riding the subway. The dependent variable takes on the value of 1 if the individual rides the subway and 0 otherwise. Therefore, she could use logistic regression to examine this question.

(True/False)

When using data partitioning, the second subset, which usually contains the records that were not included in the training data, is called the prediction data set.

(True/False)

The testing set in data partitioning is the

(Multiple Choice)