Exam 12: Introduction to Data Mining

arrow
  • Select Tags
search iconSearch Question
flashcardsStudy Flashcards
  • Select Tags

In classification, which of the following would be considered as a categorical variable of interest for a credit approval decision for a requester?

Free
(Multiple Choice)
4.8/5
(43)
Correct Answer:
Verified

D

If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two objects of a cluster?

Free
(Multiple Choice)
4.7/5
(34)
Correct Answer:
Verified

A

In a k-Nearest Neighbors algorithm, similarity of records is based on the ________.

Free
(Multiple Choice)
4.8/5
(40)
Correct Answer:
Verified

A

If bs are weights, Xs are input values, and c is a constant or intercept, provide the equation for discriminant functions, L.

(Multiple Choice)
4.8/5
(39)

Spam filtering for e-mails can be seen as an example of which of the following types of approaches of data mining?

(Multiple Choice)
4.9/5
(36)

Which of the following is a definition of distance between two clusters in a complete linkage clustering?

(Multiple Choice)
4.9/5
(33)

Divisive clustering method is different from agglomerative clustering methods in that divisive clustering methods ________.

(Multiple Choice)
4.9/5
(32)

The strength of the association rule, known as lift, is calculated as the ratio of the ________.

(Multiple Choice)
5.0/5
(32)

Expected confidence assumes independence between the consequent and the antecedent.

(True/False)
4.8/5
(37)

A musical instruments retailer has 10,000 point-of-sale transactions out of which 1500 sales included both items of electric guitars and guitar cases, and out of which 750 had sales of new strings.If the electric guitars are considered A, the guitar cases are considered B, and the strings are considered C, then the associate rule for these sales become "If A and B are purchased, then C is also purchased." Calculate the confidence level, expected confidence level, and lift for this rule, given that total transactions for C is 3000.

(Essay)
4.8/5
(35)

When using logistic regression, where p being the probability that the dependent variable Y = 1, X₁, X₂ ..., Xk are the independent variables, and β₀, β₁, β₂ ..., βk are unknown regression coefficients, ________ is called the odds of belonging to category 1(Y = 1).

(Multiple Choice)
4.9/5
(36)

Which of the following features of classification, used in Excel, for a particular database will necessarily be coded to a certain value?

(Multiple Choice)
4.8/5
(35)

Which of the following is true of random partitioning?

(Multiple Choice)
4.9/5
(37)

Explain how data-mining using lagging and leading measures of the cause-and-effect model can help managers make business decisions.

(Essay)
4.9/5
(42)

Which of the following would be considered a lagging measure in a restaurant using the cause-and-modeling method of data mining?

(Multiple Choice)
4.7/5
(37)

Which of the following is true of logistic regression as a classifying method?

(Multiple Choice)
4.7/5
(38)

Which of the following is true of the value of k in the k-Nearest Neighbors algorithm?

(Multiple Choice)
4.8/5
(35)

In cluster analysis, the objects within clusters should exhibit a high amount of dissimilarity.

(True/False)
4.8/5
(31)

The data mining approach called ________ involves the developing of analytic models to describe the relationship between metrics that drive business performance like profitability, customer satisfaction, or employee satisfaction.

(Multiple Choice)
4.9/5
(33)

The accuracy of the model on the test data gives a realistic estimate of the performance of the model on completely unseen data.

(True/False)
4.8/5
(37)
Showing 1 - 20 of 53
close modal

Filters

  • Essay(0)
  • Multiple Choice(0)
  • Short Answer(0)
  • True False(0)
  • Matching(0)