Exam 4: Descriptive Data Mining
The process of eliminating variables from formal analysis without losing any crucial information is called
(Multiple Choice)
If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two observations of a cluster?
(Multiple Choice)
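As an illustration of the right-triangle picture, the sketch below treats the differences on two variables as the legs of a right triangle and the Euclidean distance as its hypotenuse. The observations `p` and `q` are hypothetical values chosen for the example, not data from the exam.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two observations of equal length."""
    if len(u) != len(v):
        raise ValueError("observations must have the same number of variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical observations measured on two variables.
p, q = (1.0, 2.0), (4.0, 6.0)

# The per-variable differences form the two legs of the triangle.
legs = (abs(p[0] - q[0]), abs(p[1] - q[1]))

# The Euclidean distance is the length of the hypotenuse.
distance = euclidean_distance(p, q)
print(legs, distance)
```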
The strength of the association rule is known as ____________ and is calculated as the ratio of the confidence of an association rule to the benchmark confidence.
(Multiple Choice)
In preparing categorical variables for analysis, it is usually best to
(Multiple Choice)
Using the data given, apply hierarchical clustering with 5 clusters using Wait Time (min), Purchase Amount ($), Customer Age, and Customer Satisfaction Rating as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Use Ward's method as the clustering method.
a. Use a PivotTable on the data in the HC_Clusters1 worksheet to compute the cluster centers for the five clusters in the hierarchical clustering.
b. Identify the cluster with the largest average waiting time. Using all the variables, how would you characterize this cluster?
c. Identify the smallest cluster.
d. By examining the dendrogram on the HC_Dendrogram worksheet (as well as the sequence of clustering stages in HC_Output1), what number of clusters seems to be the most natural fit based on the distance?
(Essay)
Hierarchical clustering using ____________ results in a sequence of aggregated clusters that minimizes the loss of information between the individual observation level and the cluster level.
(Multiple Choice)
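One common way to quantify "loss of information" from aggregation is the increase in the within-cluster sum of squared errors when two clusters are merged, which is the criterion used by Ward's method. The sketch below computes that increase for hypothetical clusters. The function names and the data are assumptions made for illustration, and the exam's software may compute the criterion differently.

```python
def centroid(points):
    """Mean of each variable across the observations in a cluster."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def sse(points):
    """Sum of squared distances from each observation to the cluster centroid."""
    c = centroid(points)
    return sum(sum((x - m) ** 2 for x, m in zip(p, c)) for p in points)

def ward_merge_cost(a, b):
    """Increase in total within-cluster SSE caused by merging clusters a and b."""
    return sse(a + b) - sse(a) - sse(b)

# Hypothetical clusters of two-variable observations.
cluster_a = [(1.0, 1.0), (1.0, 3.0)]
cluster_b = [(5.0, 1.0), (5.0, 3.0)]
cluster_c = [(2.0, 2.0)]

cost_ab = ward_merge_cost(cluster_a, cluster_b)
cost_ac = ward_merge_cost(cluster_a, cluster_c)

# A Ward-style step would merge the pair with the smaller cost.
print(cost_ab, cost_ac)
```

In this toy case merging `cluster_a` with `cluster_c` loses far less information than merging it with `cluster_b`, so that merge would be preferred at this stage.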
A ___________ refers to the number of times a collection of items occurs together in a transaction data set.
(Multiple Choice)
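Counting how often a collection of items occurs together can be sketched in a few lines. The transactions below are hypothetical and the helper name `count_cooccurrences` is an assumption made for this example.

```python
def count_cooccurrences(transactions, itemset):
    """Number of transactions that contain every item in the itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

# Hypothetical transaction data set: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "jelly", "milk"},
    {"bread", "jelly"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

count = count_cooccurrences(transactions, {"bread", "milk"})
print(count)
```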
____________________ clustering method defines the similarity between two clusters as the similarity of the pair of observations (one from each cluster) that are the most different.
(Short Answer)
Platinum Gym has 10,000 gym members, of which 1,500 memberships included Unlimited Fitness Training and use of the tanning salon, and of which 750 included Unlimited Hydromassage. If Fitness Training is considered A, use of the tanning salon is considered B, and Hydromassage is considered C, then the association rule for these sales becomes "If A and B are purchased, then C is also purchased." Calculate the confidence level.
(Short Answer)
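The confidence of a rule "if antecedent, then consequent" is the share of transactions containing the antecedent that also contain the consequent. The sketch below applies that definition under one reading of the question, namely that the 750 memberships with Hydromassage are a subset of the 1,500 memberships with both A and B. That reading is an assumption. The total of 10,000 members is not needed for confidence.

```python
def rule_confidence(support_antecedent_and_consequent, support_antecedent):
    """Confidence = support(antecedent and consequent) / support(antecedent)."""
    if support_antecedent <= 0:
        raise ValueError("the antecedent must occur at least once")
    return support_antecedent_and_consequent / support_antecedent

# Assumed reading of the question: 750 of the 1,500 A-and-B memberships include C.
memberships_with_a_and_b = 1500
memberships_with_a_b_and_c = 750

confidence = rule_confidence(memberships_with_a_b_and_c, memberships_with_a_and_b)
print(confidence)
```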
The lift ratio of an association rule with a confidence value of 0.45 and in which the consequent occurs in 6 out of 10 cases is
(Multiple Choice)
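A lift ratio divides a rule's confidence by the benchmark confidence, that is, the fraction of all transactions containing the consequent. The sketch below plugs in the figures stated in the question. The function name `lift_ratio` is an assumption made for the example.

```python
def lift_ratio(confidence, consequent_support):
    """Lift = rule confidence / fraction of transactions containing the consequent."""
    if not 0 < consequent_support <= 1:
        raise ValueError("consequent support must be a fraction in (0, 1]")
    return confidence / consequent_support

confidence = 0.45              # confidence of the rule, as stated in the question
consequent_support = 6 / 10    # consequent occurs in 6 out of 10 cases

lift = lift_ratio(confidence, consequent_support)
print(round(lift, 2))
```

A lift below 1 indicates that the antecedent makes the consequent less likely than it is in the data set overall.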
___________________ can be used to partition observations so as to obtain clusters with the least amount of information loss due to aggregation.
(Multiple Choice)
In which of the following scenarios would it be appropriate to use hierarchical clustering?
(Multiple Choice)
Single linkage is a method of measuring dissimilarity between clusters by
(Multiple Choice)
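Linkage methods differ in which pair of observations (one from each cluster) they use to represent the distance between two clusters. The sketch below contrasts single linkage, which uses the closest pair, with complete linkage, which uses the most different pair. The clusters are hypothetical and Euclidean distance is assumed as the underlying measure.

```python
import math
from itertools import product

def pairwise_distances(a, b):
    """Euclidean distances for every pair with one observation from each cluster."""
    return [math.dist(p, q) for p, q in product(a, b)]

def single_linkage(a, b):
    """Distance between the two closest observations, one from each cluster."""
    return min(pairwise_distances(a, b))

def complete_linkage(a, b):
    """Distance between the two most different observations, one from each cluster."""
    return max(pairwise_distances(a, b))

# Hypothetical clusters lying along a line.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

single = single_linkage(cluster_a, cluster_b)
complete = complete_linkage(cluster_a, cluster_b)
print(single, complete)
```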
Jaccard's coefficient is different from the matching coefficient in that the former
(Multiple Choice)
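For binary (0-1) observations, the two coefficients can be compared side by side. In the sketch below, the matching coefficient counts every agreement, while Jaccard's coefficient leaves out the positions where both observations are 0. The vectors `u` and `v` are hypothetical.

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations agree (both 1 or both 0)."""
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Share of agreements on 1 among variables that are not 0 in both observations."""
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    considered = len(u) - both_zero
    return both_one / considered if considered else 0.0

# Hypothetical observations with mostly shared zeros.
u = [1, 0, 0, 0, 1, 0]
v = [1, 0, 0, 0, 0, 0]

matching = matching_coefficient(u, v)
jaccard = jaccard_coefficient(u, v)
print(matching, jaccard)
```

Because shared zeros dominate this example, the matching coefficient is much higher than Jaccard's coefficient.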
A cluster's _____________ can be measured by the difference between the distance value at which a cluster is originally formed and the distance value at which it is merged with another cluster in a dendrogram.
(Multiple Choice)
Suppose we had a data set from a call center where customers were asked to choose between the following three options: hear account information, billing questions, and customer service. Using the given order of the three options, and using 0-1 dummy variables to encode the categorical variables, which of the following combinations would yield an entry "customer service"?
(Multiple Choice)
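One common encoding scheme uses one fewer dummy variable than there are categories and treats the last category as the reference, coded as all zeros. The sketch below applies that scheme to the three options in their given order. Whether the question intends this scheme or one dummy variable per option cannot be confirmed from the text, so treat the reference-category choice as an assumption.

```python
# The three options, in the order given in the question.
OPTIONS = ["hear account information", "billing questions", "customer service"]

def dummy_encode(value, categories=OPTIONS):
    """Encode a category with k-1 dummy variables (assumed scheme).

    The last category is the reference and is coded as all zeros.
    """
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return tuple(1 if value == c else 0 for c in categories[:-1])

encoded = {option: dummy_encode(option) for option in OPTIONS}
for option, code in encoded.items():
    print(option, code)
```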
Showing 21 - 40 of 56