Exam 4: Descriptive Data Mining


Complete linkage can be used to measure the distance between clusters that are the _________________ in cluster analysis.

(Multiple Choice)
4.8/5
(43)

The __________ the lift ratio, the ____________ the association rule.

(Multiple Choice)
5.0/5
(35)

Using the data given, apply k-means clustering using Wait time (min) as the variable with k = 3. Be sure to Normalize input data, and specify 50 iterations and 10 random starts in Step 2 of the XLMiner k-Means Clustering procedure. Then create one distinct data set for each of the three resulting clusters for waiting time.
a. For the observations composing the cluster which has the low waiting time, apply hierarchical clustering with Ward's method to form two clusters using Purchase Amount, Customer Age, and Customer Satisfaction Rating as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Using a PivotTable on the data in HC_Clusters, report the characteristics of each cluster.
b. For the observations composing the cluster which has the medium waiting time, apply hierarchical clustering with Ward's method to form three clusters using Purchase Amount, Customer Age, and Customer Satisfaction Rating as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Using a PivotTable on the data in HC_Clusters, report the characteristics of each cluster.
c. For the observations composing the cluster which has the high waiting time, apply hierarchical clustering with Ward's method to form two clusters using Purchase Amount, Customer Age, and Customer Satisfaction Rating as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Using a PivotTable on the data in HC_Clusters, report the characteristics of each cluster.

(Essay)
4.7/5
(38)
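The k-means step in the question above is run in XLMiner, but the idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not XLMiner's implementation: the wait times are made-up sample values, `zscore` and `kmeans_1d` are hypothetical helper names, and "normalize" is taken to mean z-score standardization.

```python
import random
from statistics import mean, pstdev

def zscore(values):
    """Standardize values to mean 0 and standard deviation 1 (assumed normalization)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def assign(values, centers):
    """Label each value with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: (v - centers[j]) ** 2)
            for v in values]

def kmeans_1d(values, k=3, max_iter=50, n_starts=10, seed=0):
    """One-variable k-means: keep the best of n_starts random initializations."""
    rng = random.Random(seed)
    best_labels, best_sse = None, float("inf")
    for _ in range(n_starts):
        centers = rng.sample(values, k)          # random start
        for _ in range(max_iter):                # at most max_iter updates
            labels = assign(values, centers)
            new_centers = []
            for j in range(k):
                members = [v for v, lab in zip(values, labels) if lab == j]
                # Keep the old center if a cluster happens to be empty.
                new_centers.append(mean(members) if members else centers[j])
            if new_centers == centers:           # converged
                break
            centers = new_centers
        labels = assign(values, centers)
        sse = sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))
        if sse < best_sse:                       # keep the tightest clustering
            best_labels, best_sse = labels, sse
    return best_labels

# Hypothetical wait times in minutes (not the exam's data set).
wait_times = [2, 3, 4, 10, 11, 12, 25, 27, 30]
labels = kmeans_1d(zscore(wait_times), k=3, max_iter=50, n_starts=10)
print(list(zip(wait_times, labels)))
```

The cluster numbers themselves are arbitrary; to split the data into low, medium, and high waiting-time sets, compare the mean wait time within each label.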

________________ is a measure that computes the dissimilarity between a cluster AB and a cluster C by averaging the distance between A and C and the distance between B and C.

(Multiple Choice)
4.8/5
(40)
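The update described in the question above can be sketched numerically. The snippet below is only an illustration with made-up distances and cluster sizes. The two function names are hypothetical, and the size-weighted variant is included purely for contrast with the simple average the question describes.

```python
def simple_average_update(d_ac, d_bc):
    """Dissimilarity between merged cluster AB and C as the plain average
    of the A-to-C and B-to-C dissimilarities (the rule in the question)."""
    return (d_ac + d_bc) / 2

def size_weighted_update(d_ac, d_bc, n_a, n_b):
    """For contrast: the same update weighted by the number of observations
    in A and B, so larger clusters count for more."""
    return (n_a * d_ac + n_b * d_bc) / (n_a + n_b)

# Made-up values: A holds 3 observations, B holds 1.
d_ac, d_bc = 2.0, 6.0
print(simple_average_update(d_ac, d_bc))            # 4.0
print(size_weighted_update(d_ac, d_bc, n_a=3, n_b=1))  # 3.0
```

The two rules agree only when A and B contain the same number of observations.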

In which of the following data-mining process steps is the data manipulated to make it suitable for formal modeling?

(Multiple Choice)
4.9/5
(34)

When clustering only by dummy variables that represent categorical variables, the simplest measure of similarity between two observations is called the

(Multiple Choice)
4.8/5
(34)
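For observations described only by 0/1 dummy variables, two commonly taught similarity measures can be sketched as follows. This is a hedged illustration with invented observations; the function names are hypothetical.

```python
def matching_coefficient(u, v):
    """Share of variables on which the two observations have the same value."""
    matches = sum(1 for a, b in zip(u, v) if a == b)
    return matches / len(u)

def jaccard_coefficient(u, v):
    """Like the matching coefficient, but 0-0 agreements are ignored."""
    both_one = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    both_zero = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    denom = len(u) - both_zero
    return both_one / denom if denom else 0.0

# Two invented observations over five dummy variables.
u = [1, 0, 0, 1, 0]
v = [1, 0, 1, 0, 0]
print(matching_coefficient(u, v))  # 3 of 5 variables agree
print(jaccard_coefficient(u, v))   # 1 shared "1" out of 3 variables that are not 0-0
```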

__________________ is a measure of calculating dissimilarity between clusters by considering only the two most dissimilar observations in the two clusters.

(Multiple Choice)
4.9/5
(31)
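The linkage measures that appear in these questions differ only in how they summarize the pairwise distances between two clusters. A minimal sketch, using made-up points and hypothetical function names:

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+

def pairwise_distances(cluster_1, cluster_2):
    """All distances between an observation in cluster_1 and one in cluster_2."""
    return [dist(p, q) for p, q in product(cluster_1, cluster_2)]

def single_linkage(cluster_1, cluster_2):
    """Distance between the two most similar (closest) observations."""
    return min(pairwise_distances(cluster_1, cluster_2))

def complete_linkage(cluster_1, cluster_2):
    """Distance between the two most dissimilar (farthest) observations."""
    return max(pairwise_distances(cluster_1, cluster_2))

def group_average_linkage(cluster_1, cluster_2):
    """Average over every cross-cluster pair of observations."""
    d = pairwise_distances(cluster_1, cluster_2)
    return sum(d) / len(d)

# Invented clusters on a line, so the distances are easy to check by hand.
c1 = [(0, 0), (1, 0)]
c2 = [(4, 0), (6, 0)]
print(single_linkage(c1, c2))         # 3.0
print(complete_linkage(c1, c2))       # 6.0
print(group_average_linkage(c1, c2))  # 4.5
```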

To identify patterns across transactions, we can use

(Multiple Choice)
4.8/5
(45)

Using the data given, apply hierarchical clustering with 10 clusters using LandValue ($), BuildingValue ($), Acres, Age, and Price ($) as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Use Ward's method as the clustering method.
a. Use a PivotTable on the data in the HC_Clusters1 worksheet to compute the cluster centers for the clusters in the hierarchical clustering.
b. Identify the cluster with the largest average price. Using all the variables, how would you characterize this cluster?
c. Identify the smallest cluster.

(Essay)
4.7/5
(40)

The strength of a cluster can be measured by comparing the average distance in a cluster to the distance between cluster centroids. One rule of thumb is that the ratio for between-cluster distance to within-cluster distance should exceed what value for useful clusters?

(Multiple Choice)
4.9/5
(38)
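The comparison described in the question above can be sketched as follows. This is an illustration only: it assumes "average distance in a cluster" means the average Euclidean distance from each observation to its cluster centroid, the points are invented, and the function names are hypothetical. The rule-of-thumb threshold itself is left to the question.

```python
from math import dist
from statistics import mean

def centroid(points):
    """Coordinate-wise mean of the observations in a cluster."""
    return tuple(mean(coord) for coord in zip(*points))

def average_within_distance(points):
    """Average distance from each observation to its cluster centroid (assumed definition)."""
    c = centroid(points)
    return mean(dist(p, c) for p in points)

def separation_ratios(cluster_1, cluster_2):
    """Between-centroid distance divided by each cluster's within-cluster distance."""
    between = dist(centroid(cluster_1), centroid(cluster_2))
    return (between / average_within_distance(cluster_1),
            between / average_within_distance(cluster_2))

# Invented clusters: centroids at (1, 0) and (12, 0).
c1 = [(0, 0), (2, 0)]
c2 = [(10, 0), (14, 0)]
print(separation_ratios(c1, c2))  # (11.0, 5.5)
```

A larger ratio means the clusters are far apart relative to how spread out their own members are.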

Euclidean distance can be used to measure the distance between ________________ in cluster analysis.

(Multiple Choice)
4.9/5
(39)
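As a reminder of the calculation, Euclidean distance is the square root of the summed squared differences across variables. A small sketch with invented values (the function name is illustrative):

```python
from math import sqrt

def euclidean_distance(u, v):
    """Square root of the sum of squared differences, variable by variable."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two invented records measured on three variables.
u = (1, 2, 3)
v = (4, 6, 3)
print(euclidean_distance(u, v))  # sqrt(9 + 16 + 0) = 5.0
```

Because variables on larger scales dominate this sum, the exam questions ask for the input data to be normalized before clustering.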

Using the data given, apply k-means clustering using Price ($) as the variable with k = 3. Be sure to Normalize input data, and specify 50 iterations and 10 random starts in Step 2 of the XLMiner k-Means Clustering procedure. Then create one distinct data set for each of the three resulting clusters of price.
a. For the observations composing the cluster with low home price, apply hierarchical clustering with Ward's method to form three clusters using Acres and Age as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Using a PivotTable on the data in HC_Clusters1, report the characteristics of each cluster.
b. For the observations composing the cluster with medium home price, apply hierarchical clustering with Ward's method to form three clusters using Acres and Age as variables. Be sure to Normalize input data in Step 2 of the XLMiner Hierarchical Clustering procedure. Using a PivotTable on the data in HC_Clusters1, report the characteristics of each cluster.
c. Comment on the cluster with high home price.

(Essay)
4.9/5
(48)

Platinum Gym has 10,000 gym members. Of these, 1,500 memberships included Unlimited Fitness Training and use of the tanning salon, and 750 of those also included Unlimited Hydromassage. If Fitness Training is considered A, use of the tanning salon is considered B, and Hydromassage is considered C, then the association rule for these sales becomes, "If A and B are purchased, then C is also purchased." The total number of transactions that include C is 3,000. Calculate the lift for this rule.

(Short Answer)
4.7/5
(34)
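One way to work a lift calculation like the one above, as a minimal sketch. It assumes the 1,500 memberships are the ones containing both A and B, and that 750 of them also contain C; if the problem intends something else, the inputs change but the formula does not. The function name `lift_ratio` is illustrative.

```python
def lift_ratio(n_total, n_antecedent, n_both, n_consequent):
    """Lift = rule confidence divided by the consequent's overall share.

    n_antecedent: transactions containing the "if" items (A and B)
    n_both:       transactions containing the "if" items and the "then" item
    n_consequent: transactions containing the "then" item (C)
    """
    confidence = n_both / n_antecedent   # P(C | A and B)
    expected = n_consequent / n_total    # P(C) with no rule at all
    return confidence / expected

lift = lift_ratio(n_total=10_000, n_antecedent=1_500, n_both=750, n_consequent=3_000)
print(round(lift, 2))  # 0.50 / 0.30 = 1.67
```

A lift above 1 says the antecedent makes the consequent more likely than it is across all transactions.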

Data preparation includes all of the following except which task?

(Multiple Choice)
4.8/5
(37)

k-means clustering is the process of

(Multiple Choice)
4.8/5
(37)

Which statement is true of an association rule?

(Multiple Choice)
4.8/5
(35)
Showing 41 - 56 of 56