Exam 4: Descriptive Data Mining
Single linkage can be used to measure the distance between clusters that are the __________ in cluster analysis.
(Multiple Choice)
The strength of a cluster can be measured by comparing the average distance within a cluster to the distance between cluster centroids. One rule of thumb is that the ratio of between-cluster distance to within-cluster distance should exceed what value for useful clusters?
(Multiple Choice)
Suppose that the confidence of an association rule is 0.75 and the total number of transactions is 250. How many of those transactions support the consequent if the lift ratio is 1.875?
(Multiple Choice)
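The arithmetic behind this question can be sketched as follows, assuming the usual definition of the lift ratio as the rule's confidence divided by the consequent's share of all transactions. The function name is hypothetical and used only for illustration.

```python
def consequent_support_count(confidence, lift, n_transactions):
    """Number of transactions that contain the consequent.

    Assumes lift = confidence / (consequent_count / n_transactions),
    rearranged to solve for consequent_count.
    """
    return n_transactions * confidence / lift


# Values taken from the question above.
count = consequent_support_count(confidence=0.75, lift=1.875, n_transactions=250)
print(count)
```

Under that definition, the consequent appears in 40% of the 250 transactions.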
A method for modifying variables that reduces bias prior to cluster analysis is
(Multiple Choice)
A collection of text documents to be analyzed is called a ___________.
(Multiple Choice)
In preparing categorical variables for analysis, it is usually best to
(Multiple Choice)
The goal of __________ is to use the variable values to identify relationships between observations.
(Multiple Choice)
A tree diagram used to illustrate the sequence of nested clusters produced by hierarchical clustering is known as a
(Multiple Choice)
__________ is a method of measuring dissimilarity between clusters that considers only the two most dissimilar observations in the two clusters.
(Multiple Choice)
If the Euclidean distance were to be represented in a right triangle, which of the following would be considered the distance between two observations of a cluster?
(Multiple Choice)
Euclidean distance can be used to measure the distance between __________ in cluster analysis.
(Multiple Choice)
Euclidean distance can be used to calculate the dissimilarity between two observations. Let u = (25, $350) correspond to a 25-year-old customer who spent $350 at Store A in the previous fiscal year. Let v = (53, $420) correspond to a 53-year-old customer who spent $420 at Store A in the previous fiscal year. Calculate the dissimilarity between these two observations using Euclidean distance.
(Multiple Choice)
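A minimal sketch of the calculation, using the coordinates as given in the vectors u = (25, 350) and v = (53, 420) and treating age and dollars as unscaled numeric values:

```python
import math

# Observations as (age, dollars spent), taken from the vectors in the question.
u = (25, 350)
v = (53, 420)

# Euclidean distance: square root of the sum of squared coordinate differences.
d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
print(round(d, 2))  # prints 75.39
```

The dollar difference (70) is larger than the age difference (28), so it contributes most of the distance. This is one reason variables are often standardized before clustering.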
A __________ refers to the number of times a collection of items occurs together in a transaction data set.
(Multiple Choice)
In the text mining process, the text is first preprocessed by deriving a smaller set of _________ from the larger set of words contained in a collection of documents.
(Multiple Choice)
The process of extracting useful information from text data is known as __________.
(Multiple Choice)
Single linkage measures dissimilarity between clusters by
(Multiple Choice)
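To make the linkage questions concrete, the sketch below compares two common linkage measures on a pair of small clusters. The points are made up for illustration, and Euclidean distance is assumed as the underlying distance between observations.

```python
import math
from itertools import product

# Hypothetical clusters of 2-D observations (illustrative values only).
cluster_a = [(1.0, 1.0), (2.0, 1.0)]
cluster_b = [(5.0, 1.0), (9.0, 1.0)]

# Distance between every observation in A and every observation in B.
pairwise = [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

single = min(pairwise)    # single linkage: closest pair across the clusters
complete = max(pairwise)  # complete linkage: farthest pair across the clusters
print(single, complete)   # prints 3.0 8.0
```

The same pair of clusters can look close under one measure and far apart under the other, so the choice of linkage affects which clusters are merged first.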
When clustering only by dummy variables that represent categorical variables, the simplest measure of similarity between two observations is called the
(Multiple Choice)
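For observations described only by 0/1 dummy variables, two similarity measures are commonly contrasted. The sketch below assumes the standard definitions: the matching coefficient counts every agreeing variable, while Jaccard's coefficient ignores variables where both observations are 0. The function names and sample vectors are hypothetical.

```python
def matching_coefficient(x, y):
    """Share of variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(a == b for a, b in zip(x, y))
    return matches / len(x)


def jaccard_coefficient(x, y):
    """Share of agreeing variables after discarding 0-0 matches.

    Assumes at least one variable is 1 in x or y; otherwise the
    denominator would be zero.
    """
    both_one = sum(a == 1 and b == 1 for a, b in zip(x, y))
    both_zero = sum(a == 0 and b == 0 for a, b in zip(x, y))
    return both_one / (len(x) - both_zero)


# Illustrative dummy-variable observations.
obs1 = [1, 0, 1, 0, 0]
obs2 = [1, 0, 0, 0, 1]

mc = matching_coefficient(obs1, obs2)
jc = jaccard_coefficient(obs1, obs2)
print(mc, jc)
```

Here the observations agree on three of five variables, but two of those agreements are shared zeros, so the Jaccard value is lower than the matching value.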
__________ can be used to partition observations so as to obtain clusters with the least information loss due to aggregation.
(Multiple Choice)