Exam 10: Supervised Data Mining: Decision Trees


If predictor variables are highly correlated, then repeated sampling of the training data and a random selection of features are used to construct trees. This is an example of which strategy?

(Multiple Choice)

Using the following pruning table, what does the Rel Error represent?

(Multiple Choice)

A regression tree was developed to predict customer spending for a hotel during football season. One of the leaf nodes consists of six cases in the training set with the following values: 312.00, 350.00, 285.00, 295.00, 423.00, 249.00. What is the predicted spending amount on a hotel for the night for a customer that falls into this leaf node?

(Multiple Choice)

A regression tree was developed to predict customer spending for a hotel during football season. One of the leaf nodes consists of six cases in the training set with the following values: 312.00, 350.00, 285.00, 295.00, 380.00, 220.00. What is the predicted spending amount on a hotel for the night for a customer that falls into this leaf node?

(Multiple Choice)
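
The two leaf-node questions above turn on how a regression tree produces a prediction. A minimal sketch, assuming the usual convention that a leaf predicts the mean response of its training cases; the function name `leaf_prediction` is illustrative and not from the source:

```python
from statistics import mean


def leaf_prediction(training_values):
    """Prediction for a regression-tree leaf: the mean of its training cases.

    Assumption: the tree uses the standard mean-of-leaf rule.
    """
    return mean(training_values)


# Leaf values copied from the two questions above.
leaf_a = [312.00, 350.00, 285.00, 295.00, 423.00, 249.00]
leaf_b = [312.00, 350.00, 285.00, 295.00, 380.00, 220.00]

print(leaf_prediction(leaf_a))  # 319.0
print(leaf_prediction(leaf_b))  # 307.0
```

Any new customer routed to the same leaf receives the same predicted value, whatever that customer's individual predictor values are.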

Using the following chart for age and income, determine the split points for income.

(Multiple Choice)

To measure impurity in a regression tree, mean square error (MSE) is used.

(True/False)
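
For reference, the mean square error of a node is commonly written as below. This is a sketch of the standard definition, assuming node $t$ holds $n_t$ training cases with responses $y_i$; the symbols are not taken from the source:

```latex
\[
\bar{y}_t = \frac{1}{n_t} \sum_{i \in t} y_i ,
\qquad
\mathrm{MSE}(t) = \frac{1}{n_t} \sum_{i \in t} \left( y_i - \bar{y}_t \right)^2
\]
```

A node whose cases all share the same response value has an MSE of zero.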

If the RMSE for the validation set is 58.78 and the RMSE for the test set is 57.12, then what range will the new data RMSE lie in?

(Multiple Choice)

In reviewing the split of data, Maggie notes that among the 15 cases, 2 belong to Class 1 and the remaining 13 to Class 0. What is the Gini index for these cases, and is the node pure or impure?

(Multiple Choice)
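
The Gini calculation in the question above can be checked with a short sketch. It assumes the standard definition, one minus the sum of squared class proportions; the helper names `gini_index` and `is_pure` are illustrative:

```python
def gini_index(class_counts):
    """Gini index of a node: 1 - sum of squared class proportions."""
    n = sum(class_counts)
    return 1.0 - sum((count / n) ** 2 for count in class_counts)


def is_pure(class_counts):
    """A node is pure when all of its cases belong to a single class."""
    return sum(1 for count in class_counts if count > 0) <= 1


# 15 cases: 13 in Class 0 and 2 in Class 1, as in the question above.
counts = [13, 2]
print(round(gini_index(counts), 4))  # 0.2311
print(is_pure(counts))               # False
```

A Gini index of 0 corresponds to a pure node; for two classes the maximum is 0.5, reached at an even split.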

Which is not a purpose of running classification and regression trees (CART)?

(Multiple Choice)

Which description best fits the following tree structure for loan debt balance with a single age predictor?

(Multiple Choice)

In reviewing the split of data, Maggie notes that among the 13 cases, 2 belong to Class 1 and the remaining 11 to Class 0. What is the Gini index for these cases, and is the node pure or impure?

(Multiple Choice)
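
Worked by hand for the 13-case node above, assuming the standard two-class Gini definition with class proportions $p_1 = 2/13$ and $p_0 = 11/13$:

```latex
\[
G = 1 - p_1^2 - p_0^2
  = 1 - \left(\frac{2}{13}\right)^2 - \left(\frac{11}{13}\right)^2
  = 1 - \frac{4 + 121}{169}
  = \frac{44}{169} \approx 0.260
\]
```

Because the value is greater than zero, the node contains a mix of classes.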

If the performance measures are based on a cutoff value of 0.5, then if we lower the cutoff value, more cases will be in the target class, resulting in different performance measurement values. What chart can be used to review the data that are independent of the cutoff value?

(Multiple Choice)
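
The premise of the question above, that lowering the cutoff places more cases in the target class, can be illustrated with a small sketch. The probabilities below are hypothetical values made up for illustration, and `count_target_class` is an illustrative helper, not a library function:

```python
# Hypothetical predicted probabilities of the target class (illustration only).
probs = [0.92, 0.81, 0.64, 0.55, 0.47, 0.38, 0.29, 0.12]


def count_target_class(probabilities, cutoff):
    """Count the cases classified into the target class at a given cutoff."""
    return sum(p >= cutoff for p in probabilities)


for cutoff in (0.5, 0.3):
    print(cutoff, count_target_class(probs, cutoff))
# 0.5 4
# 0.3 6
```

Since the classifications change with the cutoff, any measure computed from them (accuracy, sensitivity, specificity) changes as well.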

When building a model with the bagging tree strategy, the varImpPlot function displays feature importance graphically. Its type argument is set to either 1 or 2. If type = 2, what does this command display?

(Multiple Choice)

Based on the following values for income, what are the possible split points? {12665, 15432, 28763, 34876, 41967, 52997}

(Multiple Choice)
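
Candidate split points for a numeric predictor can be enumerated as follows. This is a sketch assuming the common convention that candidates are the midpoints between consecutive distinct sorted values; the function name `candidate_split_points` is illustrative:

```python
def candidate_split_points(values):
    """Midpoints between consecutive distinct values, in ascending order.

    Assumption: splits are placed halfway between neighbouring values.
    """
    distinct = sorted(set(values))
    return [(low + high) / 2 for low, high in zip(distinct, distinct[1:])]


# Income values copied from the question above.
income = [12665, 15432, 28763, 34876, 41967, 52997]
print(candidate_split_points(income))
# [14048.5, 22097.5, 31819.5, 38421.5, 47482.0]
```

A predictor with n distinct values yields n - 1 candidate split points under this convention.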

The best-pruned tree is the smallest set, least complex tree, with the smallest validation error.

(True/False)

Which option is not one of the three common strategies used in creating ensemble models?

(Multiple Choice)

In the following tree, how many leaf nodes are there?

(Multiple Choice)

Based on the following sorted 20 values for age, what are the possible split points? {20, 22, 24, 26, 28, 31, 32, 33, 35, 40, 42, 43, 45, 47, 49, 50, 52, 53, 55, 57}

(Multiple Choice)

If the RMSE for the validation set is 56.91 and the RMSE for the test set is 55.39, then what range will the new data RMSE lie in?

(Multiple Choice)

In a random forest model, as a guideline, the user needs to select the number of random features for each tree. If there are 9 predictor variables in the data, how many features will each tree randomly select?

(Multiple Choice)
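
The source does not say which guideline it has in mind. A minimal sketch of two widely used rules of thumb, the square root of the number of predictors for classification and one third of the predictors for regression; the function name `default_num_features` is illustrative:

```python
import math


def default_num_features(num_predictors, task="classification"):
    """Rule-of-thumb number of random features to consider.

    Assumption: sqrt(p) for classification, p / 3 for regression,
    rounded down and never less than 1.
    """
    if task == "classification":
        return max(1, math.floor(math.sqrt(num_predictors)))
    return max(1, num_predictors // 3)


print(default_num_features(9, "classification"))  # 3
print(default_num_features(9, "regression"))      # 3
```

With 9 predictors the two rules happen to agree; they diverge for most other predictor counts.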