Essay
To examine the local housing market in a particular region, a sample of 120 homes sold during a year is collected. The data are given below:
Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the sale price using k-nearest neighbors with up to k = 10. Use Sale Price as the output variable and all the other variables as input variables. In Step 2 of XLMiner's k-Nearest Neighbors Prediction procedure, be sure to Normalize input data and to Score on best k between 1 and specified value. Generate a Detailed Scoring report for all three sets of data.
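Outside XLMiner, the same procedure can be sketched in plain Python. This is a minimal illustration, not XLMiner's implementation: the housing data are not reproduced here, so the 120 homes below are synthetic, the three input variables (square feet, bedrooms, age) are hypothetical, and the choice to z-score the inputs using training-set statistics is an assumption about what "Normalize input data" does.

```python
import math
import random

def zscore_params(rows):
    """Column means and standard deviations, computed from the training rows only."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def normalize(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def knn_predict(train_x, train_y, query, k):
    """Average the sale prices of the k nearest training homes (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return sum(y for _, y in nearest) / k

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic stand-in for the 120 homes (hypothetical variables, not the exercise data).
rng = random.Random(0)
homes = []
for _ in range(120):
    sqft, beds, age = rng.uniform(900, 3500), rng.randint(1, 5), rng.uniform(0, 60)
    price = 50_000 + 110 * sqft + 8_000 * beds - 600 * age + rng.gauss(0, 15_000)
    homes.append(([sqft, beds, age], price))

# Random partition: 50% training, 30% validation, 20% test.
rng.shuffle(homes)
train, valid, test = homes[:60], homes[60:96], homes[96:]

means, stds = zscore_params([x for x, _ in train])

def split(part):
    """Return normalized inputs and the matching sale prices for one partition."""
    return normalize([x for x, _ in part], means, stds), [y for _, y in part]

train_x, train_y = split(train)
valid_x, valid_y = split(valid)
test_x, test_y = split(test)

def rmse_for_k(xs, ys, k):
    return rmse(ys, [knn_predict(train_x, train_y, x, k) for x in xs])

# "Score on best k": try k = 1..10 and keep the k with the lowest validation RMSE.
valid_rmse = {k: rmse_for_k(valid_x, valid_y, k) for k in range(1, 11)}
best_k = min(valid_rmse, key=valid_rmse.get)
test_rmse = rmse_for_k(test_x, test_y, best_k)
print(f"best k = {best_k}, validation RMSE = {valid_rmse[best_k]:.0f}, test RMSE = {test_rmse:.0f}")
```

Because the partition is random and the data here are synthetic, the best k and the RMSE values from this sketch will not match the XLMiner results for the exercise.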
a. What value of k minimizes the root mean squared error (RMSE) on the validation data?
b. What is the RMSE on the validation data and test data?
c. What is the average error on the validation data and test data? What does this suggest?
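The two error measures asked about in parts b and c can be illustrated with a small sketch. It assumes the residual is defined as actual minus predicted (so a negative average error means the model over-predicts on average); the three prices are hypothetical and are not taken from the exercise data.

```python
import math

def average_error(actual, predicted):
    """Mean of the residuals (actual - predicted); its sign reveals systematic bias."""
    return sum(a - p for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses regardless of direction."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical sale prices in thousands of dollars, for illustration only.
actual = [200, 250, 300]
predicted = [210, 240, 310]

print(average_error(actual, predicted))  # about -3.33: over-prediction on average
print(rmse(actual, predicted))           # 10.0
```

An average error close to zero suggests the predictions are not systematically too high or too low, even when the RMSE shows that individual predictions miss by a sizeable amount.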
Correct Answer:

a. A value of k = 2 minimizes the RMSE o...