
Essentials of Business Analytics, 1st Edition, by Jeffrey Camm, James Cochran, Michael Fry, Jeffrey Ohlmann, David Anderson
Edition 1, ISBN: 978-1285187273
Exercise 17
As an intern with the local home builder's association, you have been asked to analyze the state of the local housing market, which has suffered during a recent economic crisis. You have been provided three data sets in the file HousingBubble. The Pre-Crisis worksheet contains information on 1978 single-family homes sold during the one-year period before the burst of the housing bubble. The Post-Crisis worksheet contains information on 1657 single-family homes sold during the one-year period after the burst of the housing bubble. The NewDataToPredict worksheet contains information on homes currently for sale.
a. Consider the Pre-Crisis worksheet data. Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Predict the sale price using k-nearest neighbors with up to k = 20. Use Price as the output variable and all the other variables as input variables. In Step 2 of XLMiner's k-Nearest Neighbors Prediction procedure, be sure to Normalize input data and to Score on best k between 1 and specified value. Check the box next to In worksheet in the Score new data area. In the Match variables in the new range dialog box, (1) specify the NewDataToPredict worksheet in the Worksheet: field, (2) enter the cell range A1:P2001 in the Data range: field, and (3) click Match variable(s) with same name(s). Completing the procedure will result in a KNNP_NewScore worksheet that will contain the predicted sales price for each home in NewDataToPredict.
i. What value of k minimizes the root mean squared error (RMSE) on the validation data?
ii. What is the RMSE on the validation data and test data?
iii. What is the average error on the validation data and test data? What does this suggest?
b. Repeat part a with the Post-Crisis worksheet data.
c. The KNNP_NewScore1 and KNNP_NewScore2 worksheets contain the sales price predictions for the 2000 homes in the NewDataToPredict worksheet using the precrisis and postcrisis data, respectively. For each of these 2000 homes, compare the two predictions by computing the percentage change in predicted price between the precrisis and postcrisis models. Let percentage change = (postcrisis predicted price - precrisis predicted price)/precrisis predicted price. Summarize these percentage changes with a histogram. What is the average percentage change in predicted price between the precrisis and postcrisis models?
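For readers without XLMiner, the calculations behind parts a through c can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a reproduction of XLMiner's procedure: the data are synthetic stand-ins for the HousingBubble worksheets, inputs are normalized as z-scores from the training set, distance is Euclidean, error is taken as actual minus predicted, and all function names are hypothetical.

```python
import math
import random

def zscore_params(rows):
    """Column means and standard deviations, computed from training rows only."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def normalize(rows, means, stds):
    """Apply the training-set z-score scaling to any set of rows."""
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def knn_predict(train_x, train_y, x, k):
    """Average the prices of the k training homes closest to x."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    return sum(train_y[i] for i in nearest) / k

def rmse_and_avg_error(train_x, train_y, xs, ys, k):
    """RMSE and average error, with error assumed to be actual minus predicted."""
    errors = [y - knn_predict(train_x, train_y, x, k) for x, y in zip(xs, ys)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return rmse, sum(errors) / len(errors)

def best_k(train_x, train_y, val_x, val_y, max_k=20):
    """Value of k in 1..max_k with the lowest RMSE on the validation set."""
    return min(range(1, max_k + 1),
               key=lambda k: rmse_and_avg_error(train_x, train_y, val_x, val_y, k)[0])

def pct_change(pre, post):
    """Part c: (postcrisis prediction - precrisis prediction) / precrisis prediction."""
    return [(b - a) / a for a, b in zip(pre, post)]

# Synthetic homes (square footage, bedrooms) with made-up prices; not the real data.
random.seed(0)
homes = [[random.uniform(800, 4000), random.randint(1, 6)] for _ in range(200)]
prices = [100 * sqft + 5000 * beds + random.gauss(0, 10000) for sqft, beds in homes]

# 50/30/20 partition; the rows are already in random order, so slicing suffices.
means, stds = zscore_params(homes[:100])
scaled = normalize(homes, means, stds)
train_x, train_y = scaled[:100], prices[:100]
val_x, val_y = scaled[100:160], prices[100:160]
test_x, test_y = scaled[160:], prices[160:]

k = best_k(train_x, train_y, val_x, val_y)
val_rmse, val_avg = rmse_and_avg_error(train_x, train_y, val_x, val_y, k)
test_rmse, test_avg = rmse_and_avg_error(train_x, train_y, test_x, test_y, k)
print(f"best k = {k}")
print(f"validation: RMSE = {val_rmse:.0f}, average error = {val_avg:.0f}")
print(f"test:       RMSE = {test_rmse:.0f}, average error = {test_avg:.0f}")

# Part c on two short, made-up prediction lists.
changes = pct_change([100000.0, 200000.0], [90000.0, 220000.0])
print(f"average percentage change = {sum(changes) / len(changes):.1%}")
```

Under the actual-minus-predicted convention assumed here, an average error near zero suggests the model shows little systematic bias, a clearly positive value suggests it tends to underpredict prices, and a clearly negative value suggests it tends to overpredict them.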