Multiple Choice
A gaming company has launched an online game where people can start playing for free, but they need to pay if they choose to use certain features. The company needs to build an automated system to predict whether or not a new user will become a paid user within 1 year. The company has gathered a labeled dataset from 1 million users. The training dataset consists of 1,000 positive samples (from users who ended up paying within 1 year) and 999,000 negative samples (from users who did not use any paid features) . Each data sample consists of 200 features including user age, device, location, and play patterns. Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory Which of the following approaches should the Data Science team take to mitigate this issue? (Choose two.)
A) Add more deep trees to the random forest to enable the model to learn more features.
B) Include a copy of the samples in the test dataset in the training dataset.
C) Generate more positive samples by duplicating the positive samples and adding a small amount of noise to the duplicated data.
D) Change the cost function so that false negatives have a higher impact on the cost value than false positives.
E) Change the cost function so that false positives have a higher impact on the cost value than false negatives.
Correct Answer:

Verified
Correct Answer:
Verified
Q25: A machine learning specialist stores IoT soil
Q26: During mini-batch training of a neural network
Q27: A Data Engineer needs to build a
Q28: A Machine Learning Specialist trained a regression
Q29: A Mobile Network Operator is building an
Q31: A data scientist has explored and sanitized
Q32: A data scientist needs to identify fraudulent
Q33: A company is setting up an Amazon
Q34: A company is building a predictive maintenance
Q35: A Data Scientist needs to analyze employment