Multiple Choice
You are working on a regression problem in a natural language processing domain, and you have 100M labeled examples in your dataset. You have randomly shuffled your data and split it into train and test samples (in a 90/10 ratio). After training the neural network and evaluating the model on the test set, you discover that the root-mean-squared error (RMSE) of your model is twice as high on the train set as on the test set. How should you improve the performance of your model?
A) Increase the share of the test sample in the train-test split.
B) Try to collect more data and increase the size of your dataset.
C) Try out regularization techniques (e.g., dropout or batch normalization) to avoid overfitting.
D) Increase the complexity of your model by, e.g., introducing an additional layer or increasing the size of the vocabularies or n-grams used.
Correct Answer:

Verified
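The comparison in this question depends on how RMSE is computed on each split and how the gap between the two values is read. The sketch below is a minimal illustration, not part of the original question: `compare_splits` and its `tolerance` parameter are hypothetical names, and the labels it returns reflect the common rule of thumb that a train error well above the test error points away from overfitting.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def compare_splits(train_rmse, test_rmse, tolerance=0.1):
    """Hypothetical helper: label the gap between train and test error.

    `tolerance` is an assumed relative margin below which the two
    errors are treated as comparable.
    """
    if train_rmse > test_rmse * (1 + tolerance):
        return "train error higher: overfitting unlikely, suspect underfitting"
    if test_rmse > train_rmse * (1 + tolerance):
        return "test error higher: suspect overfitting"
    return "errors comparable"
```

With a train RMSE twice the test RMSE, as in the scenario above, `compare_splits(2.0, 1.0)` returns the first label.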