Multiple Choice
You have historical data covering the last three years in BigQuery and a data pipeline that delivers new data to BigQuery daily. You have noticed that when the Data Science team runs a query filtered on a date column and limited to 30-90 days of data, the query scans the entire table. You have also noticed that your bill is increasing more quickly than you expected. You want to resolve the issue as cost-effectively as possible while maintaining the ability to run SQL queries. What should you do?
A) Re-create the tables using DDL. Partition the tables by a column containing a TIMESTAMP or DATE type.
B) Recommend that the Data Science team export the table to a CSV file on Cloud Storage and use Cloud Datalab to explore the data by reading the files directly.
C) Modify your pipeline to maintain the last 30-90 days of data in one table and the longer history in a different table to minimize full table scans over the entire history.
D) Write an Apache Beam pipeline that creates a BigQuery table per day. Recommend that the Data Science team use wildcards on the table name suffixes to select the data they need.
Correct Answer: A
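The partitioning approach described in option A can be sketched as follows. This is a minimal illustration, not a definitive implementation: the table names (`mydataset.events`, `mydataset.events_partitioned`) and the column name `event_ts` are hypothetical, and the helper functions only build BigQuery SQL strings; they do not submit anything to BigQuery. The sketch assumes the goal is a date-partitioned copy of the existing table so that date-filtered queries read only the matching partitions.

```python
def partitioned_ctas(source: str, target: str, column: str,
                     is_timestamp: bool = True) -> str:
    """Build a CREATE TABLE ... AS SELECT statement that copies `source`
    into a new table partitioned by day on `column`."""
    # A TIMESTAMP column is partitioned on its date part;
    # a DATE column can be used directly.
    partition_expr = f"DATE({column})" if is_timestamp else column
    return (
        f"CREATE TABLE `{target}`\n"
        f"PARTITION BY {partition_expr}\n"
        # Optional safeguard: reject queries that omit a partition filter.
        "OPTIONS (require_partition_filter = TRUE)\n"
        f"AS SELECT * FROM `{source}`"
    )


def recent_rows_query(table: str, column: str, days: int,
                      is_timestamp: bool = True) -> str:
    """Build a query limited to the last `days` days, filtering directly
    on the partitioning column so partitions can be pruned."""
    if is_timestamp:
        cutoff = f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)"
    else:
        cutoff = f"DATE_SUB(CURRENT_DATE(), INTERVAL {days} DAY)"
    return f"SELECT * FROM `{table}`\nWHERE {column} >= {cutoff}"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(partitioned_ctas("mydataset.events",
                           "mydataset.events_partitioned", "event_ts"))
    print()
    print(recent_rows_query("mydataset.events_partitioned", "event_ts", 90))
```

With a table created this way, a query filtered to 30-90 days is billed for the partitions it reads rather than for the full three-year history, and the data remains queryable with standard SQL.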