Multiple Choice
You are designing a cloud-native historical data processing system to meet the following conditions: The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools, including Cloud Dataproc, BigQuery, and Compute Engine. A streaming data pipeline stores new data daily. Performance is not a factor in the solution. The solution design should maximize availability. How should you design data storage for this solution?
A) Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
B) Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.
C) Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.
D) Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.
Correct Answer:

Verified