Exam 18: Professional Data Engineer on Google Cloud Platform (256 Questions)
Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. During testing, you notice that some messages are missing in the dashboard. You check the logs, and all messages are being published to Cloud Pub/Sub successfully. What should you do next?
(Multiple Choice)
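As a study aid, one way to verify the Pub/Sub leg independently of the Dataflow job is to attach a temporary second subscription to the topic and pull from it directly. A minimal sketch, with hypothetical resource names:

```python
# Pull a few messages through a temporary "tap" subscription to confirm that
# publishes are reaching subscribers, independently of the Dataflow pipeline.
from google.cloud import pubsub_v1

project_id = "my-project"  # hypothetical
topic = f"projects/{project_id}/topics/dashboard-events"      # hypothetical
debug_sub = f"projects/{project_id}/subscriptions/debug-tap"  # hypothetical

subscriber = pubsub_v1.SubscriberClient()
# Only messages published after this point flow to the new subscription.
subscriber.create_subscription(request={"name": debug_sub, "topic": topic})

response = subscriber.pull(request={"subscription": debug_sub, "max_messages": 10})
for received in response.received_messages:
    print(received.message.data.decode("utf-8"))

# Ack so the debug subscription does not accumulate a backlog.
if response.received_messages:
    subscriber.acknowledge(
        request={
            "subscription": debug_sub,
            "ack_ids": [r.ack_id for r in response.received_messages],
        }
    )
```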
When you store data in Cloud Bigtable, what is the recommended minimum amount of stored data?
(Multiple Choice)
Your company is in the process of migrating its on-premises data warehousing solutions to BigQuery. The existing data warehouse uses trigger-based change data capture (CDC) to apply updates from multiple transactional database sources on a daily basis. With BigQuery, your company hopes to improve its handling of CDC so that changes to the source systems are available to query in BigQuery in near-real time using log-based CDC streams, while also optimizing for the performance of applying changes to the data warehouse. Which two steps should they take to ensure that changes are available in the BigQuery reporting table with minimal latency while reducing compute overhead? (Choose two.)
(Multiple Choice)
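For context, the pattern this question probes is usually a staging (delta) table that receives the streamed CDC records, plus a periodic MERGE that applies only the latest change per key to the reporting table. A minimal sketch with hypothetical table and column names:

```python
# Apply streamed CDC deltas to a reporting table with a single MERGE.
from google.cloud import bigquery

client = bigquery.Client()

merge_sql = """
MERGE `my-project.dwh.customers` AS main
USING (
  -- Keep only the latest change per key from the streamed CDC deltas.
  SELECT * EXCEPT(row_num) FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY change_ts DESC) AS row_num
    FROM `my-project.dwh.customers_cdc_delta`
  )
  WHERE row_num = 1
) AS delta
ON main.customer_id = delta.customer_id
WHEN MATCHED AND delta.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET name = delta.name, email = delta.email
WHEN NOT MATCHED AND delta.op != 'D' THEN
  INSERT (customer_id, name, email)
  VALUES (delta.customer_id, delta.name, delta.email)
"""

client.query(merge_sql).result()  # blocks until the MERGE completes
```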
You want to build a managed Hadoop system as your data lake. The data transformation process is composed of a series of Hadoop jobs executed in sequence. To accomplish the design of separating storage from compute, you decided to use the Cloud Storage connector to store all input data, output data, and intermediary data. However, you noticed that one Hadoop job runs very slowly with Cloud Dataproc, when compared with the on-premises bare-metal Hadoop environment (8-core nodes with 100-GB RAM). Analysis shows that this particular Hadoop job is disk I/O intensive. You want to resolve the issue. What should you do?
(Multiple Choice)
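One remedy commonly considered for disk-I/O-bound Dataproc jobs is attaching local SSDs to the workers so shuffle and intermediate data land on fast local disk. A minimal sketch using the Dataproc client library, with hypothetical project, region, and sizing values:

```python
# Create a Dataproc cluster whose workers have local SSDs attached, so
# disk-I/O-heavy intermediate data avoids slower boot disks.
from google.cloud import dataproc_v1

region = "us-central1"  # hypothetical
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "cluster_name": "io-heavy-cluster",  # hypothetical
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-highmem-16"},
        "worker_config": {
            "num_instances": 4,
            "machine_type_uri": "n1-highmem-16",   # size to match on-prem nodes
            "disk_config": {"num_local_ssds": 2},  # local SSDs for scratch data
        },
    },
}

operation = client.create_cluster(
    request={"project_id": "my-project", "region": region, "cluster": cluster}
)
operation.result()  # wait for cluster creation to finish
```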
You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then, the scope of the project has expanded. The database must now store 100 times more patient records. You can no longer run the reports, because they either take too long or they encounter errors with insufficient compute resources. How should you adjust the database design?
(Multiple Choice)
You are migrating your data warehouse to BigQuery. You have migrated all of your data into tables in a dataset. Multiple users from your organization will be using the data. They should only see certain tables based on their team membership. How should you set user permissions?
(Multiple Choice)
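A common pattern here is one dataset per team with group-based, dataset-level READER grants. A minimal sketch with the BigQuery client library, assuming hypothetical dataset and group names:

```python
# Grant a team's Google group read access at the dataset level, so members
# only see the tables in their team's dataset.
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset("my-project.marketing_dwh")  # hypothetical

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="groupByEmail",
        entity_id="marketing-team@example.com",  # hypothetical group
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```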
Cloud Bigtable is Google's ______ Big Data database service.
(Multiple Choice)
Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration. What should you do?
(Multiple Choice)
To give a user read permission for only the first three columns of a table, which access control method would you use?
(Multiple Choice)
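The classic mechanism for exposing a column subset is an authorized view that selects only the permitted columns (column-level security with policy tags is the newer alternative). A minimal sketch with hypothetical names:

```python
# Expose only three columns through a view in a separate dataset, then
# authorize that view against the source dataset.
from google.cloud import bigquery

client = bigquery.Client()

# 1. Create the view with only the permitted columns.
view = bigquery.Table("my-project.shared_views.patients_limited")  # hypothetical
view.view_query = """
    SELECT patient_id, first_name, last_name
    FROM `my-project.private.patients`
"""
view = client.create_table(view)

# 2. Authorize the view to read from the source dataset.
source = client.get_dataset("my-project.private")
entries = list(source.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source.access_entries = entries
client.update_dataset(source, ["access_entries"])
```

Users then receive READER access only on the dataset containing the view, never on the source dataset itself.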
An online retailer has built their current application on Google App Engine. A new initiative at the company mandates that they extend their application to allow their customers to transact directly via the application. They need to manage their shopping transactions and analyze combined data from multiple datasets using a business intelligence (BI) tool. They want to use only a single database for this purpose. Which Google Cloud database should they choose?
(Multiple Choice)
Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?
(Multiple Choice)
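One self-service option is querying the jobs metadata in each project's INFORMATION_SCHEMA (Cloud Monitoring's slot metrics are another). A minimal sketch, assuming the `region-us` qualifier and a hypothetical project:

```python
# Per-user slot consumption over the last day, derived from jobs metadata.
from google.cloud import bigquery

client = bigquery.Client(project="analytics-team-a")  # hypothetical project

sql = """
SELECT
  user_email,
  SUM(total_slot_ms) / 1000 / 3600 AS slot_hours
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY user_email
ORDER BY slot_hours DESC
"""

for row in client.query(sql).result():
    print(f"{row.user_email}: {row.slot_hours:.1f} slot-hours")
```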
You work for an advertising company, and you've developed a Spark ML model to predict click-through rates at advertisement blocks. You've been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data warehouse will be migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud. What should you do?
(Multiple Choice)
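One lift-and-shift path is running the existing Spark ML pipelines on Dataproc and reading the training data from BigQuery through the spark-bigquery-connector. A minimal sketch with hypothetical table, column, and bucket names:

```python
# A PySpark training job (run on Dataproc) that reads training data from
# BigQuery via the spark-bigquery-connector. Assumes numeric feature/label
# columns; the connector jar must be supplied (e.g. via --jars).
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("ctr-training").getOrCreate()

df = (
    spark.read.format("bigquery")
    .option("table", "my-project.ads.click_events")  # hypothetical table
    .load()
)

features = VectorAssembler(
    inputCols=["block_position", "hour_of_day", "user_ctr_30d"],  # hypothetical
    outputCol="features",
)
model = LogisticRegression(labelCol="clicked", featuresCol="features").fit(
    features.transform(df)
)
model.write().overwrite().save("gs://my-bucket/models/ctr")  # hypothetical bucket
```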
Google Cloud Bigtable indexes a single value in each row. This value is called the _______.
(Multiple Choice)
You use BigQuery as your centralized analytics platform. New data is loaded every day, and an ETL pipeline modifies the original data and prepares it for the final users. This ETL pipeline is regularly modified and can generate errors, but sometimes the errors are detected only after 2 weeks. You need to provide a method to recover from these errors, and your backups should be optimized for storage costs. How should you organize your data in BigQuery and store your backups?
(Multiple Choice)
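For recent mistakes, BigQuery time travel can restore a table as of a past timestamp, but its window (about 7 days) is shorter than the 2 weeks mentioned here, which is why longer-lived snapshots or exports matter. A sketch of the time-travel restore, with hypothetical table names:

```python
# Restore a table to its state before a bad ETL run using BigQuery time
# travel. This only works within the ~7-day window; for errors detected
# after 2 weeks you need table snapshots or exports taken at load time.
from google.cloud import bigquery

client = bigquery.Client()

restore_sql = """
CREATE OR REPLACE TABLE `my-project.dwh.sales_restored` AS
SELECT *
FROM `my-project.dwh.sales`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)
"""
client.query(restore_sql).result()
```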
You are a retailer that wants to integrate your online sales capabilities with different in-home assistants, such as Google Home. You need to interpret customer voice commands and issue an order to the backend systems. Which solutions should you choose?
(Multiple Choice)
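Dialogflow is the product family usually associated with this scenario. A minimal sketch of detecting an intent from a customer utterance, with hypothetical project and session IDs:

```python
# Send a customer utterance to a Dialogflow agent and read back the matched
# intent and fulfillment text.
from google.cloud import dialogflow_v2 as dialogflow

session_client = dialogflow.SessionsClient()
session = session_client.session_path("my-project", "session-123")  # hypothetical

query_input = dialogflow.QueryInput(
    text=dialogflow.TextInput(
        text="order two bags of coffee beans", language_code="en-US"
    )
)
response = session_client.detect_intent(
    request={"session": session, "query_input": query_input}
)
print(response.query_result.intent.display_name)
print(response.query_result.fulfillment_text)
```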
You want to archive data in Cloud Storage. Because some data is very sensitive, you want to use the "Trust No One" (TNO) approach to encrypt your data to prevent the cloud provider staff from decrypting your data. What should you do?
(Multiple Choice)
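TNO means the plaintext and the key never reach the provider: encrypt locally with a key only you hold, then upload ciphertext. A minimal sketch using the cryptography library and the Cloud Storage client, with hypothetical names:

```python
# TNO flow: encrypt locally with a key that never leaves your environment,
# then upload only ciphertext to Cloud Storage.
from cryptography.fernet import Fernet
from google.cloud import storage

key = Fernet.generate_key()  # store in your own KMS/HSM, never in GCP
fernet = Fernet(key)

with open("patients.csv", "rb") as f:  # hypothetical file
    ciphertext = fernet.encrypt(f.read())

bucket = storage.Client().bucket("my-archive-bucket")  # hypothetical bucket
bucket.blob("archive/patients.csv.enc").upload_from_string(ciphertext)

# Cloud provider staff only ever see ciphertext; decryption requires `key`,
# which stays on-premises.
```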
You are updating the code for a subscriber to a Pub/Sub feed. You are concerned that upon deployment the subscriber may erroneously acknowledge messages, leading to message loss. Your subscriber is not set up to retain acknowledged messages. What should you do to ensure that you can recover from errors after deployment?
(Multiple Choice)
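This scenario maps to Pub/Sub's snapshot and seek features. A minimal sketch, with hypothetical resource names:

```python
# Take a snapshot of the subscription before deploying the new subscriber;
# if it mis-acknowledges messages, seek the subscription back to the
# snapshot to replay them.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
project = "my-project"  # hypothetical
subscription = f"projects/{project}/subscriptions/orders-sub"  # hypothetical
snapshot = f"projects/{project}/snapshots/pre-deploy"          # hypothetical

# Before deployment:
subscriber.create_snapshot(request={"name": snapshot, "subscription": subscription})

# ... deploy the new subscriber; if messages were wrongly acked ...

# Roll back: replay everything that was unacknowledged at snapshot time.
subscriber.seek(request={"subscription": subscription, "snapshot": snapshot})
```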
You want to automate execution of a multi-step data pipeline running on Google Cloud. The pipeline includes Cloud Dataproc and Cloud Dataflow jobs that have multiple dependencies on each other. You want to use managed services where possible, and the pipeline will run every day. Which tool should you use?
(Multiple Choice)
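Cloud Composer (managed Apache Airflow) is the managed orchestrator usually matched to this scenario. A minimal sketch of a daily DAG chaining a Dataproc job into a Dataflow template run; job specs and names are hypothetical:

```python
# A daily Cloud Composer (Airflow) DAG: a Dataproc Hadoop job followed by a
# Dataflow templated job, with the dependency expressed as transform >> load.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)

with DAG("daily_pipeline", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:

    transform = DataprocSubmitJobOperator(
        task_id="hadoop_transform",
        project_id="my-project",     # hypothetical
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "hadoop_job": {"main_jar_file_uri": "gs://my-bucket/jobs/transform.jar"},
        },
    )

    load = DataflowTemplatedJobStartOperator(
        task_id="load_to_bq",
        project_id="my-project",
        location="us-central1",
        job_name="load-to-bq",
        template="gs://my-bucket/templates/load_template",  # hypothetical
    )

    transform >> load  # Dataflow runs only after the Dataproc job succeeds
```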
You work for a manufacturing company that sources up to 750 different components, each from a different supplier. You've collected a labeled dataset that has on average 1000 examples for each unique component. Your team wants to implement an app to help warehouse workers recognize incoming components based on a photo of the component. You want to implement the first working version of this app (as a proof of concept) within a few working days. What should you do?
(Multiple Choice)
You are selecting services to write and transform JSON messages from Cloud Pub/Sub to BigQuery for a data pipeline on Google Cloud. You want to minimize service costs. You also want to monitor and accommodate input data volume that will vary in size with minimal manual intervention. What should you do?
(Multiple Choice)
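A streaming Apache Beam pipeline run on Dataflow, whose autoscaling absorbs volume changes, is one common fit. A minimal sketch with hypothetical resource names and schema:

```python
# A streaming Beam pipeline (run on Dataflow) that parses JSON from Pub/Sub
# and appends rows to BigQuery; Dataflow autoscaling handles varying volume.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    streaming=True,
    project="my-project",            # hypothetical
    region="us-central1",
    runner="DataflowRunner",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")
        | "Parse" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
        | "Write" >> beam.io.WriteToBigQuery(
            "my-project:events.raw",
            schema="user_id:STRING,event:STRING,ts:TIMESTAMP",  # hypothetical
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```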
Showing questions 181-200 of 256.