Exam 18: Professional Data Engineer on Google Cloud Platform (256 Questions)
Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. During testing, you notice that some messages are missing in the dashboard. You check the logs, and all messages are being published to Cloud Pub/Sub successfully. What should you do next?
(Multiple Choice)
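As a study aid, one way to verify the Pub/Sub leg independently of the Dataflow job is to attach a temporary second subscription to the topic and pull from it directly. A minimal sketch, with hypothetical resource names:

```python
# Pull a few messages through a temporary "tap" subscription to confirm that
# publishes are reaching subscribers, independently of the Dataflow pipeline.
from google.cloud import pubsub_v1

project_id = "my-project"  # hypothetical
topic = f"projects/{project_id}/topics/dashboard-events"      # hypothetical
debug_sub = f"projects/{project_id}/subscriptions/debug-tap"  # hypothetical

subscriber = pubsub_v1.SubscriberClient()
# Only messages published after this point flow to the new subscription.
subscriber.create_subscription(request={"name": debug_sub, "topic": topic})

response = subscriber.pull(request={"subscription": debug_sub, "max_messages": 10})
for received in response.received_messages:
    print(received.message.data.decode("utf-8"))

# Ack so the debug subscription does not accumulate a backlog.
if response.received_messages:
    subscriber.acknowledge(
        request={
            "subscription": debug_sub,
            "ack_ids": [r.ack_id for r in response.received_messages],
        }
    )
```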
When you store data in Cloud Bigtable, what is the recommended minimum amount of stored data?
(Multiple Choice)
Your company is in the process of migrating its on-premises data warehousing solutions to BigQuery. The existing data warehouse uses trigger-based change data capture (CDC) to apply updates from multiple transactional database sources on a daily basis. With BigQuery, your company hopes to improve its handling of CDC so that changes to the source systems are available to query in BigQuery in near-real time using log-based CDC streams, while also optimizing for the performance of applying changes to the data warehouse. Which two steps should they take to ensure that changes are available in the BigQuery reporting table with minimal latency while reducing compute overhead? (Choose two.)
(Multiple Choice)
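For context, the pattern this question probes is usually a staging (delta) table that receives the streamed CDC records, plus a periodic MERGE that applies only the latest change per key to the reporting table. A minimal sketch with hypothetical table and column names:

```python
# Apply streamed CDC deltas to a reporting table with a single MERGE.
from google.cloud import bigquery

client = bigquery.Client()

merge_sql = """
MERGE `my-project.dwh.customers` AS main
USING (
  -- Keep only the latest change per key from the streamed CDC deltas.
  SELECT * EXCEPT(row_num) FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY change_ts DESC) AS row_num
    FROM `my-project.dwh.customers_cdc_delta`
  )
  WHERE row_num = 1
) AS delta
ON main.customer_id = delta.customer_id
WHEN MATCHED AND delta.op = 'D' THEN DELETE
WHEN MATCHED THEN UPDATE SET name = delta.name, email = delta.email
WHEN NOT MATCHED AND delta.op != 'D' THEN
  INSERT (customer_id, name, email)
  VALUES (delta.customer_id, delta.name, delta.email)
"""

client.query(merge_sql).result()  # blocks until the MERGE completes
```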
You want to build a managed Hadoop system as your data lake. The data transformation process is composed of a series of Hadoop jobs executed in sequence. To accomplish the design of separating storage from compute, you decided to use the Cloud Storage connector to store all input data, output data, and intermediary data. However, you noticed that one Hadoop job runs very slowly with Cloud Dataproc, when compared with the on-premises bare-metal Hadoop environment (8-core nodes with 100-GB RAM). Analysis shows that this particular Hadoop job is disk I/O intensive. You want to resolve the issue. What should you do?
(Multiple Choice)
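One remedy commonly considered for disk-I/O-bound Dataproc jobs is attaching local SSDs to the workers so shuffle and intermediate data land on fast local disk. A minimal sketch using the Dataproc client library, with hypothetical project, region, and sizing values:

```python
# Create a Dataproc cluster whose workers have local SSDs attached, so
# disk-I/O-heavy intermediate data avoids slower boot disks.
from google.cloud import dataproc_v1

region = "us-central1"  # hypothetical
client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

cluster = {
    "cluster_name": "io-heavy-cluster",  # hypothetical
    "config": {
        "master_config": {"num_instances": 1, "machine_type_uri": "n1-highmem-16"},
        "worker_config": {
            "num_instances": 4,
            "machine_type_uri": "n1-highmem-16",   # size to match on-prem nodes
            "disk_config": {"num_local_ssds": 2},  # local SSDs for scratch data
        },
    },
}

operation = client.create_cluster(
    request={"project_id": "my-project", "region": region, "cluster": cluster}
)
operation.result()  # wait for cluster creation to finish
```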
You designed a database for patient records as a pilot project to cover a few hundred patients in three clinics. Your design used a single database table to represent all patients and their visits, and you used self-joins to generate reports. The server resource utilization was at 50%. Since then, the scope of the project has expanded. The database must now store 100 times more patient records. You can no longer run the reports, because they either take too long or they encounter errors with insufficient compute resources. How should you adjust the database design?
(Multiple Choice)
You are migrating your data warehouse to BigQuery. You have migrated all of your data into tables in a dataset. Multiple users from your organization will be using the data. They should only see certain tables based on their team membership. How should you set user permissions?
(Multiple Choice)
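A common pattern here is one dataset per team with group-based, dataset-level READER grants. A minimal sketch with the BigQuery client library, assuming hypothetical dataset and group names:

```python
# Grant a team's Google group read access at the dataset level, so members
# only see the tables in their team's dataset.
from google.cloud import bigquery

client = bigquery.Client()
dataset = client.get_dataset("my-project.marketing_dwh")  # hypothetical

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="groupByEmail",
        entity_id="marketing-team@example.com",  # hypothetical group
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```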
Cloud Bigtable is Google's ______ Big Data database service.
(Multiple Choice)
Your company's on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration. What should you do?
(Multiple Choice)
To give a user read permission for only the first three columns of a table, which access control method would you use?
(Multiple Choice)
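The classic mechanism for exposing a column subset is an authorized view that selects only the permitted columns (column-level security with policy tags is the newer alternative). A minimal sketch with hypothetical names:

```python
# Expose only three columns through a view in a separate dataset, then
# authorize that view against the source dataset.
from google.cloud import bigquery

client = bigquery.Client()

# 1. Create the view with only the permitted columns.
view = bigquery.Table("my-project.shared_views.patients_limited")  # hypothetical
view.view_query = """
    SELECT patient_id, first_name, last_name
    FROM `my-project.private.patients`
"""
view = client.create_table(view)

# 2. Authorize the view to read from the source dataset.
source = client.get_dataset("my-project.private")
entries = list(source.access_entries)
entries.append(bigquery.AccessEntry(None, "view", view.reference.to_api_repr()))
source.access_entries = entries
client.update_dataset(source, ["access_entries"])
```

Users then receive READER access only on the dataset containing the view, never on the source dataset itself.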
An online retailer has built their current application on Google App Engine. A new initiative at the company mandates that they extend their application to allow their customers to transact directly via the application. They need to manage their shopping transactions and analyze combined data from multiple datasets using a business intelligence (BI) tool. They want to use only a single database for this purpose. Which Google Cloud database should they choose?
(Multiple Choice)
Each analytics team in your organization is running BigQuery jobs in their own projects. You want to enable each team to monitor slot usage within their projects. What should you do?
(Multiple Choice)
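One self-service option is querying the jobs metadata in each project's INFORMATION_SCHEMA (Cloud Monitoring's slot metrics are another). A minimal sketch, assuming the `region-us` qualifier and a hypothetical project:

```python
# Per-user slot consumption over the last day, derived from jobs metadata.
from google.cloud import bigquery

client = bigquery.Client(project="analytics-team-a")  # hypothetical project

sql = """
SELECT
  user_email,
  SUM(total_slot_ms) / 1000 / 3600 AS slot_hours
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY user_email
ORDER BY slot_hours DESC
"""

for row in client.query(sql).result():
    print(f"{row.user_email}: {row.slot_hours:.1f} slot-hours")
```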
You work for an advertising company, and you've developed a Spark ML model to predict click-through rates at advertisement blocks. You've been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data warehouse will be migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud. What should you do?
(Multiple Choice)
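One lift-and-shift path is running the existing Spark ML pipelines on Dataproc and reading the training data from BigQuery through the spark-bigquery-connector. A minimal sketch with hypothetical table, column, and bucket names:

```python
# A PySpark training job (run on Dataproc) that reads training data from
# BigQuery via the spark-bigquery-connector. Assumes numeric feature/label
# columns; the connector jar must be supplied (e.g. via --jars).
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("ctr-training").getOrCreate()

df = (
    spark.read.format("bigquery")
    .option("table", "my-project.ads.click_events")  # hypothetical table
    .load()
)

features = VectorAssembler(
    inputCols=["block_position", "hour_of_day", "user_ctr_30d"],  # hypothetical
    outputCol="features",
)
model = LogisticRegression(labelCol="clicked", featuresCol="features").fit(
    features.transform(df)
)
model.write().overwrite().save("gs://my-bucket/models/ctr")  # hypothetical bucket
```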
Google Cloud Bigtable indexes a single value in each row. This value is called the _______.
(Multiple Choice)
You use BigQuery as your centralized analytics platform. New data is loaded every day, and an ETL pipeline modifies the original data and prepares it for the final users. This ETL pipeline is regularly modified and can generate errors, but sometimes the errors are detected only after 2 weeks. You need to provide a method to recover from these errors, and your backups should be optimized for storage costs. How should you organize your data in BigQuery and store your backups?
(Multiple Choice)
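For recent mistakes, BigQuery time travel can restore a table as of a past timestamp, but its window (about 7 days) is shorter than the 2 weeks mentioned here, which is why longer-lived snapshots or exports matter. A sketch of the time-travel restore, with hypothetical table names:

```python
# Restore a table to its state before a bad ETL run using BigQuery time
# travel. This only works within the ~7-day window; for errors detected
# after 2 weeks you need table snapshots or exports taken at load time.
from google.cloud import bigquery

client = bigquery.Client()

restore_sql = """
CREATE OR REPLACE TABLE `my-project.dwh.sales_restored` AS
SELECT *
FROM `my-project.dwh.sales`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 3 DAY)
"""
client.query(restore_sql).result()
```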
You are a retailer that wants to integrate your online sales capabilities with different in-home assistants, such as Google Home. You need to interpret customer voice commands and issue an order to the backend systems. Which solutions should you choose?
(Multiple Choice)
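Dialogflow is the product family usually associated with this scenario. A minimal sketch of detecting an intent from a customer utterance, with hypothetical project and session IDs:

```python
# Send a customer utterance to a Dialogflow agent and read back the matched
# intent and fulfillment text.
from google.cloud import dialogflow_v2 as dialogflow

session_client = dialogflow.SessionsClient()
session = session_client.session_path("my-project", "session-123")  # hypothetical

query_input = dialogflow.QueryInput(
    text=dialogflow.TextInput(
        text="order two bags of coffee beans", language_code="en-US"
    )
)
response = session_client.detect_intent(
    request={"session": session, "query_input": query_input}
)
print(response.query_result.intent.display_name)
print(response.query_result.fulfillment_text)
```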
You want to archive data in Cloud Storage. Because some data is very sensitive, you want to use the "Trust No One" (TNO) approach to encrypt your data to prevent the cloud provider staff from decrypting your data. What should you do?
(Multiple Choice)
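TNO means the plaintext and the key never reach the provider: encrypt locally with a key only you hold, then upload ciphertext. A minimal sketch using the cryptography library and the Cloud Storage client, with hypothetical names:

```python
# TNO flow: encrypt locally with a key that never leaves your environment,
# then upload only ciphertext to Cloud Storage.
from cryptography.fernet import Fernet
from google.cloud import storage

key = Fernet.generate_key()  # store in your own KMS/HSM, never in GCP
fernet = Fernet(key)

with open("patients.csv", "rb") as f:  # hypothetical file
    ciphertext = fernet.encrypt(f.read())

bucket = storage.Client().bucket("my-archive-bucket")  # hypothetical bucket
bucket.blob("archive/patients.csv.enc").upload_from_string(ciphertext)

# Cloud provider staff only ever see ciphertext; decryption requires `key`,
# which stays on-premises.
```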
You are updating the code for a subscriber to a Pub/Sub feed. You are concerned that upon deployment the subscriber may erroneously acknowledge messages, leading to message loss. Your subscriber is not set up to retain acknowledged messages. What should you do to ensure that you can recover from errors after deployment?
(Multiple Choice)
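This scenario maps to Pub/Sub's snapshot and seek features. A minimal sketch, with hypothetical resource names:

```python
# Take a snapshot of the subscription before deploying the new subscriber;
# if it mis-acknowledges messages, seek the subscription back to the
# snapshot to replay them.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
project = "my-project"  # hypothetical
subscription = f"projects/{project}/subscriptions/orders-sub"  # hypothetical
snapshot = f"projects/{project}/snapshots/pre-deploy"          # hypothetical

# Before deployment:
subscriber.create_snapshot(request={"name": snapshot, "subscription": subscription})

# ... deploy the new subscriber; if messages were wrongly acked ...

# Roll back: replay everything that was unacknowledged at snapshot time.
subscriber.seek(request={"subscription": subscription, "snapshot": snapshot})
```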
You want to automate execution of a multi-step data pipeline running on Google Cloud. The pipeline includes Cloud Dataproc and Cloud Dataflow jobs that have multiple dependencies on each other. You want to use managed services where possible, and the pipeline will run every day. Which tool should you use?
(Multiple Choice)
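Cloud Composer (managed Apache Airflow) is the managed orchestrator usually matched to this scenario. A minimal sketch of a daily DAG chaining a Dataproc job into a Dataflow template run; job specs and names are hypothetical:

```python
# A daily Cloud Composer (Airflow) DAG: a Dataproc Hadoop job followed by a
# Dataflow templated job, with the dependency expressed as transform >> load.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)

with DAG("daily_pipeline", start_date=datetime(2024, 1, 1),
         schedule_interval="@daily", catchup=False) as dag:

    transform = DataprocSubmitJobOperator(
        task_id="hadoop_transform",
        project_id="my-project",     # hypothetical
        region="us-central1",
        job={
            "placement": {"cluster_name": "etl-cluster"},
            "hadoop_job": {"main_jar_file_uri": "gs://my-bucket/jobs/transform.jar"},
        },
    )

    load = DataflowTemplatedJobStartOperator(
        task_id="load_to_bq",
        project_id="my-project",
        location="us-central1",
        job_name="load-to-bq",
        template="gs://my-bucket/templates/load_template",  # hypothetical
    )

    transform >> load  # Dataflow runs only after the Dataproc job succeeds
```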
You work for a manufacturing company that sources up to 750 different components, each from a different supplier. You've collected a labeled dataset that has on average 1000 examples for each unique component. Your team wants to implement an app to help warehouse workers recognize incoming components based on a photo of the component. You want to implement the first working version of this app (as a proof of concept) within a few working days. What should you do?
(Multiple Choice)
You are selecting services to write and transform JSON messages from Cloud Pub/Sub to BigQuery for a data pipeline on Google Cloud. You want to minimize service costs. You also want to monitor and accommodate input data volume that will vary in size with minimal manual intervention. What should you do?
(Multiple Choice)
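A streaming Apache Beam pipeline run on Dataflow, whose autoscaling absorbs volume changes, is one common fit. A minimal sketch with hypothetical resource names and schema:

```python
# A streaming Beam pipeline (run on Dataflow) that parses JSON from Pub/Sub
# and appends rows to BigQuery; Dataflow autoscaling handles varying volume.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    streaming=True,
    project="my-project",            # hypothetical
    region="us-central1",
    runner="DataflowRunner",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/events-sub")
        | "Parse" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
        | "Write" >> beam.io.WriteToBigQuery(
            "my-project:events.raw",
            schema="user_id:STRING,event:STRING,ts:TIMESTAMP",  # hypothetical
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```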
Showing questions 181-200 of 256.