Exam 18: Professional Data Engineer on Google Cloud Platform
256 Questions
You have historical data covering the last three years in BigQuery and a data pipeline that delivers new data to BigQuery daily. You have noticed that when the Data Science team runs a query filtered on a date column and limited to 30-90 days of data, the query scans the entire table. You also noticed that your bill is increasing more quickly than you expected. You want to resolve the issue as cost-effectively as possible while maintaining the ability to conduct SQL queries. What should you do?
(Multiple Choice)
What are two of the benefits of using denormalized data structures in BigQuery?
(Multiple Choice)
You want to use a BigQuery table as a data sink. In which writing mode(s) can you use BigQuery as a sink?
(Multiple Choice)
Cloud Dataproc charges you only for what you really use with _____ billing.
(Multiple Choice)
You need to move 2 PB of historical data from an on-premises storage appliance to Cloud Storage within six months, and your outbound network capacity is constrained to 20 Mb/sec. How should you migrate this data to Cloud Storage?
(Multiple Choice)
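The figures in the migration question above (2 PB, 20 Mb/sec, six months) can be sanity-checked with a quick calculation. This is only a rough sketch: it assumes decimal units (1 PB = 10**15 bytes, 1 Mb = 10**6 bits) and a link that is fully utilised around the clock, neither of which the question states, and the function name `transfer_days` is made up for illustration.

```python
# Back-of-the-envelope transfer time for the migration scenario.
# Assumptions (not stated in the question): decimal units and a link
# that is saturated 24 hours a day with no protocol overhead.

DATA_BYTES = 2 * 10**15          # 2 PB
LINK_BITS_PER_SEC = 20 * 10**6   # 20 Mb/sec
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(data_bytes: int, link_bits_per_sec: int) -> float:
    """Return the number of days needed to send data_bytes over the link."""
    seconds = data_bytes * 8 / link_bits_per_sec
    return seconds / SECONDS_PER_DAY


days = transfer_days(DATA_BYTES, LINK_BITS_PER_SEC)
print(f"{days:,.0f} days, about {days / 365:.1f} years")
# prints "9,259 days, about 25.4 years"
```

Under these assumptions the transfer would take far longer than the six-month window (roughly 183 days), which is the constraint the question is probing.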
You store historic data in Cloud Storage. You need to perform analytics on the historic data. You want to use a solution to detect invalid data entries and perform data transformations that will not require programming or knowledge of SQL. What should you do?
(Multiple Choice)
The CUSTOM tier for Cloud Machine Learning Engine allows you to specify the number of which types of cluster nodes?
(Multiple Choice)
As your organization expands its usage of GCP, many teams have started to create their own projects. Projects are further multiplied to accommodate different stages of deployments and target audiences. Each project requires unique access control configurations. The central IT team needs to have access to all projects. Furthermore, data from Cloud Storage buckets and BigQuery datasets must be shared for use in other projects in an ad hoc way. You want to simplify access control management by minimizing the number of policies. Which two steps should you take? (Choose two.)
(Multiple Choice)
To run a TensorFlow training job on your own computer using Cloud Machine Learning Engine, what would your command start with?
(Multiple Choice)
Your company is performing data preprocessing for a learning algorithm in Google Cloud Dataflow. Numerous data logs are being generated during this step, and the team wants to analyze them. Due to the dynamic nature of the campaign, the data is growing exponentially every hour. The data scientists have written the following code to read the data for new key features in the logs: BigQueryIO.Read.named("ReadLogData").from("clouddataflow-readonly:samples.log_data") You want to improve the performance of this data read. What should you do?
(Multiple Choice)
You're using Bigtable for a real-time application, and you have a heavy load that is a mix of reads and writes. You've recently identified an additional use case and need to run an hourly analytical job to calculate certain statistics across the whole database. You need to ensure the reliability of both your production application and the analytical workload. What should you do?
(Multiple Choice)
You are operating a Cloud Dataflow streaming pipeline. The pipeline aggregates events from a Cloud Pub/Sub subscription source, within a window, and sinks the resulting aggregation to a Cloud Storage bucket. The source has consistent throughput. You want to monitor and alert on the behavior of the pipeline with Cloud Stackdriver to ensure that it is processing data. Which Stackdriver alerts should you create?
(Multiple Choice)
You are building a new application that you need to collect data from in a scalable way. Data arrives continuously from the application throughout the day, and you expect to generate approximately 150 GB of JSON data per day by the end of the year. Your requirements are:
- Decoupling producer from consumer
- Space and cost-efficient storage of the raw ingested data, which is to be stored indefinitely
- Near real-time SQL query
- Maintain at least 2 years of historical data, which will be queried with SQL
Which pipeline should you use to meet these requirements?
(Multiple Choice)
You want to analyze hundreds of thousands of social media posts daily at the lowest cost and with the fewest steps. You have the following requirements:
- You will batch-load the posts once per day and run them through the Cloud Natural Language API.
- You will extract topics and sentiment from the posts.
- You must store the raw posts for archiving and reprocessing.
- You will create dashboards to be shared with people both inside and outside your organization.
You need to store both the data extracted from the API to perform analysis as well as the raw social media posts for historical archiving. What should you do?
(Multiple Choice)
Which of the following is NOT one of the three main types of triggers that Dataflow supports?
(Multiple Choice)
You create an important report for your large team in Google Data Studio 360. The report uses Google BigQuery as its data source. You notice that visualizations are not showing data that is less than 1 hour old. What should you do?
(Multiple Choice)
The Dataflow SDKs have been recently transitioned into which Apache service?
(Multiple Choice)
You are deploying a new storage system for your mobile application, which is a media streaming service. You decide the best fit is Google Cloud Datastore. You have entities with multiple properties, some of which can take on multiple values. For example, in the entity 'Movie' the property 'actors' and the property 'tags' have multiple values but the property 'date_released' does not. A typical query would ask for all movies with actor=<actorname> ordered by date_released, or all movies with tag=Comedy ordered by date_released. How should you avoid a combinatorial explosion in the number of indexes?
(Multiple Choice)
You are designing a data processing pipeline. The pipeline must be able to scale automatically as load increases. Messages must be processed at least once and must be ordered within windows of 1 hour. How should you design the solution?
(Multiple Choice)
You are working on a niche product in the image recognition domain. Your team has developed a model that is dominated by custom C++ TensorFlow ops your team has implemented. These ops are used inside your main training loop and are performing bulky matrix multiplications. It currently takes up to several days to train a model. You want to decrease this time significantly and keep the cost low by using an accelerator on Google Cloud. What should you do?
(Multiple Choice)
Showing 101 - 120 of 256