Exam 2: Cloudera Certified Developer for Apache Hadoop (CCDH), 36 Questions
Which best describes what the map method accepts and emits?
(Multiple Choice)
Correct Answer:
D
What data does a Reducer's reduce method process?
(Multiple Choice)
Correct Answer:
C
Workflows expressed in Oozie can contain:
(Multiple Choice)
Correct Answer:
A
You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each of these characters, you will emit the character as the key and an IntWritable as the value. Because this produces proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?
(Multiple Choice)
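The map step described in this question can be sketched in plain Java, without any Hadoop dependency. This is only an illustration under stated assumptions: the class name `CharFrequencyMapSketch` and its `map` helper are hypothetical, and a `Map.Entry<Character, Integer>` stands in for the (Text, IntWritable) pair a real Mapper would emit.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class CharFrequencyMapSketch {

    // Hypothetical stand-in for the Mapper's map method:
    // emits one (character, 1) pair for every character in the input line.
    static List<Map.Entry<Character, Integer>> map(String line) {
        List<Map.Entry<Character, Integer>> out = new ArrayList<>();
        for (char c : line.toCharArray()) {
            out.add(new SimpleEntry<Character, Integer>(c, 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "hadoop";
        List<Map.Entry<Character, Integer>> pairs = map(line);
        // Each input character becomes a full key-value record.
        System.out.println(line.length() + " input characters, "
                + pairs.size() + " intermediate records");
    }
}
```

Every input character turns into a complete key-value record, which is why the intermediate data is proportionally larger than the input.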
What is the disadvantage of using multiple reducers with the default HashPartitioner and distributing your workload across your cluster?
(Multiple Choice)
In the reducer, the MapReduce API provides you with an iterator over Writable values. What does calling the next() method return?
(Multiple Choice)
Which best describes how TextInputFormat processes input files and line breaks?
(Multiple Choice)
In a MapReduce job with 500 map tasks, how many map task attempts will there be?
(Multiple Choice)
Analyze each scenario below and identify which best describes the behavior of the default partitioner.
(Multiple Choice)
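For reference, the default partitioner (HashPartitioner) picks a reducer from the key's hash code modulo the number of reduce tasks, after masking off the sign bit. The sketch below reproduces that arithmetic in plain Java. The class name `HashPartitionSketch` is hypothetical, and it hashes `String` keys rather than Hadoop `Text` keys, so the partition numbers it prints will not necessarily match a real cluster.

```java
public class HashPartitionSketch {

    // Same arithmetic as HashPartitioner.getPartition:
    // clear the sign bit so the result is never negative, then take the modulus.
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // assumed reducer count, for illustration only
        for (String key : new String[] {"Apple", "Banana", "Cherry", "Apple"}) {
            // Equal keys always hash to the same partition.
            System.out.println(key + " -> partition " + partition(key, numReduceTasks));
        }
    }
}
```

Because the partition depends only on the key's hash and the reducer count, all records with the same key go to the same reducer, while the spread of different keys across reducers is only as even as the hash function makes it.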
In a large MapReduce job with m mappers and n reducers, how many distinct copy operations will there be in the sort/shuffle phase?
(Multiple Choice)
The Hadoop framework provides a mechanism for coping with machine issues such as faulty configuration or impending hardware failure. MapReduce detects that one or more machines are performing poorly and starts additional copies of a map or reduce task. All the copies run simultaneously, and the results of whichever copy finishes first are used. This is called:
(Multiple Choice)
You want to perform analysis on a large collection of images. You want to store this data in HDFS and process it with MapReduce but you also want to give your data analysts and data scientists the ability to process the data directly from HDFS with an interpreted high-level programming language like Python. Which format should you use to store this data in HDFS?
(Multiple Choice)
Given a directory of files with the following structure: line number, tab character, string. Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to complete the line: conf.setInputFormat(____.class); ?
(Multiple Choice)
You have written a Mapper which invokes the following five calls to the OutputCollector.collect method:
output.collect(new Text("Apple"), new Text("Red"));
output.collect(new Text("Banana"), new Text("Yellow"));
output.collect(new Text("Apple"), new Text("Yellow"));
output.collect(new Text("Cherry"), new Text("Red"));
output.collect(new Text("Apple"), new Text("Green"));
How many times will the Reducer's reduce method be invoked?
(Multiple Choice)
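The grouping that happens between map and reduce can be simulated in plain Java. This is a sketch, not Hadoop code: the class name `ReduceCallsSketch` and the `groupByKey` helper are hypothetical, `String` stands in for `Text`, and a sorted map stands in for the shuffle and sort phase. The framework calls reduce once per distinct key, so the number of groups equals the number of reduce invocations across all reducers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceCallsSketch {

    // Hypothetical stand-in for shuffle and sort:
    // collects every value emitted under the same key into one sorted group.
    static Map<String, List<String>> groupByKey(String[][] pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The five (key, value) pairs emitted by the Mapper in the question.
        String[][] emitted = {
            {"Apple", "Red"}, {"Banana", "Yellow"}, {"Apple", "Yellow"},
            {"Cherry", "Red"}, {"Apple", "Green"}
        };
        Map<String, List<String>> grouped = groupByKey(emitted);
        // One reduce call per entry: the key plus an iterator over its values.
        for (Map.Entry<String, List<String>> group : grouped.entrySet()) {
            System.out.println("reduce(" + group.getKey() + ", " + group.getValue() + ")");
        }
    }
}
```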
Identify the MapReduce v2 (MRv2 / YARN) daemon responsible for launching application containers and monitoring application resource usage.
(Multiple Choice)
To process input key-value pairs, your mapper needs to load a 512 MB data file into memory. What is the best way to accomplish this?
(Multiple Choice)
What types of algorithms are difficult to express in MapReduce v1 (MRv1)?
(Multiple Choice)