Multiple Choice
You need to create a job that does frequency analysis on input data. You will do this by writing a Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into individual characters. For each of these characters, you will emit the character as a key and an IntWritable as the value. Because this will produce proportionally more intermediate data than input data, which two resources should you expect to be bottlenecks?
A) Processor and network I/O
B) Disk I/O and network I/O
C) Processor and RAM
D) Processor and disk I/O
Correct Answer:

Verified
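To make the "proportionally more intermediate data" claim in the question concrete, here is a rough plain-Python simulation of the map step. It is only a sketch and does not use the Hadoop API: `map_line` and `intermediate_ratio` are hypothetical helper names, and the 4-byte value size is an assumption standing in for the payload of a 32-bit integer. Real Hadoop serialization adds further per-record overhead that is ignored here.

```python
def map_line(line):
    """Simulate the Mapper: emit (character, 1) for every character in the line."""
    for ch in line:
        yield ch, 1


def intermediate_ratio(lines, value_bytes=4):
    """Return (input_bytes, intermediate_bytes) for the simulated map phase.

    value_bytes=4 is an assumption: the payload size of a 32-bit integer.
    Per-record framing and serialization overhead are not counted.
    """
    input_bytes = 0
    intermediate_bytes = 0
    for line in lines:
        input_bytes += len(line.encode("utf-8"))
        for key, _ in map_line(line):
            # Each emitted record carries the key bytes plus the integer value.
            intermediate_bytes += len(key.encode("utf-8")) + value_bytes
    return input_bytes, intermediate_bytes


sample = ["hello world", "frequency analysis"]
in_bytes, out_bytes = intermediate_ratio(sample)
print(in_bytes, out_bytes, out_bytes / in_bytes)
```

Under these assumptions, every single-byte input character turns into a five-byte record, so the map output for ASCII text is about five times the size of the input before any framing overhead is counted.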
Q1: Which best describes what the map method
Q2: What data does a Reducer reduce method
Q3: Workflows expressed in Oozie can contain: A) Sequences
Q4: Table metadata in Hive is: A) Stored as
Q6: What is the disadvantage of using multiple
Q7: In the reducer, the MapReduce API provides
Q8: Which best describes how TextInputFormat processes input
Q9: In a MapReduce job with 500 map
Q10: Analyze each scenario below and identify which
Q11: In a large MapReduce job with m