This set of Hadoop Questions & Answers for experienced users focuses on “MapReduce Types”.
1. ___________ generates keys of type LongWritable and values of type Text.
a) TextOutputFormat
b) TextInputFormat
c) OutputInputFormat
d) None of the mentioned
Explanation: If K2 and K3 are the same, you don’t need to call setMapOutputKeyClass().
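As a rough illustration of question 1, TextInputFormat pairs each line's starting byte offset (the LongWritable key) with the line's contents (the Text value). The sketch below imitates that in plain Java without any Hadoop dependency. The class name LineRecordSketch is hypothetical, and the code assumes ASCII input with '\n' line endings so that character positions equal byte offsets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the (LongWritable, Text) pairs; not a Hadoop class.
final class LineRecordSketch {
    final long offset;  // byte offset of the start of the line (the key)
    final String line;  // line contents without the terminator (the value)

    LineRecordSketch(long offset, String line) {
        this.offset = offset;
        this.line = line;
    }

    // Splits ASCII text on '\n' and pairs each line with its starting offset.
    static List<LineRecordSketch> read(String text) {
        List<LineRecordSketch> records = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = text.indexOf('\n', start);
            if (end < 0) {
                end = text.length(); // last line has no terminator
            }
            records.add(new LineRecordSketch(start, text.substring(start, end)));
            start = end + 1; // skip past the newline
        }
        return records;
    }
}
```

Note that the keys are offsets within the file, not line numbers.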
2. Point out the correct statement.
a) The reduce input must have the same types as the map output, although the reduce output types may be different again
b) The map input key and value types (K1 and V1) are different from the map output types
c) The partition function operates on the intermediate key
d) All of the mentioned
Explanation: In practice, the partition is determined solely by the key (the value is ignored).
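The point about the value being ignored can be sketched with the usual hash-partitioning formula. This is a minimal plain-Java illustration, not Hadoop's own class. The name KeyPartitionSketch is hypothetical, and the value parameter is kept only to show that it plays no part in the result.

```java
// Hypothetical illustration of key-based partitioning; not a Hadoop class.
final class KeyPartitionSketch {
    // Only the key's hash decides the partition; the value is never read.
    static <K, V> int getPartition(K key, V value, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Masking with Integer.MAX_VALUE clears the sign bit so the result is non-negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Because the partition depends only on the key, all records with the same key reach the same reducer.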
3. In _____________ the default job is similar, but not identical, to the Java equivalent.
a) MapReduce
b) Streaming
c) Orchestration
d) All of the mentioned
Explanation: MapReduce has a simple model of data processing.
4. An input _________ is a chunk of the input that is processed by a single map.
a) textformat
b) split
c) datanode
d) all of the mentioned
Explanation: Each split is divided into records, and the map processes each record—a key-value pair—in turn.
5. Point out the wrong statement.
a) If V2 and V3 are the same, you only need to use setOutputValueClass()
b) The overall effect of the default Streaming job is to perform a sort of the input
c) A Streaming application can control the separator that is used when a key-value pair is turned into a series of bytes and sent to the map or reduce process over standard input
d) None of the mentioned
Explanation: If a combine function is used, then it has the same form as the reduce function, except that its output types are the intermediate key and value types (K2 and V2), so they can feed the reduce function.
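The relationship between combine and reduce can be sketched with a word count in plain Java. Here the intermediate types K2 and V2 are String and Integer, and one summing function serves as both combiner and reducer because its input and output types match. The class and method names are hypothetical and none of this uses the Hadoop API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical word-count sketch; not a Hadoop class.
final class CombinerSketch {
    // Convenience factory for an intermediate (K2, V2) pair.
    static Map.Entry<String, Integer> pair(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Sums the values for each key. Input and output are both (String, Integer)
    // pairs, so the function can act as a combiner and as a reducer.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            totals.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return totals;
    }

    // Combines each map task's output locally, then reduces the combined pairs.
    static Map<String, Integer> reduceWithCombiner(
            List<List<Map.Entry<String, Integer>>> mapOutputs) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> oneMapOutput : mapOutputs) {
            combined.addAll(sumByKey(oneMapOutput).entrySet());
        }
        return sumByKey(combined);
    }
}
```

In this sketch the final totals are the same with or without the combine step; the combiner only reduces how many pairs reach the reducer.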
6. An ___________ is responsible for creating the input splits, and dividing them into records.
a) TextOutputFormat
b) TextInputFormat
c) OutputInputFormat
d) InputFormat
Explanation: As a MapReduce application writer, you don’t need to deal with InputSplits directly, as they are created by an InputFormat.
7. ______________ is another implementation of the MapRunnable interface that runs mappers concurrently in a configurable number of threads.
a) MultithreadedRunner
b) MultithreadedMap
c) MultithreadedMapRunner
d) SinglethreadedMapRunner
Explanation: A RecordReader is little more than an iterator over records, and the map task uses one to generate record key-value pairs, which it passes to the map function.
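The idea behind a multithreaded map runner, feeding records to the map function from a configurable number of threads, can be sketched with a standard Java thread pool. This is only an illustration of the concept: ThreadedMapRunnerSketch is a hypothetical name, the records come from a plain list rather than a RecordReader, and the one-minute timeout is an arbitrary assumption.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical concurrent runner; not Hadoop's MultithreadedMapRunner.
final class ThreadedMapRunnerSketch {
    // Applies mapFn to every record using a pool of numThreads workers.
    // The order of the collected output is not guaranteed.
    static <R, O> Queue<O> run(List<R> records, Function<R, O> mapFn, int numThreads) {
        Queue<O> output = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        for (R record : records) {
            pool.execute(() -> output.add(mapFn.apply(record)));
        }
        pool.shutdown(); // accept no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("map tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for map tasks", e);
        }
        return output;
    }
}
```

A runner like this only helps when the map function is thread-safe and spends much of its time waiting, for example on external I/O.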
8. Which of the following is the default way of running mappers?
a) MapReducer
b) MapRunner
c) MapRed
d) All of the mentioned
Explanation: Having calculated the splits, the client sends them to the jobtracker.
9. _________ is the base class for all implementations of InputFormat that use files as their data source.
a) FileTextFormat
b) FileInputFormat
c) FileOutputFormat
d) None of the mentioned
Explanation: FileInputFormat provides implementation for generating splits for the input files.
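How splits might be generated for a file can be sketched as follows. The sketch assumes the commonly described rule that the split size is the block size clamped between a configured minimum and maximum, and then cuts the file into consecutive ranges. SplitSketch and its methods are hypothetical; details such as block locations and unsplittable files are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical split calculator; not Hadoop's FileInputFormat.
final class SplitSketch {
    // Clamps the block size between the minimum and maximum split sizes.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns {offset, length} pairs that together cover fileLength bytes.
    static List<long[]> splits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += splitSize) {
            long length = Math.min(splitSize, fileLength - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }
}
```

With a 300-byte file and a 128-byte split size, this yields three splits, the last one covering the remaining 44 bytes.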
10. Which of the following methods adds a path or paths to the list of inputs?
a) setInputPaths()
b) addInputPath()
c) setInput()
d) none of the mentioned
Explanation: FileInputFormat offers four static convenience methods for setting a JobConf's input paths.
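The difference between adding and setting input paths can be sketched in plain Java: an add method appends to the existing list, while a set method replaces the whole list in one go. The sketch models the configuration as a simple map holding a comma-separated string. The class name InputPathsSketch and the property key "input.dir" are hypothetical, and path escaping is left out.

```java
import java.util.Map;

// Hypothetical model of add versus set semantics; not a Hadoop class.
final class InputPathsSketch {
    static final String INPUT_DIR_KEY = "input.dir"; // hypothetical property name

    // Replaces any previously configured paths with the given ones.
    static void setInputPaths(Map<String, String> conf, String... paths) {
        conf.put(INPUT_DIR_KEY, String.join(",", paths));
    }

    // Appends one path to whatever is already configured.
    static void addInputPath(Map<String, String> conf, String path) {
        String existing = conf.get(INPUT_DIR_KEY);
        boolean empty = existing == null || existing.isEmpty();
        conf.put(INPUT_DIR_KEY, empty ? path : existing + "," + path);
    }
}
```

Calling addInputPath twice leaves both paths configured, whereas a later setInputPaths call discards them in favor of its own arguments.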
Sanfoundry Global Education & Learning Series – Hadoop.