This set of Hadoop Questions & Answers for freshers focuses on “MapReduce Features – 1”.
1. Which of the following is the default Partitioner for MapReduce?
a) MergePartitioner
b) HashedPartitioner
c) HashPartitioner
d) None of the mentioned
Explanation: HashPartitioner is the default Partitioner. It derives the partition from the hash code of the key, and the total number of partitions is the same as the number of reduce tasks for the job.
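As a rough illustration of how a hash-based partitioner maps keys onto reduce tasks, the Python sketch below mirrors the formula used by HashPartitioner, (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The helper names are hypothetical, and java_string_hash imitates Java's String.hashCode only as a stand-in; Hadoop's own key types compute their hash codes differently.

```python
def java_string_hash(s: str) -> int:
    """Imitate Java's String.hashCode for BMP characters (signed 32-bit result)."""
    h = 0
    for ch in s:
        # Wrap to 32 bits, the way Java int arithmetic overflows.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key_hash: int, num_reduce_tasks: int) -> int:
    """Clear the sign bit, then take the remainder.

    The result always lies in [0, num_reduce_tasks), so there are exactly
    as many partitions as reduce tasks.
    """
    return (key_hash & 0x7FFFFFFF) % num_reduce_tasks


if __name__ == "__main__":
    for key in ["apple", "banana", "cherry"]:
        print(key, hash_partition(java_string_hash(key), 4))
```

Clearing the sign bit before the modulo keeps a negative hash code from producing a negative partition number.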
2. Point out the correct statement.
a) The right number of reduces seems to be 0.95 or 1.75
b) Increasing the number of reduces increases the framework overhead
c) With 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish
d) All of the mentioned
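Explanation: The factors 0.95 and 1.75 are multiplied by (number of nodes * maximum number of containers per node) to obtain the number of reduces. With 0.95 all of the reduces can launch immediately and start transferring map outputs as the maps finish. With 1.75 the faster nodes finish their first round of reduces and launch a second wave of reduces, which does a much better job of load balancing.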
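The arithmetic behind the 0.95 and 1.75 factors can be sketched as follows. This is only an illustration: the function name is hypothetical, the factor is passed as an integer percentage to avoid floating-point surprises, and rounding down is an assumption of this sketch rather than a documented rule.

```python
def suggested_reduces(nodes: int, containers_per_node: int, factor_percent: int) -> int:
    """Return factor * (nodes * containers_per_node), rounded down.

    factor_percent is 95 for the 0.95 factor and 175 for the 1.75 factor.
    """
    slots = nodes * containers_per_node
    return (slots * factor_percent) // 100


if __name__ == "__main__":
    # A hypothetical cluster of 10 nodes with 4 containers each.
    print(suggested_reduces(10, 4, 95))   # single wave of reduces
    print(suggested_reduces(10, 4, 175))  # leaves room for a second wave
```

With the lower factor the reduce count stays below the number of available slots, which is why every reduce can start at once.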
3. Which of the following partitions the key space?
a) Partitioner
b) Compactor
c) Collector
d) All of the mentioned
Explanation: Partitioner controls the partitioning of the keys of the intermediate map-outputs.
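A partitioner may derive the partition from only part of the key. The sketch below is a hypothetical Python analogue, not Hadoop code: it assumes keys of the form "prefix:rest" and partitions on the prefix alone, using zlib.crc32 as a deterministic stand-in for a hash function.

```python
import zlib


def prefix_partition(key: str, num_reduce_tasks: int, sep: str = ":") -> int:
    """Partition on the part of the key before `sep` (the whole key if absent)."""
    prefix = key.split(sep, 1)[0]
    # crc32 returns a non-negative integer, so the remainder is a valid partition.
    return zlib.crc32(prefix.encode("utf-8")) % num_reduce_tasks


if __name__ == "__main__":
    for key in ["user1:click", "user1:view", "user2:click"]:
        print(key, prefix_partition(key, 4))
```

Keys that share a prefix always land in the same partition, so one reduce task sees all records for that prefix.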
4. ____________ is a generalization of the facility provided by the MapReduce framework to collect data output by the Mapper or the Reducer.
a) OutputCompactor
b) OutputCollector
c) InputCollector
d) All of the mentioned
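Explanation: OutputCollector collects the data output by the Mapper or the Reducer, either the intermediate outputs or the output of the job. Hadoop MapReduce also comes bundled with a library of generally useful mappers, reducers, and partitioners.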
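The collector idea can be illustrated with a toy Python analogue. The class and function names here are hypothetical; only the collect(key, value) shape follows the OutputCollector interface.

```python
class ListCollector:
    """Toy collector that stores every emitted (key, value) pair in a list."""

    def __init__(self):
        self.pairs = []

    def collect(self, key, value):
        # The map or reduce function hands its output to the collector
        # instead of writing it anywhere itself.
        self.pairs.append((key, value))


def word_count_map(line: str, output: ListCollector) -> None:
    """Emit (word, 1) for every whitespace-separated word in the line."""
    for word in line.split():
        output.collect(word, 1)


if __name__ == "__main__":
    collector = ListCollector()
    word_count_map("to be or not to be", collector)
    print(collector.pairs)
```

Because the map function only talks to the collector, the same function works whether its output is intermediate data or the final output of the job.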
5. Point out the wrong statement.
a) It is legal to set the number of reduce-tasks to zero if no reduction is desired
b) The outputs of the map-tasks go directly to the FileSystem
c) The MapReduce framework does not sort the map-outputs before writing them out to the FileSystem
d) None of the mentioned
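Explanation: All three statements hold when the number of reduce-tasks is set to zero. In that case the outputs of the map-tasks go directly to the FileSystem, into the output path set by setOutputPath(Path), and the framework does not sort them first.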
6. __________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution.
a) JobConfig
b) JobConf
c) JobConfiguration
d) All of the mentioned
Explanation: JobConf is typically used to specify the Mapper, combiner (if any), Partitioner, Reducer, InputFormat, OutputFormat and OutputCommitter implementations.
7. The ___________ executes the Mapper/Reducer task as a child process in a separate JVM.
a) JobTracker
b) TaskTracker
c) TaskScheduler
d) None of the mentioned
Explanation: The TaskTracker executes the Mapper/Reducer task as a child process in a separate JVM. The child-task inherits the environment of the parent TaskTracker.
8. Maximum virtual memory of the launched child-task is specified using _________
a) mapv
b) mapred
c) mapvim
d) All of the mentioned
Explanation: Admins can specify the maximum virtual memory of the launched child-task, and of any sub-process it launches recursively, using a property under the mapred prefix.
9. Which of the following parameters is the threshold for the accounting and serialization buffers?
a) io.sort.spill.percent
b) io.sort.record.percent
c) io.sort.mb
d) None of the mentioned
Explanation: io.sort.spill.percent is the threshold for the accounting and serialization buffers. When this percentage of either buffer has filled, its contents are spilled to disk in the background.
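To see how the three io.sort.* parameters in the options interact, here is a small Python sketch. It assumes that io.sort.mb is the total buffer size in megabytes, that io.sort.record.percent is the share of that buffer reserved for accounting, and that each record costs 16 bytes of accounting space. The function name and the argument values are illustrative assumptions, not a statement of any cluster's configuration.

```python
def spill_thresholds(io_sort_mb: int, record_percent: float, spill_percent: float) -> dict:
    """Estimate when a background spill would start for each buffer."""
    total_bytes = io_sort_mb * 1024 * 1024
    accounting_bytes = round(total_bytes * record_percent)
    serialization_bytes = total_bytes - accounting_bytes
    # Assumed cost of 16 bytes of accounting information per record.
    max_records = accounting_bytes // 16
    return {
        # Spill once this many records fill the accounting buffer...
        "record_limit": round(max_records * spill_percent),
        # ...or once this many serialized bytes fill the serialization buffer.
        "byte_limit": round(serialization_bytes * spill_percent),
    }


if __name__ == "__main__":
    print(spill_thresholds(io_sort_mb=100, record_percent=0.05, spill_percent=0.80))
```

Whichever limit is reached first triggers the spill, so many small records tend to exhaust the accounting buffer while a few large ones exhaust the serialization buffer.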
10. ______________ is the percentage of memory relative to the maximum heap size in which map outputs may be retained during the reduce.
a) mapred.job.shuffle.merge.percent
b) mapred.job.reduce.input.buffer.percent
c) mapred.inmem.merge.threshold
d) io.sort.factor
Explanation: mapred.job.reduce.input.buffer.percent sets this limit. When the reduce begins, map outputs are merged to disk until those that remain are under the resource limit it defines.
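The effect of this limit can be sketched in Python. The function name is hypothetical, and the order in which segments are merged to disk (oldest first) is an assumption of this sketch; the explanation above only says that merging continues until the remainder is under the limit.

```python
def merge_until_under_limit(segment_sizes, max_heap_bytes, input_buffer_percent):
    """Split in-memory map outputs into (retained, merged_to_disk) lists.

    Segments are moved to disk, oldest first, until the total size of what
    stays in memory is no larger than max_heap_bytes * input_buffer_percent.
    """
    limit = max_heap_bytes * input_buffer_percent
    retained = list(segment_sizes)
    merged_to_disk = []
    while retained and sum(retained) > limit:
        merged_to_disk.append(retained.pop(0))
    return retained, merged_to_disk


if __name__ == "__main__":
    # Hypothetical segment sizes in bytes with a 100-byte heap for readability.
    print(merge_until_under_limit([40, 30, 20, 10], 100, 0.5))
```

A value of 0.0 means nothing is retained, so every map output is merged to disk before the reduce starts.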
Sanfoundry Global Education & Learning Series – Hadoop.