Hadoop Questions and Answers – MapReduce Features – 2

This set of Interview Questions & Answers focuses on “MapReduce Features – 2”.

1. ____________ specifies the number of segments on disk to be merged at the same time.
a) mapred.job.shuffle.merge.percent
b) mapred.job.reduce.input.buffer.percent
c) mapred.inmem.merge.threshold
d) io.sort.factor

Answer: d
Explanation: io.sort.factor limits the number of open files and compression codecs during the merge. If the number of files exceeds this limit, the merge proceeds in several passes.
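
As a rough illustration of what the limit means: when there are more sorted segments than io.sort.factor allows to be open at once, the merge has to run in several passes. The Python sketch below models this with plain sorted lists. The function name `merge_in_passes` and its simple left-to-right grouping are hypothetical simplifications, not Hadoop's actual merge code.

```python
import heapq


def merge_in_passes(segments, factor):
    """Merge sorted lists while opening at most `factor` of them at once.

    Returns (merged_list, number_of_passes).
    Simplified model for illustration; not Hadoop's real merge logic.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    segments = [list(s) for s in segments]
    passes = 0
    while len(segments) > 1:
        next_round = []
        # Each group holds at most `factor` segments, mimicking the open-file limit.
        for i in range(0, len(segments), factor):
            group = segments[i:i + factor]
            next_round.append(list(heapq.merge(*group)))
        segments = next_round
        passes += 1
    return (segments[0] if segments else []), passes
```

With five segments, a factor of 10 finishes in a single pass, while a factor of 2 needs three passes in this model, which is why a small value can make the merge noticeably slower.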

2. Point out the correct statement.
a) The number of sorted map outputs fetched into memory before being merged to disk
b) The memory threshold for fetched map outputs before an in-memory merge is finished
c) The percentage of memory relative to the maximum heap size in which map outputs may not be retained during the reduce
d) None of the mentioned

Answer: a
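Explanation: Statement a describes mapred.inmem.merge.threshold. Statement b is wrong because mapred.job.shuffle.merge.percent is the memory threshold before an in-memory merge is started, not finished. Statement c is wrong because mapred.job.reduce.input.buffer.percent is the percentage of memory in which map outputs may be retained during the reduce; when the reduce begins, map outputs are merged to disk until those that remain are under the resource limit this defines.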

3. A map output larger than ___________ percent of the memory allocated to copying map outputs is written directly to disk.
a) 10
b) 15
c) 25
d) 35

Answer: c
Explanation: A map output larger than 25 percent of the memory allocated to copying map outputs is written directly to disk without first staging through memory.
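
The 25 percent rule is simple arithmetic, and a small Python sketch can make the boundary case explicit. The function name `fetch_destination` and its parameters are hypothetical and exist only for illustration; the byte sizes in the usage note are made up.

```python
def fetch_destination(output_bytes, copy_buffer_bytes, threshold_percent=25):
    """Return "disk" or "memory" for one fetched map output.

    Illustrative only: models the rule that an output larger than
    `threshold_percent` of the copy buffer skips the in-memory stage.
    """
    # Integer comparison avoids floating-point rounding at the boundary.
    if output_bytes * 100 > copy_buffer_bytes * threshold_percent:
        return "disk"
    return "memory"
```

For a 1000-byte buffer, an output of 300 bytes would go to disk, while one of exactly 250 bytes would still be staged in memory because it is not larger than the threshold.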

4. Jobs can enable task JVMs to be reused by specifying the job configuration _________
a) mapred.job.recycle.jvm.num.tasks
b) mapissue.job.reuse.jvm.num.tasks
c) mapred.job.reuse.jvm.num.tasks
d) all of the mentioned

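Answer: c
Explanation: Setting mapred.job.reuse.jvm.num.tasks enables JVM reuse. The default value of 1 means JVMs are not reused (one task per JVM); -1 places no limit on the number of tasks of the same job that a JVM can run.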
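
To see how the mapred.job.reuse.jvm.num.tasks setting affects JVM start-up overhead, the Python sketch below estimates how many JVMs would be launched. It assumes the Hadoop 1.x semantics (1 means no reuse, -1 means unlimited reuse) and, as a further simplifying assumption, that all tasks of one job run back to back in a single slot. The function name `jvms_launched` is hypothetical.

```python
import math


def jvms_launched(num_tasks, reuse_num_tasks=1):
    """Estimate JVM launches for tasks of one job run sequentially in one slot.

    reuse_num_tasks mirrors the configuration value:
      1  -> one task per JVM (no reuse)
      -1 -> no limit on tasks per JVM
      n  -> at most n tasks per JVM
    """
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")
    if reuse_num_tasks == -1:
        return 1 if num_tasks > 0 else 0
    if reuse_num_tasks < 1:
        raise ValueError("reuse_num_tasks must be -1 or a positive integer")
    return math.ceil(num_tasks / reuse_num_tasks)
```

Under these assumptions, ten short tasks cost ten JVM launches with the default, three with a value of 4, and one with -1.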


5. Point out the wrong statement.
a) The task tracker has a local directory to create localized cache and localized job
b) The task tracker can define multiple local directories
c) The Job tracker cannot define multiple local directories
d) None of the mentioned

Answer: d
Explanation: When the job starts, the task tracker creates a localized job directory relative to the local directory specified in the configuration.

6. During the execution of a streaming job, the names of the _______ parameters are transformed.
a) vmap
b) mapvim
c) mapreduce
d) mapred

Answer: d
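Explanation: The dots in the "mapred" parameter names become underscores, so mapred.job.id becomes mapred_job_id and mapred.jar becomes mapred_jar. To get the values in a streaming job’s mapper/reducer, use the parameter names with the underscores.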
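
A streaming mapper or reducer typically sees these transformed names as environment variables. The Python sketch below shows one way to look them up; the helper names `streaming_param_name` and `get_job_param` are hypothetical, and the sketch only handles the dot-to-underscore rewrite.

```python
import os


def streaming_param_name(name):
    """Convert a job parameter name to the form seen by a streaming task."""
    return name.replace(".", "_")


def get_job_param(name, environ=None, default=None):
    """Look up a job parameter by its original dotted name.

    `environ` defaults to the process environment; passing a dict
    makes the function easy to try outside a real streaming job.
    """
    env = os.environ if environ is None else environ
    return env.get(streaming_param_name(name), default)
```

Inside a mapper script, `get_job_param("mapred.job.id")` would read the `mapred_job_id` variable and return `None` when it is not set.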


7. The standard output (stdout) and error (stderr) streams of the task are read by the TaskTracker and logged to _________
a) ${HADOOP_LOG_DIR}/user
b) ${HADOOP_LOG_DIR}/userlogs
c) ${HADOOP_LOG_DIR}/logs
d) None of the mentioned

Answer: b
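Explanation: The TaskTracker reads the stdout and stderr streams of each task and logs them under ${HADOOP_LOG_DIR}/userlogs. Separately, the child JVM always has its current working directory added to java.library.path and LD_LIBRARY_PATH.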

8. ____________ is the primary interface by which user-job interacts with the JobTracker.
a) JobConf
b) JobClient
c) JobServer
d) All of the mentioned

Answer: b
Explanation: JobClient provides facilities to submit jobs, track their progress, access component-tasks’ reports and logs, get the MapReduce cluster status information and so on.


9. The _____________ can also be used to distribute both jars and native libraries for use in the map and/or reduce tasks.
a) DistributedLog
b) DistributedCache
c) DistributedJars
d) None of the mentioned

Answer: b
Explanation: Cached libraries can be loaded via System.loadLibrary or System.load.

10. __________ is used to filter log files from the output directory listing.
a) OutputLog
b) OutputLogFilter
c) DistributedLog
d) DistributedJars

Answer: b
Explanation: OutputLogFilter filters log files out of the output directory listing. Users can view the history logs summary in a specified directory using the following command: $ bin/hadoop job -history output-dir.
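
The idea behind such a filter can be sketched in a few lines of Python. This is an analogy rather than the Hadoop class itself: it assumes the log files sit under a `_logs` directory inside the job output directory, and the function name `filter_log_paths` and the sample paths are hypothetical.

```python
def filter_log_paths(paths, marker="_logs"):
    """Return only the paths that do not contain the log directory marker.

    Illustrative stand-in for a path filter; the substring test is an
    assumption made for this sketch.
    """
    return [p for p in paths if marker not in p]
```

Applied to a listing such as `["out/part-00000", "out/_logs", "out/part-00001"]`, the sketch keeps only the two `part-` files, which is usually what a downstream job wants to read.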


Sanfoundry Global Education & Learning Series – Hadoop.
