Hadoop Questions and Answers – Chukwa with Hadoop – 2

This set of Hadoop Questions for entrance exams focuses on “Chukwa with Hadoop – 2”.

1. __________ runs Demux parsers internally to convert unstructured data to semi-structured data, then loads the key-value pairs into an HBase table.
a) HCatWriter
b) HBWriter
c) HBaseWriter
d) None of the mentioned

Answer: c
Explanation: The hbase.demux.package option names the Demux parser class package; HBaseWriter uses it to validate HBase against the annotated Demux parser classes.

2. Point out the correct statement.
a) Chukwa supports two different reliability strategies
b) chukwaCollector.asyncAcks.scantime affects how often collectors will check the filesystem for commits
c) chukwaCollector.asyncAcks.scanperiod defaults to thrice the rotation interval
d) all of the mentioned

Answer: a
Explanation: Chukwa supports two reliability strategies. In the first, default strategy, collectors write data to HDFS and, as soon as the HDFS write call returns success, report success to the agent, which then advances its checkpoint state.
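
The default strategy can be pictured with a small simulation. This is only a sketch under stated assumptions: `ToyCollector` and `ToyAgent` are hypothetical names invented for illustration, not Chukwa classes, and the "HDFS write" is simply an append to an in-memory list.

```python
class ToyCollector:
    """Hypothetical stand-in for a collector; 'HDFS' is just a list here."""

    def __init__(self, fail_on=None):
        self.stored = []                 # data that was "written to HDFS"
        self.fail_on = fail_on or set()  # 1-based call numbers that should fail
        self.calls = 0

    def write(self, data: bytes) -> bool:
        """Return True only if the simulated write succeeded."""
        self.calls += 1
        if self.calls in self.fail_on:
            return False
        self.stored.append(data)
        return True


class ToyAgent:
    """Hypothetical agent that tracks how many bytes have been acknowledged."""

    def __init__(self, collector):
        self.collector = collector
        self.checkpoint = 0

    def send(self, data: bytes) -> bool:
        ok = self.collector.write(data)
        if ok:
            # Success was reported, so the checkpoint may advance.
            self.checkpoint += len(data)
        return ok


collector = ToyCollector(fail_on={2})
agent = ToyAgent(collector)
agent.send(b"line one\n")  # succeeds, checkpoint advances
agent.send(b"line two\n")  # simulated failure, checkpoint unchanged
agent.send(b"line two\n")  # retry succeeds
print(agent.checkpoint)
```

The point of the sketch is the ordering: the checkpoint moves only after the write call has returned success, so a failed write is simply retried from the old checkpoint.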

3. The __________ streams chunks of data to HDFS, writing the data to a temporary file with a .chukwa suffix.
a) LocalWriter
b) SeqFileWriter
c) SocketTeeWriter
d) All of the mentioned

Answer: b
Explanation: Once the file has been completely written, it is renamed with a .done suffix. SeqFileWriter is configured in chukwa-collector-conf.xml.
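
The write-then-rename pattern described here can be sketched on a local filesystem. This is an illustration only: it uses ordinary local files rather than HDFS, and the function name `write_then_finalize` and the base name `sink-0001` are hypothetical.

```python
import os
import tempfile


def write_then_finalize(directory: str, base_name: str, chunks) -> str:
    """Write chunks to <base_name>.chukwa, then rename the file to <base_name>.done."""
    tmp_path = os.path.join(directory, base_name + ".chukwa")
    done_path = os.path.join(directory, base_name + ".done")
    with open(tmp_path, "wb") as sink:
        for chunk in chunks:
            sink.write(chunk)
    # The rename marks the file as complete; readers can ignore *.chukwa files.
    os.rename(tmp_path, done_path)
    return done_path


with tempfile.TemporaryDirectory() as workdir:
    final_path = write_then_finalize(workdir, "sink-0001", [b"alpha\n", b"beta\n"])
    print(os.path.basename(final_path))
```

Because a consumer only ever picks up files ending in .done, it never sees a partially written file.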

4. Conceptually, each _________ emits a semi-infinite stream of bytes, numbered starting from zero.
a) Collector
b) Adaptor
c) Compactor
d) LocalWriter

Answer: b
Explanation: Adaptors emit their data as Chunks. A Chunk is a sequence of bytes with some metadata; several of the metadata fields are set automatically by the Agent or Adaptors.
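
A minimal sketch of the idea of a numbered byte stream cut into chunks with metadata. `ToyChunk` and `chunk_stream` are hypothetical names, not Chukwa's classes, the stream is finite for the sake of the example, and recording the offset just past each chunk's last byte is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ToyChunk:
    data: bytes      # the bytes carried by this chunk
    seq_id: int      # stream offset just past the chunk's last byte (sketch assumption)
    source: str      # metadata: where the bytes came from
    data_type: str   # metadata: what kind of data this is


def chunk_stream(stream: bytes, size: int, source: str, data_type: str) -> Iterator[ToyChunk]:
    """Cut a byte stream, numbered from zero, into chunks of at most `size` bytes."""
    offset = 0
    while offset < len(stream):
        piece = stream[offset:offset + size]
        offset += len(piece)
        yield ToyChunk(piece, offset, source, data_type)


chunks = list(chunk_stream(b"abcdefghij", 4, "/var/log/app.log", "AppLog"))
print([c.seq_id for c in chunks])  # [4, 8, 10]
```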

5. Point out the wrong statement.
a) Filters use the same syntax as the Dump command
b) “RAW” will send the internal data of the Chunk, without any metadata, prefixed by its length encoded as a 32-bit int
c) Specifying “WRITABLE” will cause the chunks to be written using Hadoop Writable serialization framework
d) None of the mentioned

Answer: d
Explanation: “HEADER” is similar to “RAW”, but with a one-line header in front of the content.
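
The “RAW” framing in option b, where each chunk's data is prefixed by its length as a 32-bit int, can be decoded along these lines. This is a sketch, not Chukwa code: the helper names are hypothetical, and big-endian byte order is an assumption (it is what Java's DataOutputStream produces).

```python
import io
import struct


def read_raw_frames(stream):
    """Yield payloads from a stream of [4-byte length][payload] frames."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream (or truncated header)
        (length,) = struct.unpack(">i", header)  # big-endian signed 32-bit int (assumed)
        payload = stream.read(length)
        if len(payload) < length:
            return  # truncated payload; stop rather than yield partial data
        yield payload


def frame(payload: bytes) -> bytes:
    """Build one length-prefixed frame, for testing the reader."""
    return struct.pack(">i", len(payload)) + payload


wire = io.BytesIO(frame(b"first chunk") + frame(b"second"))
print(list(read_raw_frames(wire)))  # [b'first chunk', b'second']
```

On a real socket a single read may return fewer bytes than requested, so a production reader would loop until the full length has arrived; the in-memory stream used here avoids that complication.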

6. The _____________ allows external processes to watch the stream of chunks passing through the collector.
a) LocalWriter
b) SeqFileWriter
c) SocketTeeWriter
d) All of the mentioned

Answer: c
Explanation: SocketTeeWriter listens on a port specified by the configuration option chukwaCollector.tee.port, which defaults to 9094.

7. Data analytics scripts are written in ____________
a) Hive
b) CQL
c) PigLatin
d) Java

Answer: c
Explanation: Data stored in HBase are aggregated by data analytic scripts to provide visualization and interpretation of health of Hadoop cluster.

8. If Demux succeeds within ____________ attempts, Chukwa archives the completed files.
a) one
b) two
c) three
d) all of the mentioned

Answer: c
Explanation: The Demux MapReduce job is run on the data in demuxProcessing/mrInput.

9. Chukwa is ___________ data collection system for managing large distributed systems.
a) open source
b) proprietary
c) service based
d) none of the mentioned

Answer: a
Explanation: Chukwa is built on top of the Hadoop Distributed File System (HDFS) and the Map/Reduce framework, and inherits Hadoop's scalability and robustness.

10. Collectors write chunks to logs/*.chukwa files until a __________ MB chunk size is reached.
a) 64
b) 108
c) 256
d) 1024

Answer: a
Explanation: PostProcessManager wakes up every few minutes and aggregates, orders and de-duplicates record files.

Sanfoundry Global Education & Learning Series – Hadoop.
