This set of Hadoop Questions for entrance exams focuses on “Chukwa with Hadoop – 2”.
1. __________ runs Demux parsers internally to convert unstructured data into semi-structured data, then loads the key-value pairs into an HBase table.
a) HCatWriter
b) HBWriter
c) HBaseWriter
d) None of the mentioned
Answer: c
Explanation: HBaseWriter uses the hbase.demux.package setting to scan for annotated Demux parser classes and load their output into HBase.
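As a hedged illustration, the property named in the explanation might be set in chukwa-collector-conf.xml roughly as follows (the package value shown is illustrative, not a documented default):

```xml
<!-- Sketch of an HBaseWriter setting in chukwa-collector-conf.xml.
     The <value> below is an illustrative package name, not a required default. -->
<property>
  <name>hbase.demux.package</name>
  <value>org.apache.hadoop.chukwa.extraction.demux.processor</value>
  <description>Java package scanned for annotated Demux parser classes</description>
</property>
```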
2. Point out the correct statement.
a) chukwa supports two different reliability strategies
b) chukwaCollector.asyncAcks.scantime affects how often collectors will check the filesystem for commits
c) chukwaCollector.asyncAcks.scanperiod defaults to thrice the rotation interval
d) all of the mentioned
Answer: a
Explanation: The first, default strategy is as follows: collectors write data to HDFS and, as soon as the HDFS write call returns success, report success to the agent, which advances its checkpoint state.
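The second, asynchronous-acknowledgement strategy mentioned in the question might be enabled with configuration roughly like the sketch below (property names follow the Chukwa collector documentation; verify them against your release):

```xml
<!-- Sketch: enabling asynchronous acknowledgements on the collector. -->
<property>
  <name>chukwaCollector.asyncAcks</name>
  <value>true</value>
</property>
<property>
  <name>chukwaCollector.asyncAcks.scanperiod</name>
  <!-- milliseconds between scans of the filesystem for commits;
       per the docs it defaults to twice the rotation interval if unset -->
  <value>600000</value>
</property>
```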
3. The __________ streams chunks of data to HDFS, writing data to a temporary file with a .chukwa suffix.
a) LocalWriter
b) SeqFileWriter
c) SocketTeeWriter
d) All of the mentioned
Answer: b
Explanation: When writing is complete, the file is renamed with a .done suffix. SeqFileWriter is configured in chukwa-collector-conf.xml.
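The original page elides the configuration block itself; a hedged sketch of what a SeqFileWriter setup in chukwa-collector-conf.xml typically looks like (the HDFS URI and output directory values are illustrative):

```xml
<!-- Sketch: selecting SeqFileWriter as the collector's writer. -->
<property>
  <name>chukwaCollector.writerClass</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter</value>
</property>
<property>
  <name>writer.hdfs.filesystem</name>
  <!-- illustrative HDFS URI; point this at your NameNode -->
  <value>hdfs://namenode:9000/</value>
</property>
<property>
  <name>chukwaCollector.outputDir</name>
  <!-- temporary .chukwa files are written here, then renamed to .done -->
  <value>/chukwa/logs/</value>
</property>
```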
4. Conceptually, each _________ emits a semi-infinite stream of bytes, numbered starting from zero.
a) Collector
b) Adaptor
c) Compactor
d) LocalWriter
Answer: b
Explanation: A Chunk is a sequence of bytes, with some metadata. Several of these are set automatically by the Agent or Adaptors.
5. Point out the wrong statement.
a) Filters use the same syntax as the Dump command
b) “RAW” will send the internal data of the Chunk, without any metadata, prefixed by its length encoded as a 32-bit int
c) Specifying “WRITABLE” will cause the chunks to be written using Hadoop Writable serialization framework
d) None of the mentioned
Answer: d
Explanation: “HEADER” is similar to “RAW”, but with a one-line header in front of the content.
6. The _____________ allows external processes to watch the stream of chunks passing through the collector.
a) LocalWriter
b) SeqFileWriter
c) SocketTeeWriter
d) All of the mentioned
Answer: c
Explanation: SocketTeeWriter listens on a port (specified by the conf option chukwaCollector.tee.port, defaulting to 9094).
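A hedged sketch of wiring SocketTeeWriter into the collector alongside SeqFileWriter, following the pipeline convention described in the Chukwa docs (verify the property names against your release):

```xml
<!-- Sketch: chaining SocketTeeWriter before SeqFileWriter so external
     processes can watch chunks pass through, then setting the tee port. -->
<property>
  <name>chukwaCollector.pipeline</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter</value>
</property>
<property>
  <name>chukwaCollector.tee.port</name>
  <value>9094</value>
</property>
```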
7. Data analytics scripts are written in ____________
a) Hive
b) CQL
c) PigLatin
d) Java
Answer: c
Explanation: Data stored in HBase are aggregated by data analytic scripts to provide visualization and interpretation of health of Hadoop cluster.
8. If Demux is successful within ____________ attempts, Chukwa archives the completed files.
a) one
b) two
c) three
d) all of the mentioned
Answer: c
Explanation: The Demux MapReduce job is run on the data in demuxProcessing/mrInput.
9. Chukwa is ___________ data collection system for managing large distributed systems.
a) open source
b) proprietary
c) service based
d) none of the mentioned
Answer: a
Explanation: Chukwa is built on top of the Hadoop Distributed File System (HDFS) and Map/Reduce framework and inherits Hadoop scalability and robustness.
10. Collectors write chunks to logs/*.chukwa files until a __________ MB chunk is reached.
a) 64
b) 108
c) 256
d) 1024
Answer: a
Explanation: PostProcessManager wakes up every few minutes and aggregates, orders and de-dups record files.
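Rotation of the logs/*.chukwa files is also time-driven; a hedged sketch of the commonly cited rotation setting (the 300000 ms default is per the Chukwa collector docs; verify for your release):

```xml
<!-- Sketch: interval after which an open .chukwa file is closed and
     renamed to .done, in milliseconds (300000 ms = 5 minutes). -->
<property>
  <name>chukwaCollector.rotateInterval</name>
  <value>300000</value>
</property>
```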
Sanfoundry Global Education & Learning Series – Hadoop.