Hadoop Questions and Answers – Data Integrity

This set of Hadoop Multiple Choice Questions & Answers (MCQs) focuses on “Data Integrity”.

1. The HDFS client software implements __________ checking on the contents of HDFS files.
a) metastore
b) parity
c) checksum
d) none of the mentioned

Answer: c
Explanation: When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace.
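
Below is a minimal Java sketch of reading back the checksum HDFS maintains for a file, using the public FileSystem API. The cluster address comes from the default configuration, and the path /data/sample.txt is a hypothetical example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChecksumDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/data/sample.txt");   // hypothetical example path
        // Ask HDFS for the checksum it maintains for this file; the client also
        // verifies per-block checksums transparently on every read.
        FileChecksum checksum = fs.getFileChecksum(file);
        System.out.println(checksum.getAlgorithmName() + " : " + checksum);

        fs.close();
    }
}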

2. Point out the correct statement.
a) The HDFS architecture is compatible with data rebalancing schemes
b) Data blocks support storing a copy of data at a particular instant of time
c) HDFS currently supports snapshots
d) None of the mentioned

Answer: a
Explanation: A scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold.

3. The ___________ machine is a single point of failure for an HDFS cluster.
a) DataNode
b) NameNode
c) ActionNode
d) All of the mentioned

Answer: b
Explanation: If the NameNode machine fails, manual intervention is necessary. Currently, automatic restart and failover of the NameNode software to another machine is not supported.

4. The ____________ and the EditLog are central data structures of HDFS.
a) DsImage
b) FsImage
c) FsImages
d) All of the mentioned

Answer: b
Explanation: The FsImage stores the entire file system namespace, and the EditLog persistently records every change to the file system metadata; corruption of either file can cause the HDFS instance to be non-functional.

5. Point out the wrong statement.
a) HDFS is designed to support small files only
b) Any update to either the FsImage or EditLog causes each of the FsImages and EditLogs to get updated synchronously
c) NameNode can be configured to support maintaining multiple copies of the FsImage and EditLog
d) None of the mentioned

Answer: a
Explanation: HDFS is designed to support very large files.
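
As a hedged illustration of option (c), the sketch below shows the configuration property that makes the NameNode keep redundant copies of the FsImage and EditLog. The directory paths are hypothetical, and in practice this property is set in hdfs-site.xml before the NameNode starts.

import org.apache.hadoop.conf.Configuration;

public class RedundantMetadataConfig {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // A comma-separated list tells the NameNode to write its metadata
        // (FsImage and EditLog) to every listed directory synchronously.
        conf.set("dfs.namenode.name.dir",
                 "/disk1/hdfs/name,/disk2/hdfs/name,/mnt/nfs/hdfs/name");
        System.out.println(conf.get("dfs.namenode.name.dir"));
    }
}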

6. __________ support storing a copy of data at a particular instant of time.
a) Data Image
b) Datanodes
c) Snapshots
d) All of the mentioned

Answer: c
Explanation: One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time.
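
A minimal Java sketch of the snapshot workflow is shown below, assuming a running cluster and a hypothetical directory /data; allowing snapshots on a directory is normally a one-time administrator step (hdfs dfsadmin -allowSnapshot).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SnapshotDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path dir = new Path("/data");               // hypothetical directory
        if (fs instanceof DistributedFileSystem) {
            // Mark the directory as snapshottable (requires admin privileges).
            ((DistributedFileSystem) fs).allowSnapshot(dir);
        }
        // Record the directory's state at this instant; it can later be used
        // to roll back to this known good point in time.
        Path snapshot = fs.createSnapshot(dir, "known-good");
        System.out.println("Snapshot created at " + snapshot);

        fs.close();
    }
}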

7. Automatic restart and ____________ of the NameNode software to another machine is not supported.
a) failover
b) end
c) scalability
d) all of the mentioned

Answer: a
Explanation: If the NameNode machine fails, manual intervention is necessary.

8. HDFS, by default, replicates each data block _____ times on different nodes and on at least ____ racks.
a) 3, 2
b) 1, 2
c) 2, 3
d) All of the mentioned

Answer: a
Explanation: By default HDFS keeps three replicas of each block, and its rack-aware placement policy spreads them across at least two racks; this simple yet robust design was built explicitly for data reliability in the face of faults and failures in disks, nodes and networks.
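
The sketch below checks and sets a file's replication factor through the public FileSystem API, assuming a running cluster; the path /data/sample.txt is a hypothetical example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/data/sample.txt");   // hypothetical example path
        short current = fs.getFileStatus(file).getReplication();
        System.out.println("Current replication factor: " + current);

        // Request the default of three replicas; rack-aware placement spreads
        // them over at least two racks when the cluster topology allows it.
        fs.setReplication(file, (short) 3);

        fs.close();
    }
}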

9. _________ stores its metadata on multiple disks that typically include a non-local file server.
a) DataNode
b) NameNode
c) ActionNode
d) None of the mentioned

Answer: b
Explanation: The NameNode writes its metadata to multiple disks that typically include a non-local file server, while HDFS tolerates failures of the storage servers (called DataNodes) and their disks.

10. The HDFS file system is temporarily unavailable whenever the HDFS ________ is down.
a) DataNode
b) NameNode
c) ActionNode
d) None of the mentioned

Answer: b
Explanation: When the HDFS NameNode is restarted, it recovers its metadata and the file system becomes available again.
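
A minimal sketch of detecting this unavailability from a client is shown below; because every metadata operation goes through the NameNode, a simple status call fails while it is down. The default-configured cluster URI is an assumption.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class NameNodeProbe {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            // Any metadata operation contacts the NameNode, so this status
            // call throws an IOException while the NameNode is down.
            System.out.println("Capacity: " + fs.getStatus().getCapacity());
        } catch (IOException e) {
            System.err.println("HDFS unavailable (NameNode down?): " + e.getMessage());
        }
    }
}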

Sanfoundry Global Education & Learning Series – Hadoop.
