Hadoop Questions and Answers – Orchestration in Hadoop


This set of Hadoop Multiple Choice Questions & Answers (MCQs) focuses on “Orchestration in Hadoop”.

1. A collection of various actions in a control dependency DAG is referred to as ________________
a) workflow
b) dataflow
c) clientflow
d) none of the mentioned
Answer: a
Explanation: A workflow is a collection of actions arranged in a control dependency DAG, where each action starts only after its predecessors complete. Falcon provides the key services that data processing applications need to run such workflows.
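
Since Falcon delegates workflow execution to Oozie, a control dependency DAG can be sketched as a minimal Oozie workflow definition. The action names and paths below are illustrative, not from any real deployment:

```xml
<!-- Minimal Oozie workflow: a control dependency DAG of actions.
     Names and paths are hypothetical. -->
<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="ingest"/>
    <action name="ingest">
        <fs><mkdir path="${nameNode}/data/staging"/></fs>
        <ok to="process"/>   <!-- control dependency: process runs only after ingest succeeds -->
        <error to="fail"/>
    </action>
    <action name="process">
        <fs><mkdir path="${nameNode}/data/output"/></fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Workflow failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
```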

2. Point out the correct statement.
a) Large datasets are incentives for users to come to Hadoop
b) Data management is a common concern to be offered as a service
c) Understanding the life-time of a feed will allow for implicit validation of the processing rules
d) All of the mentioned
Answer: d
Explanation: Falcon decouples a data location and its properties from workflows.


3. The ability of Hadoop to efficiently process large volumes of data in parallel is called __________ processing.
a) batch
b) stream
c) time
d) all of the mentioned
Answer: a
Explanation: Hadoop's core strength is batch processing—processing large volumes of data in parallel. There are also a number of use cases that require more "real-time" processing of data—processing the data as it arrives, rather than through batch processing.

4. __________ is used for simplified Data Management in Hadoop.
a) Falcon
b) Flume
c) Impala
d) None of the mentioned
Answer: a
Explanation: Apache Falcon handles process orchestration and scheduling, simplifying data management on Hadoop.

5. Point out the wrong statement.
a) Falcon promotes Javascript Programming
b) Falcon does not do any heavy lifting but delegates to tools within the Hadoop ecosystem
c) Falcon handles retry logic and late data processing. Records audit, lineage and metrics
d) All of the mentioned
Answer: a
Explanation: Falcon promotes Polyglot Programming.

6. Falcon provides ___________ workflow for copying data from source to target.
a) recurring
b) investment
c) data
d) none of the mentioned
Answer: a
Explanation: Falcon instruments workflows for dependencies, retry logic, Table/Partition registration, notifications, etc.
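
A recurring copy from source to target is expressed declaratively in a Falcon feed entity; Falcon then generates and instruments the replication workflow. The cluster names, dates, and paths below are hypothetical:

```xml
<!-- Falcon feed entity sketch: recurring replication from a source
     cluster to a target cluster. All names and values are illustrative. -->
<feed name="sample-feed" xmlns="uri:falcon:feed:0.1">
    <frequency>hours(1)</frequency>   <!-- recurring workflow, runs hourly -->
    <clusters>
        <cluster name="primary-cluster" type="source">
            <validity start="2016-01-01T00:00Z" end="2099-01-01T00:00Z"/>
            <retention limit="days(90)" action="delete"/>
        </cluster>
        <cluster name="backup-cluster" type="target">
            <validity start="2016-01-01T00:00Z" end="2099-01-01T00:00Z"/>
            <retention limit="months(6)" action="delete"/>
        </cluster>
    </clusters>
    <locations>
        <location type="data" path="/data/sample/${YEAR}-${MONTH}-${DAY}"/>
    </locations>
    <ACL owner="etl-user" group="etl" permission="0755"/>
    <schema location="/none" provider="none"/>
</feed>
```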


7. A recurring workflow is used for purging expired data on __________ cluster.
a) Primary
b) Secondary
c) BCP
d) None of the mentioned
Answer: a
Explanation: Falcon provides retention workflow for each cluster based on the defined policy.
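
The retention policy that drives the purge workflow is declared per cluster inside a feed entity. A minimal fragment with illustrative values:

```xml
<!-- Retention fragment of a Falcon feed (illustrative values):
     Falcon generates a recurring workflow that deletes data older than the limit. -->
<clusters>
    <cluster name="primary-cluster" type="source">
        <validity start="2016-01-01T00:00Z" end="2099-01-01T00:00Z"/>
        <retention limit="days(30)" action="delete"/>  <!-- purge expired data -->
    </cluster>
</clusters>
```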

8. Falcon provides the key services data processing applications need, so sophisticated ________ can easily be added to Hadoop applications.
a) DAM
b) DLM
c) DCM
d) All of the mentioned
Answer: b
Explanation: DLM (Data Lifecycle Management) logic is handled by Falcon instead of being hard-coded in applications.

9. Falcon promotes decoupling of data set location from ___________ definition.
a) Oozie
b) Impala
c) Kafka
d) Thrift
Answer: a
Explanation: Falcon uses declarative processing with simple directives, enabling rapid prototyping; the data set location is declared in the feed entity rather than hard-coded in the Oozie workflow definition.
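
This decoupling means the physical location of a data set lives in the Falcon feed entity, not in the Oozie workflow XML. A sketch with hypothetical paths:

```xml
<!-- Data set locations declared in the feed entity (illustrative paths);
     the Oozie workflow references the feed by name, not these paths. -->
<locations>
    <location type="data" path="/projects/sample/input/${YEAR}-${MONTH}-${DAY}"/>
    <location type="stats" path="/projects/sample/stats"/>
    <location type="meta" path="/projects/sample/meta"/>
</locations>
```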

10. Falcon provides seamless integration with _____________
a) HCatalog
b) metastore
c) HBase
d) Kafka
Answer: b
Explanation: Falcon maintains the dependencies and relationships between entities.


Sanfoundry Global Education & Learning Series – Hadoop.

Manish Bhojasia, a technology veteran with 20+ years @ Cisco & Wipro, is Founder and CTO at Sanfoundry. He is Linux Kernel Developer & SAN Architect and is passionate about competency developments in these areas. He lives in Bangalore and delivers focused training sessions to IT professionals in Linux Kernel, Linux Debugging, Linux Device Drivers, Linux Networking, Linux Storage, Advanced C Programming, SAN Storage Technologies, SCSI Internals & Storage Protocols such as iSCSI & Fiber Channel. Stay connected with him @ LinkedIn