Hadoop Questions and Answers – Orchestration in Hadoop

This set of Hadoop Multiple Choice Questions & Answers (MCQs) focuses on “Orchestration in Hadoop”.

1. A collection of various actions in a control dependency DAG is referred to as ________________
a) workflow
b) dataflow
c) clientflow
d) none of the mentioned

Answer: a
Explanation: A workflow is a collection of actions arranged in a control dependency DAG (directed acyclic graph): an action may run only after every action it depends on has completed.
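The idea of a control dependency DAG can be sketched with a minimal, hypothetical executor (the function and entity names here are illustrative, not Falcon's or Oozie's API):

```python
from collections import deque

def run_workflow(actions, deps):
    """Run actions in an order that respects a control dependency DAG.

    actions: dict of name -> callable
    deps:    dict of name -> list of names that must finish first
    Returns the execution order (a topological sort via Kahn's algorithm).
    """
    indegree = {name: 0 for name in actions}
    children = {name: [] for name in actions}
    for name, parents in deps.items():
        for parent in parents:
            indegree[name] += 1
            children[parent].append(name)

    ready = deque(name for name, d in indegree.items() if d == 0)
    order = []
    while ready:
        name = ready.popleft()
        actions[name]()          # run the action once its parents are done
        order.append(name)
        for child in children[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(actions):
        raise ValueError("cycle detected: not a DAG")
    return order

# Example DAG: ingest -> clean -> (report, archive)
log = []
acts = {n: (lambda n=n: log.append(n))
        for n in ["ingest", "clean", "report", "archive"]}
deps = {"clean": ["ingest"], "report": ["clean"], "archive": ["clean"]}
order = run_workflow(acts, deps)
```

The same ordering principle underlies real orchestration engines, which add scheduling, retries, and failure handling on top of it.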

2. Point out the correct statement.
a) Large datasets are incentives for users to come to Hadoop
b) Data management is a common concern to be offered as a service
c) Understanding the life-time of a feed will allow for implicit validation of the processing rules
d) All of the mentioned

Answer: d
Explanation: Falcon decouples a data location and its properties from workflows.

3. The ability of Hadoop to efficiently process large volumes of data in parallel is called __________ processing.
a) batch
b) stream
c) time
d) all of the mentioned

Answer: a
Explanation: Hadoop’s MapReduce model processes large volumes of data in parallel as batches. There are also use cases that require more “real-time” stream processing—handling data as it arrives rather than in batches.
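The batch-versus-stream distinction can be sketched with a small, self-contained word-count example (plain Python, purely illustrative):

```python
from collections import Counter

def batch_word_count(records):
    """Batch processing: the whole dataset is available up front and is
    processed in one pass -- the MapReduce-style model Hadoop is known for."""
    counts = Counter()
    for line in records:
        counts.update(line.split())
    return dict(counts)

def stream_word_count(records):
    """Stream processing: each record is folded into running state as it
    arrives, so an up-to-date result exists after every record."""
    counts = Counter()
    for record in records:
        counts.update(record.split())
        yield dict(counts)

data = ["a b a", "b c"]
batch_result = batch_word_count(data)        # one result at the end
stream_snapshots = list(stream_word_count(data))  # a result per record
```

The batch function returns a single final answer; the streaming generator yields an intermediate answer after each record, which is the essential difference between the two models.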

4. __________ is used for simplified Data Management in Hadoop.
a) Falcon
b) flume
c) Impala
d) None of the mentioned

Answer: a
Explanation: Apache Falcon handles process orchestration and scheduling, simplifying data management on Hadoop.

5. Point out the wrong statement.
a) Falcon promotes Javascript Programming
b) Falcon does not do any heavy lifting but delegates to tools within the Hadoop ecosystem
c) Falcon handles retry logic and late data processing, and records audit, lineage and metrics
d) All of the mentioned

Answer: a
Explanation: Falcon promotes Polyglot Programming.

6. Falcon provides ___________ workflow for copying data from source to target.
a) recurring
b) investment
c) data
d) none of the mentioned

Answer: a
Explanation: Falcon instruments workflows for dependencies, retry logic, Table/Partition registration, notifications, etc.

7. A recurring workflow is used for purging expired data on __________ cluster.
a) Primary
b) Secondary
c) BCP
d) None of the mentioned

Answer: a
Explanation: Falcon provides retention workflow for each cluster based on the defined policy.
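A retention policy of this kind can be sketched as a small stand-alone function. This is a toy stand-in, not Falcon's API: in Falcon, retention is declared per feed and per cluster, and Falcon generates the purge workflow itself. All names below are hypothetical.

```python
import os
import time

def purge_expired(paths, retention_days, now=None,
                  mtime=os.path.getmtime, delete=os.remove):
    """Delete files whose modification time falls outside the retention
    window, returning the list of purged paths.

    mtime and delete are injectable so the policy can be dry-run without
    touching the real filesystem.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # retention window in seconds
    purged = []
    for path in paths:
        if mtime(path) < cutoff:   # older than the window -> expired
            delete(path)
            purged.append(path)
    return purged

# Dry-run with injected fakes instead of the real filesystem:
DAY = 86400
ages = {"old.log": 10 * DAY, "new.log": 1 * DAY}  # file age in seconds
now = 100 * DAY
removed = []
purged = purge_expired(ages, retention_days=7, now=now,
                       mtime=lambda p: now - ages[p],
                       delete=removed.append)
```

With a 7-day policy, only the 10-day-old file is purged; the 1-day-old file survives.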

8. Falcon provides the key services data processing applications need, so sophisticated ________ can easily be added to Hadoop applications.
a) DAM
b) DLM
c) DCM
d) All of the mentioned

Answer: b
Explanation: DLM stands for Data Lifecycle Management. Complex data processing logic is handled by Falcon instead of being hard-coded in applications.

9. Falcon promotes decoupling of data set location from ___________ definition.
a) Oozie
b) Impala
c) Kafka
d) Thrift

Answer: a
Explanation: Falcon uses declarative processing with simple directives enabling rapid prototyping.
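This decoupling shows up in Falcon's declarative entities: a feed entity describes the dataset's location, frequency, and retention separately from any process (Oozie) definition. The fragment below only approximates Falcon's feed XML schema; all names and paths are made up.

```xml
<feed name="rawEventFeed" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <cluster name="primaryCluster" type="source">
      <validity start="2015-01-01T00:00Z" end="2016-01-01T00:00Z"/>
      <retention limit="days(90)" action="delete"/>
    </cluster>
  </clusters>
  <locations>
    <!-- The dataset location lives here, not in the process definition -->
    <location type="data"
              path="/data/raw/events/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="falcon" group="users" permission="0755"/>
</feed>
```

A process entity then refers to the feed by name, so the path can change without touching any workflow definition.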

10. Falcon provides seamless integration with _____________
a) HCatalog
b) metastore
c) HBase
d) Kafka

Answer: b
Explanation: Falcon maintains the dependencies and relationships between entities.

Sanfoundry Global Education & Learning Series – Hadoop.
