Hadoop Questions for Campus Interviews

This set of Hadoop Questions for campus interviews focuses on “Spark with Hadoop – 2”.

1. Users can easily run Spark on top of Amazon’s __________
a) Infosphere
b) EC2
c) EMR
d) None of the mentioned

Answer: b
Explanation: Users can easily run Spark (and Shark) on top of Amazon’s EC2 using the scripts that come with Spark.

2. Point out the correct statement.
a) Spark, through Shark, enables Apache Hive users to run their unmodified queries much faster
b) Spark interoperates only with Hadoop
c) Spark is a popular data warehouse solution running on top of Hadoop
d) None of the mentioned

Answer: a
Explanation: Shark can accelerate Hive queries by as much as 100x when the input data fits into memory, and by up to 10x when the input data is stored on disk.

3. Spark runs on top of ___________, a cluster manager that provides efficient resource isolation across distributed applications.
a) Mesjs
b) Mesos
c) Mesus
d) All of the mentioned

Answer: b
Explanation: Mesos enables fine-grained sharing, which allows a Spark job to dynamically take advantage of idle resources in the cluster during its execution.

4. Which of the following can be used to launch Spark jobs inside MapReduce?
a) SIM
b) SIMR
c) SIR
d) RIS

Answer: b
Explanation: SIMR (Spark In MapReduce) launches Spark jobs inside MapReduce. With SIMR, users can start experimenting with Spark and use its shell within a couple of minutes of downloading it.

5. Point out the wrong statement.
a) Spark is intended to replace the Hadoop stack
b) Spark was designed to read and write data from and to HDFS, as well as other storage systems
c) Hadoop users who have already deployed or are planning to deploy Hadoop YARN can simply run Spark on YARN
d) None of the mentioned

Answer: a
Explanation: Spark is intended to enhance, not replace, the Hadoop stack.

6. Which of the following languages is not supported by Spark?
a) Java
b) Pascal
c) Scala
d) Python

Answer: b
Explanation: Spark provides APIs in Java, Scala, and Python; Pascal is not supported. The Spark engine runs in a variety of environments, from cloud services to Hadoop or Mesos clusters.

7. Spark is packaged with higher-level libraries, including support for _________ queries.
a) SQL
b) C
c) C++
d) None of the mentioned

Answer: a
Explanation: Spark's standard libraries, including its support for SQL queries, increase developer productivity and can be seamlessly combined to create complex workflows.

8. Spark includes a collection of over ________ operators for transforming data and familiar data frame APIs for manipulating semi-structured data.
a) 50
b) 60
c) 70
d) 80

Answer: d
Explanation: Spark provides easy-to-use APIs for operating on large datasets.
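
To make the idea of chaining transformation operators concrete, here is a minimal sketch in plain Python rather than Spark itself. It imitates the style of operators such as flatMap, map, and reduceByKey using only the standard library. The helper names (`flat_map`, `reduce_by_key`, `word_count`) are hypothetical and are not part of any Spark API.

```python
from collections import defaultdict
from functools import reduce


def flat_map(func, items):
    """Apply func to each item and flatten the resulting iterables."""
    return [out for item in items for out in func(item)]


def reduce_by_key(func, pairs):
    """Group (key, value) pairs by key and fold each group's values with func."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


def word_count(lines):
    """Chain three operators: split lines, pair each word with 1, sum per word."""
    words = flat_map(str.split, lines)
    pairs = [(word.lower(), 1) for word in words]
    return reduce_by_key(lambda a, b: a + b, pairs)


# Hypothetical input data, used only for illustration.
counts = word_count(["Spark runs on YARN", "Spark runs on Mesos"])
```

In a real Spark program the same pipeline would be expressed with Spark's own operators and executed across a cluster; this sketch only shows how small operators compose into a larger transformation.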

9. Spark is engineered from the bottom up for performance, running up to ___________ faster than Hadoop by exploiting in-memory computing and other optimizations.
a) 100x
b) 150x
c) 200x
d) None of the mentioned

Answer: a
Explanation: Spark is fast on disk too; it currently holds the world record in large-scale on-disk sorting.
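
The benefit of in-memory computing for iterative jobs can be sketched in plain Python. The toy `Dataset` class below is a hypothetical stand-in, not Spark's API: it counts how many times a simulated disk read happens when a job scans the same data several times, with and without keeping a copy in memory. Real speedups depend on the workload and cluster, so the counts here only illustrate the mechanism.

```python
class Dataset:
    """Toy dataset that counts how often its source is read."""

    def __init__(self, loader):
        self._loader = loader   # function simulating an expensive disk read
        self._cached = None     # in-memory copy, filled by cache()
        self.loads = 0          # number of simulated disk reads so far

    def _read(self):
        """Read every record from the simulated source."""
        self.loads += 1
        return list(self._loader())

    def cache(self):
        """Read the source once and keep the result in memory."""
        if self._cached is None:
            self._cached = self._read()
        return self

    def records(self):
        """Return the cached copy if present, otherwise read the source again."""
        return self._cached if self._cached is not None else self._read()


def run_passes(dataset, passes):
    """Simulate an iterative job that scans the whole dataset once per pass."""
    return [sum(dataset.records()) for _ in range(passes)]


uncached = Dataset(lambda: range(1000))
cached = Dataset(lambda: range(1000)).cache()

uncached_totals = run_passes(uncached, 5)  # re-reads the source on every pass
cached_totals = run_passes(cached, 5)      # reuses the in-memory copy
```

Both runs produce the same totals, but the uncached dataset reads its source once per pass while the cached one reads it only once.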

10. Spark powers a stack of high-level tools, including Spark SQL and MLlib for _________
a) regression models
b) statistics
c) machine learning
d) reproductive research

Answer: c
Explanation: MLlib is Spark's machine learning library. Spark is used at a wide range of organizations to process large datasets.

Sanfoundry Global Education & Learning Series – Hadoop.
