Decision Trees Questions and Answers

This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “Decision Trees”.

1. Which of the following statements is not true about a decision tree?
a) It can be applied to binary classification problems only
b) It is a predictor that predicts the label associated with an instance by traveling from a root node of a tree to a leaf
c) At each node, the successor child is chosen on the basis of a splitting of the input space
d) The splitting is based on one of the features or on a predefined set of splitting rules

Answer: a
Explanation: Decision trees are not limited to binary classification; they can also be used for other prediction problems, such as multiclass classification and regression. The remaining statements are true: a decision tree is a predictor that predicts the label associated with an instance by traveling from the root node of the tree to a leaf. At each node, the successor child is chosen on the basis of a splitting of the input space, and the split is based on one of the features or on a predefined set of splitting rules.

2. A decision tree uses the inductive learning approach to machine learning.
a) True
b) False

Answer: a
Explanation: A decision tree uses the inductive learning approach to machine learning. Inductive learning enables the system to recognize patterns and regularities in previous knowledge or training data and to extract general rules from them. Decision tree learning is considered inductive because it uses particular facts to reach more general conclusions.

3. Which of the following statements is not true about a splitting rule at internal nodes of the tree based on thresholding the value of a single feature?
a) It moves to the right or left child of the node on the basis of 1_{[xi < ϑ]}, where i ∈ [d] is the index of the relevant feature
b) It moves to the right or left child of the node on the basis of 1_{[xi < ϑ]}, where ϑ ∈ R is the threshold
c) Here a decision tree splits the instance space, X = R^d, into cells, where each leaf of the tree corresponds to one cell
d) Splits based on thresholding the value of a single feature are also known as multivariate splits

Answer: d
Explanation: Splits based on thresholding the value of a single feature are known as univariate splits, not multivariate splits. The rule moves to the right or left child of the node on the basis of 1_{[xi < ϑ]}, where i ∈ [d] is the index of the relevant feature and ϑ ∈ R is the threshold. In this way a decision tree splits the instance space, X = R^d, into cells, where each leaf of the tree corresponds to one cell.
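
The univariate threshold rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Leaf, Node, predict and toy_tree are hypothetical, and the choice that the "true" branch is followed when x[i] < ϑ is a convention picked here, since the question does not say which child corresponds to which outcome.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    label: str  # prediction returned when an instance falls into this cell


@dataclass
class Node:
    feature: int      # i, the index of the feature being tested
    threshold: float  # the threshold (theta)
    if_true: "Tree"   # child followed when x[i] < threshold (assumed convention)
    if_false: "Tree"  # child followed otherwise


Tree = Union[Leaf, Node]


def predict(tree: Tree, x: Sequence[float]) -> str:
    """Walk from the root to a leaf, testing one feature per internal node."""
    while isinstance(tree, Node):
        tree = tree.if_true if x[tree.feature] < tree.threshold else tree.if_false
    return tree.label


# Hypothetical tree over R^2 with three leaves, i.e. three cells.
toy_tree = Node(0, 2.0, Leaf("A"), Node(1, 5.0, Leaf("B"), Leaf("C")))

print(predict(toy_tree, [1.0, 9.0]))  # x[0] < 2.0, so the first leaf is reached
print(predict(toy_tree, [3.0, 4.0]))  # x[0] >= 2.0, then x[1] < 5.0
print(predict(toy_tree, [3.0, 6.0]))  # x[0] >= 2.0, then x[1] >= 5.0
```

Each internal node inspects exactly one coordinate, which is what makes the split univariate.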

4. Consider the figure. If person A starts driving at 8:30 AM and there are no other vehicles on the road, and another person B starts driving at 10 AM and there is an accident on the road, what will be the commute time of A and B respectively?

a) LONG, LONG
b) LONG, SHORT
c) SHORT, LONG
d) SHORT, SHORT

Answer: c
Explanation: The given figure shows a decision tree. Person A starts driving at 8:30 AM with no other vehicles on the road, so A's commute time is SHORT. Person B starts driving at 10 AM and there is an accident on the road, so B's commute time is LONG.

5. When the splitting rule at internal nodes of the tree is based on thresholding the value of a single feature, a tree with k leaves can shatter a set of k instances.
a) False
b) True

Answer: b
Explanation: When the splitting rule at internal nodes of the tree is based on thresholding the value of a single feature, a tree with k leaves can shatter a set of k instances. Hence, if we allow decision trees of arbitrary size, we obtain a hypothesis class of infinite VC dimension, and such an approach can easily lead to overfitting.
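
The shattering claim can be checked by brute force on a small case. The sketch below assumes k distinct one-dimensional instances (the points list is made up for illustration) and uses hypothetical helpers chain_tree, predict and count_leaves. For every one of the 2^k labelings it builds a chain of threshold splits with exactly k leaves and verifies that the tree reproduces the labeling.

```python
from itertools import product


def chain_tree(points, labels):
    """Build a threshold tree with len(points) leaves on sorted 1-D points.

    An internal node is a tuple (threshold, left_label, right_subtree);
    the end of the chain is a bare label.
    """
    if len(points) == 1:
        return labels[0]
    threshold = (points[0] + points[1]) / 2  # midpoint separates the first point
    return (threshold, labels[0], chain_tree(points[1:], labels[1:]))


def predict(tree, x):
    while isinstance(tree, tuple):
        threshold, left_label, right = tree
        if x < threshold:
            return left_label
        tree = right
    return tree


def count_leaves(tree):
    return 1 if not isinstance(tree, tuple) else 1 + count_leaves(tree[2])


points = [0.0, 1.0, 2.0, 3.0]  # k = 4 distinct instances (hypothetical data)
k = len(points)

shattered = all(
    [predict(chain_tree(points, labels), x) for x in points] == list(labels)
    and count_leaves(chain_tree(points, labels)) == k
    for labels in product([0, 1], repeat=k)
)
print(shattered)  # True: every labeling is realized by a tree with k leaves
```

Because k can be chosen arbitrarily large, no finite bound on the VC dimension holds once the tree size is unrestricted.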

6. The minimum description length (MDL) principle is used to avoid overfitting in decision trees.
a) True
b) False

Answer: a
Explanation: MDL procedures inherently protect against overfitting and can be used to estimate both the parameters and the structure of a model. Hence the MDL principle is used to avoid overfitting in decision trees: it aims at learning a decision tree that, on the one hand, fits the data well and, on the other hand, is not too large.

7. Suppose that, for a decision tree, we make the simplifying assumption that each instance is a vector of d bits (X = {0, 1}^d). Which of the following statements is not true in this setting?
a) Thresholding the value of a single feature corresponds to a splitting rule of the form 1_[xi=1] for some i ∈ [d]
b) The hypothesis class becomes finite, but is still very large
c) Any classifier from {0, 1}^d to {0, 1} can be represented by a decision tree with 2^d leaves and depth of d + 1
d) Any classifier from {0, 1}^d to {0, 1} can be represented by a decision tree with 2^d+1 leaves and depth of d + 1

Answer: d
Explanation: Under this simplifying assumption, any classifier from {0, 1}^d to {0, 1} can be represented by a decision tree with 2^d leaves (not 2^d+1 leaves) and depth of d + 1. Thresholding the value of a single feature corresponds to a splitting rule of the form 1_[xi=1] for some i ∈ [d], and the hypothesis class becomes finite, but is still very large.
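
One way to see why 2^d leaves suffice is to build the complete tree that tests bit 0, then bit 1, and so on, and stores the classifier's output at each leaf. The following is a minimal sketch of that construction; build_tree, predict, count_leaves and the parity target are hypothetical names chosen for illustration.

```python
from itertools import product


def build_tree(f, d, prefix=()):
    """Return a complete tree that tests bit 0, then bit 1, ..., then bit d-1.

    An internal node is a pair (subtree_if_bit_is_0, subtree_if_bit_is_1);
    a leaf is the value f assigns to the fully specified bit vector.
    """
    if len(prefix) == d:
        return f(prefix)
    return (build_tree(f, d, prefix + (0,)), build_tree(f, d, prefix + (1,)))


def predict(tree, x):
    for bit in x:  # the node reached after i steps applies the rule 1[x_i = 1]
        tree = tree[bit]
    return tree


def count_leaves(tree):
    return 1 if not isinstance(tree, tuple) else sum(count_leaves(c) for c in tree)


def parity(x):
    """Hypothetical target classifier from {0, 1}^d to {0, 1}."""
    return sum(x) % 2


d = 3
tree = build_tree(parity, d)

print(count_leaves(tree))  # 8, i.e. 2**d leaves
print(all(predict(tree, x) == parity(x) for x in product([0, 1], repeat=d)))  # True
```

Any other function on d bits can be substituted for parity, which is why the hypothesis class contains every classifier on {0, 1}^d.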

8. What does it mean if the VC dimension of a class is 2^d?
a) The number of examples needed to PAC learn the hypothesis class grows with 2^d
b) The number of examples needed to PAC learn the hypothesis class grows with 2^d+1
c) The number of examples needed to PAC learn the hypothesis class grows with 2^d-1
d) The number of examples needed to PAC learn the hypothesis class grows with 2^d+1

Answer: a
Explanation: Suppose we make the simplifying assumption that each instance is a vector of d bits (X = {0, 1}^d). Then the VC dimension of the class of decision trees is 2^d, which means that the number of examples needed to PAC learn the hypothesis class grows with 2^d. Unless d is very small, this is a huge number of examples.

9. Consider the dataset given below, where T and F represent True and False respectively. What is the entropy H(Rain)?

Temperature	Cloud	Rain
Low	T	T
Low	T	T
Medium	T	F
Medium	T	T
High	T	F
High	F	F

a) 1
b) 0.5
c) 0.2
d) 0.6

Answer: a
Explanation: We know that entropy H = – ∑_{i=1}^{n} P_i log₂ P_i. Rain is T in 3 of the 6 rows and F in the other 3, so
H(Rain) = – (3/6) * log₂ (3/6) – (3/6) * log₂ (3/6)
= – (1/2) * log₂ (1/2) – (1/2) * log₂ (1/2)
= – 0.5 * (–1) – 0.5 * (–1)
= 0.5 + 0.5
= 1
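
The same calculation can be reproduced with a short Python sketch. The helper name entropy is hypothetical; the rain list is simply the Rain column of the table above, read top to bottom.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


rain = ["T", "T", "F", "T", "F", "F"]  # Rain column: 3 T and 3 F
print(entropy(rain))  # 1.0
```

An evenly split binary label always gives an entropy of 1 bit, the maximum for two classes.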

10. What does the following figure represent?

a) Decision tree for OR
b) Decision tree for AND
c) Decision tree for XOR
d) Decision tree for XNOR

Answer: b
Explanation: The given figure represents the decision tree implementation of Boolean AND as per the following truth table.

A	B	A AND B
F	F	F
F	T	F
T	T	T
T	F	F

So whenever A is false, the decision tree leads to false. Otherwise, it leads to true or false according to the value of B.
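
The tree described in this explanation can be written as nested tests. Since the figure itself is not reproduced here, the sketch assumes the structure stated above: A is tested at the root and B is tested only when A is true. The function name and_tree is hypothetical.

```python
def and_tree(a: bool, b: bool) -> bool:
    """Decision tree for A AND B: test A at the root, then B."""
    if not a:
        return False  # A is false: leaf labeled false, B is never inspected
    return b          # A is true: the outcome is decided by B


# Print the full truth table produced by the tree.
for a in (False, True):
    for b in (False, True):
        print(a, b, and_tree(a, b))
```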
