# Decision Tree Pruning Questions and Answers – Set 2

This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “Decision Tree Pruning – Set 2”.

1. Which of the following is not an advantage of Reduced error pruning?
a) Linear computational complexity
b) Over pruning
c) Simplicity
d) Speed

Answer: b
Explanation: Over-pruning is a disadvantage of Reduced error pruning: when the test set is much smaller than the training set, the method may prune too aggressively. Its advantages are linear computational complexity, simplicity, and speed.

2. Minimum error pruning is a Top down approach.
a) True
b) False

Answer: b
Explanation: Minimum error pruning is not a top-down approach. It is a bottom-up approach which seeks a single tree that minimizes the expected error rate on an independent data set. The tree is pruned back to the point where the cross-validated error is at a minimum.

3. Which of the following statements is not a step in Minimum error pruning?
a) At each non-leaf node in the tree, calculate the expected error rate if that subtree is pruned
b) Calculate the expected error rate for that node if the subtree is not pruned
c) If pruning the node leads to a greater expected error rate, then keep the subtree
d) If pruning the node leads to a smaller expected error rate, then don't prune it

Answer: d
Explanation: In minimum error pruning, if pruning the node leads to a smaller expected error rate, the node is pruned. At each non-leaf node in the tree, calculate the expected error rate if that subtree is pruned, and also the expected error rate if the subtree is not pruned. If pruning the node leads to a greater expected error rate, keep the subtree (no pruning).
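The steps above can be sketched in Python. This is a minimal illustration, not code from any library: the function names are hypothetical, and the node error uses the Niblett–Bratko expected error formula worked through in Q10 below.

```python
def expected_error(n, n_c, k):
    """Niblett-Bratko expected error rate for a node with n examples,
    n_c of them in the majority class, and k classes overall."""
    return (n - n_c + k - 1) / (n + k)

def should_prune(error_if_pruned, error_if_kept):
    """Minimum error pruning rule at one non-leaf node: prune only when
    pruning does not increase the expected error rate."""
    return error_if_pruned <= error_if_kept
```

For example, `expected_error(20, 15, 3)` gives 7/23, roughly 0.304.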

4. Pre pruning is also known as online pruning.
a) True
b) False

Answer: a
Explanation: Pre-pruning is also known as forward pruning or online pruning. Pre-pruning prevents the generation of non-significant branches. It prevents overfitting by trying to stop the tree-building process early, before it produces leaves with very small samples.

5. Which of the following statements is not a step in Pre pruning?
a) Pre-pruning a decision tree involves using a termination condition to decide when to terminate some of the branches prematurely as the tree is generated
b) When constructing the tree, some significant measures can be used to assess the goodness of a split
c) A high threshold results in oversimplified trees
d) If partitioning the tuples at a node would result in a split that falls below a pre-specified threshold, then further partitioning of the given subset is expanded

Answer: d
Explanation: If partitioning the tuples at a node would result in a split that falls below a pre-specified threshold, then further partitioning of the given subset is halted; otherwise it is expanded. That is, a high threshold results in oversimplified trees, while a low threshold results in very little simplification.
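The threshold test described above can be sketched as follows. The names are illustrative, and the goodness measure is assumed to be something like information gain; the only point is the halt-versus-expand decision.

```python
def continue_split(goodness, threshold):
    """Pre-pruning check: expand the node only if the split's goodness
    measure (e.g. information gain) meets the pre-specified threshold;
    otherwise halt further partitioning at this node."""
    return goodness >= threshold
```

With a high threshold most splits fail this test, yielding an oversimplified tree; with a low threshold almost every split passes, so very little simplification occurs.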

6. Minimum number of objects pruning is a Post pruning technique.
a) True
b) False

Answer: b
Explanation: Minimum number of objects pruning is not a post-pruning technique; it is a pre-pruning technique. In this method, the minimum number of objects is specified as a threshold value, set through a single parameter, minobj.

7. Which of the following statements is not true about Minimum number of objects pruning?
a) Whenever a split is made which yields a child leaf that represents fewer than minobj examples from the data set, the parent node and children nodes are compressed into a single node
b) Increasing the number of objects increases the accuracy on the dataset
c) Increasing the number of objects simplifies the tree
d) Different ranges of the minimum number of objects are set for a few examples and tested for accuracy

Answer: b
Explanation: In minimum number of objects pruning, increasing the number of objects reduces the accuracy on the dataset, but it simplifies the tree. Whenever a split is made which yields a child leaf that represents fewer than minobj examples from the data set, the parent node and children nodes are compressed into a single node.
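A minimal sketch of the minobj check, assuming the candidate split's child leaf sizes are already known (the function name is illustrative, not from any library):

```python
def collapse_small_split(child_counts, minobj):
    """Minimum-number-of-objects pre-pruning: if any child leaf of a
    candidate split would hold fewer than `minobj` examples, the parent
    and children should be compressed into a single node."""
    return any(count < minobj for count in child_counts)
```

For instance, a split producing leaves of sizes 3 and 12 with minobj = 5 would be collapsed, while leaves of sizes 6 and 12 would be kept.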

8. Which of the following is not a Post pruning technique?
a) Reduced error pruning
b) Error complexity pruning
c) Minimum error pruning
d) Chi-square pruning

Answer: d
Explanation: Chi-square pruning is not a post-pruning technique; it is a pre-pruning technique. It converts the decision tree to a set of rules, eliminates variable values in rules which are independent of the label using the chi-square test for independence, and simplifies the rule set by eliminating unnecessary rules.

9. Which of the following is not a Post pruning technique?
a) Pessimistic error pruning
b) Iterative growing and pruning
c) Reduced error pruning
d) Early stopping pruning

Answer: d
Explanation: Early stopping pruning is another name for pre-pruning, so it is not a post-pruning technique. To prevent overfitting, it tries to stop the tree-building process early, before it produces leaves with very small samples.

10. Suppose we have a set of data with 3 classes, and we have observed 20 examples, of which the greatest number, 15, are in class c. If we predict that all future examples will be in class c, what is the expected error rate using minimum error pruning?
a) 0.304
b) 0.5
c) 0.402
d) 0.561

Answer: a
Explanation: The expected error rate is Ek = $$\frac {n - n_c + k - 1}{n + k}$$. Given n = 20, nc = 15 and k = 3:
Expected error rate Ek = $$\frac {20 - 15 + 3 - 1}{20 + 3}$$
= $$\frac {7}{23}$$
= 0.304
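The arithmetic can be checked with a short Python snippet:

```python
# Minimum error pruning: expected error if the node is pruned to the
# majority class. n = examples, n_c = majority-class count, k = classes.
n, n_c, k = 20, 15, 3
e_pruned = (n - n_c + k - 1) / (n + k)   # (20 - 15 + 3 - 1) / (20 + 3) = 7/23
print(round(e_pruned, 3))                # 0.304
```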

11. Suppose we have a set of data with 3 classes, and we have observed 20 examples, of which the greatest number, 15, are in class c. If we predict that all future examples will be in class c, what is the expected error rate without pruning?
a) 0.22
b) 0.17
c) 0.15
d) 0.05

Answer: a
Explanation: Given n = 20, nc = 15 and k = 3, the expected error rate Ek without pruning is:
Expected error rate Ek = $$\frac {n - k}{n} \left( \frac {n - n_c - 1}{n}\right) + \frac {k}{n} \left(\frac {k - 1}{2k}\right)$$
= $$\frac {20 - 3}{20} \left( \frac {20 - 15 - 1}{20}\right) + \frac {3}{20} \left( \frac {3 - 1}{2 \times 3}\right)$$
= $$\frac {17}{20}\left( \frac {4}{20}\right) + \frac {3}{20} \left( \frac {2}{6}\right)$$
= $$\frac {68}{400} + \frac {6}{120}$$
= 0.17 + 0.05
= 0.22
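Again, the computation can be verified directly:

```python
# Expected error rate without pruning, using the same n, n_c, k as Q11.
n, n_c, k = 20, 15, 3
e_unpruned = ((n - k) / n) * ((n - n_c - 1) / n) + (k / n) * ((k - 1) / (2 * k))
# = (17/20)*(4/20) + (3/20)*(2/6) = 0.17 + 0.05
print(round(e_unpruned, 2))  # 0.22
```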

12. Consider an example where the number of corrected misclassifications at a particular node is n'(t) = 15.5, and the number of corrected misclassifications for the sub-tree is n'(Tt) = 12. N(t), the number of training set examples at node t, is equal to 35. Here the tree should be pruned.
a) True
b) False

Answer: b
Explanation: We know the standard error SE = $$\sqrt {\frac {n'(T_t) (N(t) - n'(T_t))}{N(t)}}$$

= $$\sqrt {\frac {12 \times (35 - 12)}{35}}$$

= $$\sqrt {\frac {12 \times 23}{35}}$$

= 2.8
Since 12 + 2.8 = 14.8, which is less than 15.5, the sub-tree should be kept and not pruned.
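A quick check of the standard error and the resulting pruning decision:

```python
import math

# Values from Q12: node misclassifications, subtree misclassifications,
# and the number of training examples at the node.
n_t, n_Tt, N = 15.5, 12, 35
se = math.sqrt(n_Tt * (N - n_Tt) / N)   # sqrt(12 * 23 / 35)
print(round(se, 1))       # 2.8
print(n_Tt + se < n_t)    # True: subtree error + SE stays below the node
                          # error, so keep the subtree (do not prune)
```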

Sanfoundry Global Education & Learning Series – Machine Learning.

To practice all areas of Machine Learning, here is the complete set of 1000+ Multiple Choice Questions and Answers.
