# Machine Learning Questions and Answers – Empirical Minimization Framework

This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “Empirical Minimization Framework”.

1. The true error is available to the learner.
a) True
b) False

Explanation: The true error is defined with respect to the probability distribution that generates the dataset instances and with respect to the labeling function. Neither of these is available to the learner, so the learner cannot compute the true error.
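Formally (using the standard notation of statistical learning theory, which is an assumption here rather than something stated in the question), the true error of a hypothesis h with respect to a distribution D and labeling function f is the probability of misclassifying a randomly drawn instance:

```latex
L_{D,f}(h) \;=\; \Pr_{x \sim D}\bigl[\, h(x) \neq f(x) \,\bigr]
```

Since the learner knows neither D nor f, this quantity cannot be evaluated directly.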

2. What is one of the drawbacks of Empirical Risk Minimization?
a) Underfitting
b) Both Overfitting and Underfitting
c) Overfitting
d) No drawbacks

Explanation: Empirical Risk Minimization makes the learner output the predictor that gives the minimum error on the training set. This often yields a predictor that is tailored specifically to the training data and therefore fails to be accurate on the test set. This is overfitting.
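A minimal sketch of this failure mode, using a hypothetical predictor that simply memorizes its training set: it achieves zero training error (so unrestricted ERM would happily select it) while doing no better than a constant guess on unseen data.

```python
import random

random.seed(0)

# Toy data: x is an integer, the true label is its parity (x % 2).
train = [(x, x % 2) for x in random.sample(range(1000), 20)]
test = [(x, x % 2) for x in random.sample(range(1000, 2000), 20)]

memory = dict(train)  # memorize every training example

def memorizing_predictor(x):
    # Returns the memorized label if x was seen in training,
    # otherwise falls back to a fixed guess of 0.
    return memory.get(x, 0)

train_error = sum(memorizing_predictor(x) != y for x, y in train) / len(train)
test_error = sum(memorizing_predictor(x) != y for x, y in test) / len(test)
# train_error is exactly 0.0, yet on the disjoint test set the predictor
# errs on every instance whose true label is 1.
```

The training error alone gives no hint of the problem, which is why restricting the hypothesis class (questions 7 and 8 below) matters.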

3. The error available to the learner is ______
a) true error
b) error of the classifier
c) training error
d) testing error

Explanation: The learner only knows the error it incurred over the training set instances; this training error is what it minimizes to produce its predictor. The predictor is then applied to the test set to generate a test error. The true error, by contrast, is the probability of misclassifying an instance selected at random from the distribution, and it is not available to the learner.
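The training error is just the fraction of training instances the predictor misclassifies, so it is directly computable from the sample alone. A minimal sketch (the threshold predictor and the sample are hypothetical):

```python
def empirical_error(predictor, sample):
    """Fraction of labeled examples (x, y) that the predictor misclassifies.

    This is the only error the learner can compute directly: it requires
    no knowledge of the underlying distribution or the labeling function.
    """
    return sum(predictor(x) != y for x, y in sample) / len(sample)

# Example: a sign-threshold predictor on a tiny labeled sample.
sample = [(-2, 0), (-1, 0), (1, 1), (3, 1), (-3, 1)]
threshold_predictor = lambda x: 1 if x > 0 else 0
print(empirical_error(threshold_predictor, sample))  # 0.2 (one mistake: x = -3)
```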

4. Which is the more desirable way to reduce overfitting?
a) Giving an upper bound to the size of the training set
b) Making the test set larger than the training set
c) Giving an upper bound to the accuracy obtained on the training set
d) Overfitting cannot be reduced

Explanation: The more training set examples there are, the more specific the predictor becomes to the training set; hence bounding the training set size can reduce overfitting. Making the test set larger than the training set would lead to underfitting, which is not desirable. Giving an upper bound on training accuracy could stop the learner abruptly at a premature stage.

5. What is the relation between Empirical Risk Minimization and Training Error?
a) ERM tries to maximize training error
b) ERM tries to minimize training error
c) It depends on the dataset
d) ERM is not concerned with training error

Explanation: ERM makes the learner produce a predictor that works well on the training data (the data available to the learner). Its aim is to minimize that error: the lower the training error, the better the predictor (overfitting aside).

6. What happens due to overfitting?
a) Hypothesis works poorly on training data but works well on test data
b) Hypothesis works well on training data and works well on test data
c) Hypothesis works well on training data but works poorly on test data
d) Hypothesis works poorly on training data and works poorly on test data

Explanation: ERM tries to minimize the training error. This often leads the learner to produce a hypothesis that is too specific to the training data and therefore performs poorly on any other data set. This is overfitting.

7. What is assumed while using empirical risk minimization with inductive bias?
a) The learner has some prior knowledge about training data
b) The learner has some knowledge about labeling function
c) Reduction of overfitting may lead to underfitting

Explanation: The learner must choose a hypothesis from H, a restricted hypothesis class. Since this choice is made before seeing the training set, the learner needs some prior knowledge about the data.

8. The hypothesis space H for inductive bias is a finite class.
a) False
b) True

Explanation: The hypothesis space H contains a finite number of hypotheses, and the learner is restricted to choosing among only these. If the hypothesis space were not finite, there would be no such restriction.
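With a finite class, ERM reduces to evaluating the empirical risk of every hypothesis and returning a minimizer. A sketch under assumed toy definitions (the threshold classifiers and the sample are illustrative, not from the questions):

```python
def erm(hypothesis_class, sample):
    """Return a hypothesis from the finite class with minimal training error."""
    def empirical_risk(h):
        return sum(h(x) != y for x, y in sample) / len(sample)
    return min(hypothesis_class, key=empirical_risk)

# Hypothetical finite class H: threshold classifiers h_t(x) = [x >= t].
def make_threshold(t):
    return lambda x: int(x >= t)

H = [make_threshold(t) for t in range(-5, 6)]
sample = [(-4, 0), (-2, 0), (1, 1), (3, 1), (5, 1)]
h_star = erm(H, sample)
# h_star classifies this sample perfectly: any threshold in (-2, 1] works.
```

Restricting the learner to H is exactly the inductive bias of questions 7 and 8: the exhaustive search over H is what makes the finiteness assumption matter.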

9. The assumption that the training set instances are independently and identically distributed is known as the __________
a) empirical risk assumption
b) inductive bias assumption
c) i.i.d. assumption
d) training set rule

Explanation: i.i.d. stands for independently and identically distributed. The instances do not depend on one another, and each is assumed to be drawn from the same underlying distribution.

10. Delta is the __________ parameter of the prediction.
a) training
b) confidence
c) accuracy
d) computing

Explanation: The confidence parameter is used to state that the chosen hypothesis will give a successful outcome with probability at least (1 – delta).
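For context (this standard result from statistical learning theory for finite hypothesis classes under the realizable assumption is an addition, not part of the question): with accuracy parameter epsilon and confidence parameter delta, a sample of size

```latex
m \;\geq\; \frac{\log\bigl(|\mathcal{H}|/\delta\bigr)}{\epsilon}
\quad\Longrightarrow\quad
\Pr\bigl[\, L_{D,f}(h_S) \leq \epsilon \,\bigr] \;\geq\; 1 - \delta
```

suffices for the ERM hypothesis h_S to have true error at most epsilon with probability at least 1 – delta, which is precisely the role of delta described above.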

Sanfoundry Global Education & Learning Series – Machine Learning.

To practice all areas of Machine Learning, here is the complete set of 1000+ Multiple Choice Questions and Answers.
