This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “Support Vector Machines”.

1. A Support Vector Machine (SVM) is a discriminative classifier defined by a separating hyperplane.

a) True

b) False

Answer: a

Explanation: A Support Vector Machine (SVM) is a discriminative classifier defined by a separating hyperplane. Given labeled training data, the algorithm outputs an optimal hyperplane which categorises new examples. In two dimensions, a hyperplane is a line dividing the plane into two parts, with each class lying on either side.
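
The decision rule above can be sketched in a few lines. This is a minimal illustration (not part of the question set): a fixed, hypothetical hyperplane w·x + b = 0 in 2-D, classifying points by which side they fall on.

```python
# Illustrative sketch: classify 2-D points with a fixed separating hyperplane
# w·x + b = 0. The weight and bias values are hypothetical, chosen so the
# boundary is the line y = x.
w = [1.0, -1.0]   # normal vector of the hyperplane
b = 0.0           # bias term

def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

print(classify([2.0, 1.0]))   # point below the line y = x -> 1
print(classify([1.0, 3.0]))   # point above the line y = x -> -1
```

In a trained SVM, w and b would come from the optimisation described in the later questions rather than being fixed by hand.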

2. Support vector machines cannot be used for regression.

a) False

b) True

Answer: a

Explanation: The Support Vector Machine (SVM) is both a classification and a regression prediction tool. SVMs are a popular set of supervised learning algorithms originally developed for classification (categorical target) problems and later extended to regression (numerical target) problems.

3. Which of the following statements is not true about SVM?

a) It is memory efficient

b) It can address a large number of predictor variables

c) It is versatile

d) It doesn’t require feature scaling

Answer: d

Explanation: SVM requires feature scaling, so variables must be scaled before training. SVMs are memory efficient, can address a large number of predictor variables, and are versatile since they support many different kernel functions.
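
A common form of feature scaling is standardisation (z-scoring). The sketch below, with made-up feature values, shows how it puts a small-range feature and a large-range feature on the same scale; real pipelines would typically use a library utility such as scikit-learn's StandardScaler instead.

```python
# Hedged sketch of feature scaling for SVM: standardise each feature to
# zero mean and unit variance so no feature dominates the distance/kernel
# computations. Pure-Python z-score over a single column of values.
def standardize(column):
    n = len(column)
    mean = sum(column) / n
    var = sum((v - mean) ** 2 for v in column) / n
    return [(v - mean) / var ** 0.5 for v in column]

ages = [20.0, 30.0, 40.0]              # small-range feature (hypothetical)
incomes = [20000.0, 50000.0, 80000.0]  # large-range feature (hypothetical)
print(standardize(ages))      # roughly [-1.2247, 0.0, 1.2247]
print(standardize(incomes))   # same scaled values: the range gap is gone
```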

4. Which of the following statements is not true about SVM?

a) It has regularization capabilities

b) It handles non-linear data efficiently

c) It has much improved stability

d) Choosing an appropriate kernel function is easy

Answer: d

Explanation: Choosing an appropriate kernel function is not an easy task; it can be tricky and complex. With a high-dimensional kernel, you might generate too many support vectors, which reduces training speed. The other three statements are advantages of SVM.

5. Minimizing a quadratic objective function \(\sum_{i=1}^{n} w_i^2\) subject to certain constraints is known as the primal formulation of linear SVMs.

a) True

b) False

Answer: a

Explanation: Minimizing a quadratic objective function \(\sum_{i=1}^{n} w_i^2\) subject to certain constraints is known as the primal formulation of linear SVMs. It is an SVM optimisation problem: a convex quadratic programming problem with n variables, where n is the number of features in the data set.

6. Given a primal problem f*, minimizing \(x^2\) subject to x >= b, and a dual problem d*, maximizing d(α) subject to α >= 0. Then d* = f* if f is convex and x*, α* satisfy zero gradient, primal feasibility, dual feasibility, and complementary slackness.

a) True

b) False

Answer: a

Explanation: Given a primal problem f*, minimizing \(x^2\) subject to x >= b, and a dual problem d*, maximizing d(α) subject to α >= 0, we have d* = f* if f is convex and x*, α* satisfy zero gradient, primal feasibility, dual feasibility and complementary slackness. These are the Karush–Kuhn–Tucker (KKT) conditions.

7. Which of the following statements is not true about dual formulation in SVM optimisation problem?

a) No need to access data, need to access only dot products

b) Number of free parameters is bounded by the number of support vectors

c) Number of free parameters is bounded by the number of variables

d) Regularizing the sparse support vector associated with the dual hypothesis is sometimes more intuitive than regularizing the vector of regression coefficients

Answer: c

Explanation: In the dual formulation of the SVM optimisation problem, the number of free parameters is bounded by the number of support vectors, not by the number of variables. The other three statements are benefits of the dual formulation.
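
The "need to access only dot products" point is what enables the kernel trick: the dual problem only ever sees the Gram matrix K[i][j] = x_i · x_j, so any kernel can be swapped in for the dot product. A small sketch with hypothetical toy data:

```python
# The dual SVM needs only pairwise dot products (the Gram matrix), so a
# kernel function can replace the dot product without ever forming the
# high-dimensional feature mapping explicitly. Toy data, hypothetical values.
X = [[1.0, 2.0], [3.0, 0.0], [0.0, 1.0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# linear Gram matrix: K[i][j] = x_i . x_j
K_linear = [[dot(u, v) for v in X] for u in X]

# polynomial kernel (x.y + 1)^2: same interface, no explicit feature map
K_poly = [[(dot(u, v) + 1) ** 2 for v in X] for u in X]

print(K_linear[0])   # [5.0, 3.0, 2.0]
print(K_poly[0])     # [36.0, 16.0, 9.0]
```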

8. The optimal classifier is the one with the largest margin.

a) True

b) False

Answer: a

Explanation: Assuming all samples are correctly classified, we want the data points to be as far from the decision boundary as possible. The margin measures the distance from the data samples to the separating hyperplane, so the optimal classifier is the one with the largest margin.
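
The distance from a point to the hyperplane, which the margin is built from, is |w·x + b| / ||w||. A small sketch with hypothetical values for w and b:

```python
# Illustrative sketch: geometric distance of a point x from the hyperplane
# w·x + b = 0 is |w·x + b| / ||w||. The hyperplane below is hypothetical.
import math

def margin(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(score) / norm_w

w, b = [3.0, 4.0], -5.0        # ||w|| = 5
print(margin(w, b, [1.0, 2.0]))   # |3 + 8 - 5| / 5 = 1.2
print(margin(w, b, [0.0, 0.0]))   # |-5| / 5 = 1.0
```

The SVM chooses w and b so that the smallest such distance over the training points is as large as possible.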

9. Suppose we have an equality optimization problem as follows: Minimize f(x, y) = x + 2y subject to x^{2} + y^{2} – 4 = 0. While solving the above equation we get x = ± \(\frac {2}{\sqrt 5}\), y = ± \(\frac {4}{\sqrt 5}\), λ = ± \(\frac {\sqrt 5}{4}\). At what value of x and y does the function f(x, y) has its minimum value?

a) –\(\frac {2}{\sqrt 5}, – \frac {4}{\sqrt 5}\)

b) \(\frac {2}{\sqrt 5}, – \frac {4}{\sqrt 5}\)

c) –\(\frac {2}{\sqrt 5}, \frac {4}{\sqrt 5}\)

d) \(\frac {2}{\sqrt 5}, \frac {4}{\sqrt 5}\)

Answer: a

Explanation: When x = –\(\frac {2}{\sqrt 5}\), y = –\(\frac {4}{\sqrt 5}\) and λ = ±\(\frac {\sqrt 5}{4}\),

f(x, y, λ) = x + 2y + λ(\(x^2 + y^2\) – 4)

= –\(\frac {2}{\sqrt 5} - \frac {8}{\sqrt 5} ± \frac {\sqrt 5}{4} (\frac {4}{5} + \frac {16}{5} - 4)\)

= –\(\frac {10}{\sqrt 5} ± \frac {\sqrt 5}{4}\) (4 – 4)

= –\(\frac {10}{\sqrt 5} ± \frac {\sqrt 5}{4}\) × 0

= –\(\frac {10}{\sqrt 5}\)

Similarly when x = \(\frac {2}{\sqrt 5}\), y = \(\frac {4}{\sqrt 5}\) and λ = ± \(\frac {\sqrt 5}{4}\),

f(x, y, λ) = \(\frac {10}{\sqrt 5}\)

When x = –\(\frac {2}{\sqrt 5}\), y = \(\frac {4}{\sqrt 5}\) and λ = ± \(\frac {\sqrt 5}{4}\)

f(x, y, λ) = \(\frac {6}{\sqrt 5}\)

When x = \(\frac {2}{\sqrt 5}\), y = –\(\frac {4}{\sqrt 5}\) and λ = ± \(\frac {\sqrt 5}{4}\)

f(x, y, λ) = –\(\frac {6}{\sqrt 5}\)

So the function f(x, y) has its minimum value (-\(\frac {10}{\sqrt 5}\)) at x = –\(\frac {2}{\sqrt 5}\) and y = –\(\frac {4}{\sqrt 5}\).
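
The case analysis above can be cross-checked numerically: evaluate f(x, y) = x + 2y at the four sign combinations of (2/√5, 4/√5) and take the smallest value.

```python
# Numeric check of the candidate points: evaluate f(x, y) = x + 2y at the
# four sign combinations of (2/sqrt(5), 4/sqrt(5)) and pick the minimum.
import math

s5 = math.sqrt(5)
candidates = [(sx * 2 / s5, sy * 4 / s5) for sx in (-1, 1) for sy in (-1, 1)]
values = {(x, y): x + 2 * y for x, y in candidates}
(best_x, best_y), best_f = min(values.items(), key=lambda kv: kv[1])
print(best_x, best_y, best_f)
# approx -0.8944, -1.7889, -4.4721, i.e. (-2/sqrt(5), -4/sqrt(5), -10/sqrt(5))
```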

10. Suppose we have an equality optimization problem as follows: Minimize f(x, y) = x + y subject to x^{2} + y^{2} – 2 = 0. While solving the above equation what will be the value x, y and λ?

a) ±1, ±1, ± \(\frac {1}{2}\)

b) ±2, ±1, ± \(\frac {1}{2}\)

c) ±1, ±2, ± \(\frac {1}{2}\)

d) ±\(\frac {1}{2}\), ±\(\frac {1}{2}\), ± 1

Answer: a

Explanation: We know the Lagrangian L(x, y, λ) = x + y + λ(\(x^2 + y^2\) − 2)

\(\frac {\delta L}{\delta x}\) = 1 + 2λx = 0

\(\frac {\delta L}{\delta y}\) = 1 + 2λy = 0

\(\frac {\delta L}{\delta \lambda}\) = \(x^2 + y^2\) − 2 = 0

By solving the above three equations we get x = ±1, y = ±1 and λ = ±\(\frac {1}{2}\).
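
That solution can be cross-checked by direct substitution: the first two stationarity equations give x = y = −1/(2λ), and the constraint then forces λ² = 1/4.

```python
# Cross-check of the derivation: from 1 + 2λx = 0 and 1 + 2λy = 0 we get
# x = y = -1/(2λ); the constraint 2 * (1/(2λ))^2 = 2 then gives λ^2 = 1/4.
solutions = []
for lam in (0.5, -0.5):            # the two roots of λ^2 = 1/4
    x = y = -1 / (2 * lam)
    # verify all three Lagrangian conditions exactly
    assert 1 + 2 * lam * x == 0
    assert 1 + 2 * lam * y == 0
    assert x ** 2 + y ** 2 - 2 == 0
    solutions.append((x, y, lam))
print(solutions)   # [(-1.0, -1.0, 0.5), (1.0, 1.0, -0.5)]
```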

11. Suppose we have an equality optimization problem as follows: Minimize f(x, y) = x + 2y subject to x^{2} + y^{2} – 9 = 0. What will be the value of x, y and λ?

a) ± \(\frac {3}{\sqrt 5}\), ±\(\frac {6}{\sqrt 5}\), ±\(\frac {\sqrt 5}{6}\)

b) ± \(\frac {9}{5}\), ±\(\frac {6}{5}\), ±\(\frac {5}{6}\)

c) ± \(\frac {9}{\sqrt 5}\), ±\(\frac {6}{5}\), ±\(\frac {5}{6}\)

d) ± \(\frac {3}{5}\), ±\(\frac {6}{5}\), ±\(\frac {\sqrt 5}{6}\)

Answer: a

Explanation: We know the Lagrangian L(x, y, λ) = x + 2y + λ(\(x^2 + y^2\) – 9).

\(\frac {\delta L}{\delta x}\) = 1 + 2λx = 0

\(\frac {\delta L}{\delta y}\) = 2 + 2λy = 0

\(\frac {\delta L}{\delta \lambda}\) = \(x^2 + y^2\) – 9 = 0

By solving the above three equations we get x = ± \(\frac {3}{\sqrt 5}\), y = ± \(\frac {6}{\sqrt 5}\) and λ = ± \(\frac {\sqrt 5}{6}\).
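
As with the previous question, the stated roots can be verified numerically: the stationarity equations give x = −1/(2λ) and y = −1/λ, and both signs of λ = ±√5/6 should satisfy all three equations.

```python
# Numeric verification of the stated roots: for λ = ±sqrt(5)/6,
# x = -1/(2λ) and y = -1/λ must satisfy all three Lagrangian equations.
import math

for lam in (math.sqrt(5) / 6, -math.sqrt(5) / 6):
    x, y = -1 / (2 * lam), -1 / lam
    assert abs(1 + 2 * lam * x) < 1e-12      # dL/dx = 0
    assert abs(2 + 2 * lam * y) < 1e-12      # dL/dy = 0
    assert abs(x ** 2 + y ** 2 - 9) < 1e-12  # constraint holds
    print(round(x, 4), round(y, 4), round(lam, 4))
# prints +-(3/sqrt(5), 6/sqrt(5)) = +-(1.3416, 2.6833) with λ = -+0.3727
```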

**More MCQs on Support Vector Machines:**

- Support Vector Machines MCQ (Set 2)
- Support Vector Machines MCQ (Set 3)
- Support Vector Machines MCQ (Set 4)

**Sanfoundry Global Education & Learning Series – Machine Learning**.

To practice all areas of Machine Learning, **here is the complete set of 1000+ Multiple Choice Questions and Answers**.

**If you find a mistake in question / option / answer, kindly take a screenshot and email to [email protected]**