K-Nearest Neighbor Algorithm Questions and Answers

This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “K-Nearest Neighbor Algorithm”.

1. Which of the following statements is false about the k-Nearest Neighbor algorithm?
a) It stores all available cases and classifies new cases based on a similarity measure
b) It has been used in statistical estimation and pattern recognition
c) It cannot be used for regression
d) The input consists of the k closest training examples in the feature space

Answer: c
Explanation: kNN is used for both classification and regression, and the input consists of the k closest training examples in the feature space. It has been used in statistical estimation and pattern recognition. It stores all available cases and classifies new cases based on a similarity measure.
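
For concreteness, here is a minimal sketch of kNN used for both classification and regression; the choice of scikit-learn and the toy data are assumptions for illustration, not part of the question.

    # Sketch: the same stored cases serve classification and regression.
    from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

    X = [[7, 6], [7, 4], [4, 4], [2, 4]]      # stored training cases
    y_class = ['Bad', 'Bad', 'Good', 'Good']  # class labels (classification)
    y_value = [0.1, 0.2, 0.8, 0.9]            # illustrative numeric targets

    clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
    reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_value)

    print(clf.predict([[3, 7]]))  # classification: a class membership
    print(reg.predict([[3, 7]]))  # regression: average of the k neighbors' values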

2. Which of the following statements is not true about k-Nearest Neighbor classification?
a) The output is a class membership
b) An object is classified by a plurality vote of its neighbors
c) If k = 1, then the object is simply assigned to the class of that single nearest neighbor
d) The output is the property value for the object

Answer: d
Explanation: In k-Nearest Neighbor classification, the output is a class membership, not the property value for the object. An object is classified by a plurality vote of its neighbors, so if k = 1, the object is simply assigned to the class of its single nearest neighbor.

3. Suppose k = 3 and data point A’s 3 nearest neighbors in the dataset are instances X, Y and Z. The table shows their classes and the computed distances. A’s predicted class using majority voting will be ‘Good’.

Neighbor    Class    Distance
X           Good     0.2
Y           Bad      0.3
Z           Bad      0.5

a) True
b) False

Answer: b
Explanation: In the majority voting approach, all votes are equal. For each class C ∈ L (the set of class labels), we count how many of the k neighbors have that class, and we return the class with the most votes. Here there are two classes, ‘Good’ and ‘Bad’, and the class ‘Bad’ has the most votes (2 of 3). So A’s predicted class using majority voting will be ‘Bad’.
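
A small sketch of this counting, with the table’s neighbors hard-coded for illustration (names and tuple layout are assumptions, not from the text):

    # Plurality (majority) vote: distances are ignored entirely.
    from collections import Counter

    neighbors = [('X', 'Good', 0.2), ('Y', 'Bad', 0.3), ('Z', 'Bad', 0.5)]

    votes = Counter(cls for _, cls, _ in neighbors)
    print(votes.most_common(1)[0][0])  # 'Bad' -- two votes beat one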

4. We have data from a survey and objective testing, with two attributes A and B, to classify whether a special paper tissue is good or not. Four training samples are given in the table below. The factory now produces a new paper tissue that passes the laboratory test with A = 3 and B = 7. With K = 3, the classification of this new tissue is ‘Good’.

A    B    C = Classification
7    6    Bad
7    4    Bad
4    4    Good
2    4    Good

a) True
b) False

Answer: a
Explanation: We have K = 3. First, compute the squared Euclidean distance from each training sample to the query instance (3, 7):

A    B    Squared distance to query instance (3, 7)
7    6    (7 – 3)² + (6 – 7)² = 17
7    4    (7 – 3)² + (4 – 7)² = 25
4    4    (4 – 3)² + (4 – 7)² = 10
2    4    (2 – 3)² + (4 – 7)² = 10

Sorting by distance, we get:

A    B    Squared distance to (3, 7)    Rank of minimum distance    Included in the 3 nearest neighbors?    C = Classification
7    6    17                            3                           Yes                                     Bad
7    4    25                            4                           No                                      Bad
4    4    10                            1                           Yes                                     Good
2    4    10                            2                           Yes                                     Good

Using the simple majority of the categories of the nearest neighbors as the prediction for the query instance, we have 2 ‘Good’ votes and 1 ‘Bad’ vote. So the new paper tissue falls in the category ‘Good’.
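
The whole calculation can be sketched in a few lines; the helper name sq_dist and the data layout are illustrative, not from the text:

    # Squared Euclidean distance, take the 3 nearest, then a majority vote.
    from collections import Counter

    train = [((7, 6), 'Bad'), ((7, 4), 'Bad'), ((4, 4), 'Good'), ((2, 4), 'Good')]
    query = (3, 7)
    k = 3

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    nearest = sorted(train, key=lambda t: sq_dist(t[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    print(votes.most_common(1)[0][0])  # 'Good' (2 Good vs 1 Bad)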

5. Suppose k = 3 and data point A’s 3 nearest neighbors in the dataset are instances X, Y and Z. The table shows their classes and the computed distances. A’s predicted class using inverse distance weighted voting will be ‘Good’.

Neighbor    Class    Distance
X           Good     0.1
Y           Bad      0.3
Z           Bad      0.5

a) True
b) False

Answer: a
Explanation: In this approach, closer neighbors get higher votes: each neighbor’s vote is taken to be the inverse of its distance to the query point q, which is known as inverse distance weighted voting.
Vote(X) = 1 / 0.1 = 10
Vote(Y) = 1 / 0.3 ≈ 3.33
Vote(Z) = 1 / 0.5 = 2
Here X (Good) gets a vote of 10, while Y (Bad) and Z (Bad) together get only 5.33. So the predicted class will be ‘Good’.
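
A short sketch of this weighting scheme, with the table’s neighbors hard-coded for illustration:

    # Inverse distance weighted voting: sum 1/distance per class.
    from collections import defaultdict

    neighbors = [('X', 'Good', 0.1), ('Y', 'Bad', 0.3), ('Z', 'Bad', 0.5)]

    weights = defaultdict(float)
    for _, cls, dist in neighbors:
        weights[cls] += 1.0 / dist            # closer neighbors get larger votes

    print(dict(weights))                      # {'Good': 10.0, 'Bad': 5.33...}
    print(max(weights, key=weights.get))      # 'Good'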

6. Which of the following statements is not true about k-Nearest Neighbor?
a) It belongs to the supervised learning domain
b) It has applications in data mining and intrusion detection
c) It is non-parametric
d) It is not an instance-based learning algorithm

Answer: d
Explanation: kNN is a supervised, non-parametric, instance-based learning algorithm; it is also called a lazy learner. kNN is used in applications such as data mining, intrusion detection, genetics, and economic forecasting.

7. Which of the following statements does not support defining k-Nearest Neighbor as a lazy learning algorithm?
a) It defers data processing until it receives a request to classify unlabeled data
b) It replies to a request for information by combining its stored training data
c) It stores all the intermediate results
d) It discards the constructed answer

Answer: c
Explanation: k-Nearest Neighbor is considered a lazy learning algorithm: it defers data processing until it receives a request to classify unlabeled data, and it replies to a request for information by combining its stored training data. Importantly, it discards the constructed answer and any intermediate results rather than storing them.

8. Which of the following statements does not support kNN being a lazy learner?
a) When it gets the training data, it does not learn and make a model
b) When it gets the training data, it just stores the data
c) It derives a discriminative function from the training data
d) It uses the training data when it actually needs to do some prediction

Answer: c
Explanation: kNN does not derive any discriminative function from the training data; it does not immediately learn a model but delays learning until a prediction is requested, which is why it is called a lazy learner. The other three statements all support kNN being a lazy learner.
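
To illustrate the point, here is a hypothetical minimal kNN class (not from the text) in which fit() builds no model and only stores the data, deferring all work to predict():

    # A lazy learner: fit() just stores; predict() does the actual work.
    from collections import Counter

    class LazyKNN:
        def __init__(self, k=3):
            self.k = k

        def fit(self, X, y):
            self.X, self.y = X, y  # just store the data -- no model is built
            return self

        def predict(self, q):
            # Rank all stored points by squared distance to the query q.
            idx = sorted(range(len(self.X)),
                         key=lambda i: sum((a - b) ** 2
                                           for a, b in zip(self.X[i], q)))
            # Majority vote among the k nearest stored labels.
            return Counter(self.y[i] for i in idx[:self.k]).most_common(1)[0][0]

    model = LazyKNN(k=3).fit([[7, 6], [7, 4], [4, 4], [2, 4]],
                             ['Bad', 'Bad', 'Good', 'Good'])
    print(model.predict([3, 7]))  # 'Good'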

9. Euclidean distance and Manhattan distance are the same in the kNN algorithm for calculating the distance.
a) True
b) False

Answer: b
Explanation: Both Euclidean distance and Manhattan distance are used to calculate the distance between two points, but they are not the same. Euclidean distance is the square root of the sum of the squared differences of the coordinates, while Manhattan distance is the sum of the absolute differences of the coordinates.
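
A quick sketch contrasting the two metrics on an illustrative pair of points:

    # Two different metrics over the same pair of points.
    import math

    def euclidean(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def manhattan(p, q):
        return sum(abs(a - b) for a, b in zip(p, q))

    print(euclidean((0, 0), (3, 4)))  # sqrt(9 + 16) = 5.0
    print(manhattan((0, 0), (3, 4)))  # |3| + |4| = 7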

10. What is the Manhattan distance between a data point (9, 7) and a new query instance (3, 4)?
a) 7
b) 9
c) 3
d) 4

Answer: b
Explanation: Manhattan distance is the sum of the absolute differences of the coordinates. Let the data point be (x1, y1) = (9, 7) and the query instance be (x2, y2) = (3, 4).
Manhattan distance, d = |x1 – x2| + |y1 – y2|
= |9 – 3| + |7 – 4|
= 6 + 3
= 9
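
A one-line check of this arithmetic, using a Manhattan distance helper like the one sketched earlier (the function is illustrative):

    def manhattan(p, q):
        # Sum of absolute coordinate differences.
        return sum(abs(a - b) for a, b in zip(p, q))

    print(manhattan((9, 7), (3, 4)))  # |9 - 3| + |7 - 4| = 9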
