Min Hash Multiple Choice Questions and Answers (MCQs)

This set of Data Structures & Algorithms Multiple Choice Questions & Answers (MCQs) focuses on “Min Hash”.

1. Which technique is used for finding similarity between two sets?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash, also known as the min-wise independent permutations scheme, is a technique used in computer science and data mining to quickly estimate how similar two sets are.

2. Who invented the MinHash technique?
a) Weiner
b) Samuel F. B. Morse
c) Friedrich Clemens Gerke
d) Andrei Broder

Answer: d
Explanation: MinHash, a technique for quickly estimating the similarity between two sets, was invented by Andrei Broder in 1997.

3. Which technique was first used to remove duplicate web pages from search results in the AltaVista search engine?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash quickly estimates the similarity between two sets. It was first used in the AltaVista search engine to detect duplicate web pages and remove them from search results.

4. Which technique has been applied to clustering documents by the similarity of their sets of words?
a) MinHash
b) Stack
c) Priority Queue
d) PAT Tree

Answer: a
Explanation: MinHash quickly estimates the similarity between two sets. Beyond duplicate detection, it has been applied to large-scale clustering problems, such as clustering documents by the similarity of their sets of words.

5. Which indicator is used to measure the similarity between two sets?
a) Rope Tree
b) Jaccard Coefficient
c) Tango Tree
d) MinHash Coefficient

Answer: b
Explanation: The Jaccard coefficient (also called the Jaccard index) is the indicator of similarity between two sets. MinHash is the technique that estimates this coefficient quickly, without computing the intersection and union explicitly.

6. Which of the following is defined as the ratio of the number of elements in the intersection of two sets to the number of elements in their union?
a) Rope Tree
b) Jaccard Coefficient Index
c) Tango Tree
d) MinHash Coefficient

Answer: b
Explanation: The Jaccard index of two sets A and B is defined as J(A, B) = |A ∩ B| / |A ∪ B|, that is, the number of elements in their intersection divided by the number of elements in their union. MinHash gives a quick estimate of this value.
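
To make the definition concrete, here is a minimal Python sketch that computes the Jaccard index exactly. The function name `jaccard` and the choice to return 1.0 for two empty sets are assumptions made for this illustration, not part of the quiz.

```python
def jaccard(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is undefined.
        # Returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return len(a & b) / len(union)


print(jaccard({1, 2, 3, 4}, {3, 4, 5, 6}))  # 2 shared out of 6 distinct elements
print(jaccard({1, 2}, {3, 4}))              # disjoint sets -> 0.0
print(jaccard({1, 2}, {1, 2}))              # identical sets -> 1.0
```

The index always lies between 0 (nothing in common) and 1 (identical sets).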

7. What is the value of the Jaccard index when the two sets are disjoint?
a) 1
b) 2
c) 3
d) 0

Answer: d
Explanation: The Jaccard index is the number of elements in the intersection of two sets divided by the number of elements in their union. Disjoint sets have an empty intersection, so the numerator, and therefore the index, is 0.

8. When do two sets have relatively more members in common?
a) Jaccard index is closer to 1
b) Jaccard index is closer to 0
c) Jaccard index is closer to -1
d) Jaccard index is farther from 1

Answer: a
Explanation: The Jaccard index ranges from 0 to 1. It is 0 when the two sets are disjoint and 1 when they are identical, so the closer the index is to 1, the larger the share of members the two sets have in common.

9. What is the expected error when estimating the Jaccard index using the MinHash scheme with k different hash functions?
a) O(log k!)
b) O(k!)
c) O(k²)
d) O(1/√k)

Answer: d
Explanation: With k hash functions, the MinHash estimate of the Jaccard index is the fraction of hash functions for which the two sets have the same minimum hash value. This is an average of k zero-or-one random variables, so its expected error is O(1/√k).
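
The scheme with k hash functions can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the helper names (`seeded_hash`, `minhash_signature`, `estimate_jaccard`) are hypothetical, and salted BLAKE2b digests stand in for k independent random hash functions.

```python
import hashlib


def seeded_hash(item, seed):
    """64-bit hash of `item` under the hash function numbered `seed`.

    Assumption: a salted BLAKE2b digest behaves like an independent
    random hash function for each distinct seed.
    """
    digest = hashlib.blake2b(
        str(item).encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, k):
    """For each of k hash functions, keep the minimum hash over a non-empty set."""
    return [min(seeded_hash(x, seed) for x in items) for seed in range(k)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of hash functions whose minimum values agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


a = set(range(0, 200))
b = set(range(100, 300))  # exact Jaccard index is 100 / 300
k = 400
estimate = estimate_jaccard(minhash_signature(a, k), minhash_signature(b, k))
print(f"estimate with k={k}: {estimate:.3f}, exact: {100 / 300:.3f}")
```

Each signature takes O(nk) hash evaluations for a set of n elements, which is the cost the single-hash variant avoids.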

10. How many hashes are needed to estimate the Jaccard index with an expected error less than or equal to 0.05?
a) 100
b) 200
c) 300
d) 400

Answer: d
Explanation: The expected error when estimating the Jaccard index with k hash functions is O(1/√k). Requiring 1/√k ≤ 0.05 gives k ≥ 1/0.05² = 400, so 400 hashes are needed.
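
The same arithmetic can be checked for other error targets with a short Python sketch. It assumes the constant hidden in O(1/√k) is taken to be 1, as the answer above does, and the function name `hashes_needed` is hypothetical. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
import math


def hashes_needed(max_error):
    """Smallest k with 1/sqrt(k) <= max_error, i.e. k >= 1 / max_error**2.

    `max_error` is passed as a string such as "0.05" so that it is
    parsed exactly. Assumes the hidden constant in O(1/sqrt(k)) is 1.
    """
    eps = Fraction(max_error)
    return math.ceil(1 / eps**2)


for target in ("0.1", "0.05", "0.01"):
    print(target, hashes_needed(target))
# 0.1 100
# 0.05 400
# 0.01 10000
```

Halving the error target quadruples the number of hashes, which follows directly from the 1/√k relationship.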

11. In the single-hash variant of MinHash, what is the expected error of the estimator, as given by Chernoff bounds for sampling without replacement?
a) O(log k!)
b) O(k!)
c) O(k²)
d) O(1/√k)

Answer: d
Explanation: The single-hash variant keeps the k smallest values of one hash function instead of the minimum of k hash functions. Its sample is drawn without replacement, and by standard Chernoff bounds for sampling without replacement the estimator has expected error O(1/√k), matching the variant with k hash functions.

12. What is the time required by the single-hash variant to maintain the queue of minimum hash values for a set of n elements?
a) O(log n!)
b) O(n!)
c) O(n²)
d) O(n)

Answer: d
Explanation: For a set of n elements, the variant with k hash functions takes O(nk) time. The single-hash variant is generally faster: it requires O(n) time to maintain the queue of minimum hash values, assuming n is much larger than k.
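
A minimal Python sketch of the single-hash variant is shown below. The names `h`, `bottom_k`, and `estimate_jaccard` are hypothetical, a BLAKE2b digest is assumed to act as the single random hash function, and a bounded heap from the standard `heapq` module plays the role of the queue of minimum hash values. Each heap update costs O(log k), which is treated as constant when n is much larger than k.

```python
import hashlib
import heapq


def h(item):
    """Single 64-bit hash function (assumed to behave like a random hash)."""
    digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_k(items, k):
    """Return the k smallest hash values of the set, sorted ascending."""
    heap = []  # max-heap via negation: holds the k smallest hashes seen so far
    for x in set(items):
        v = h(x)
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:               # smaller than the largest value kept
            heapq.heapreplace(heap, -v)  # drop the largest, keep v
    return sorted(-v for v in heap)


def estimate_jaccard(sketch_a, sketch_b, k):
    """Estimate the Jaccard index from two bottom-k sketches of non-empty sets."""
    # The k smallest hashes of the union can be recovered from the two sketches.
    union_sketch = sorted(set(sketch_a) | set(sketch_b))[:k]
    shared = set(sketch_a) & set(sketch_b)
    hits = sum(1 for v in union_sketch if v in shared)
    return hits / len(union_sketch)


a = range(0, 2000)
b = range(1000, 3000)  # exact Jaccard index is 1000 / 3000
k = 400
estimate = estimate_jaccard(bottom_k(a, k), bottom_k(b, k), k)
print(f"single-hash estimate with k={k}: {estimate:.3f}")
```

Only one hash is computed per element, so building a sketch touches each of the n elements once.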

13. How many bits are needed to specify a single permutation from a min-wise independent family?
a) O(log n!)
b) O(n!)
c) Ω(n²)
d) Ω(n)

Answer: d
Explanation: A min-wise independent family of permutations on n elements must contain at least lcm(1, 2, ..., n) ≥ e^(n − o(n)) different permutations, so Ω(n) bits are needed to specify a single permutation from the family.

14. MinHash has been used as a tool for association rule learning.
a) True
b) False

Answer: a
Explanation: MinHash was originally used to remove duplicate web pages from search engine results. In data mining, Cohen et al. (2001) used MinHash as a tool for association rule learning.

15. Google conducted a large-scale evaluation comparing the performance of two techniques, MinHash and SimHash.
a) True
b) False

Answer: a
Explanation: In 2006, Google conducted a large-scale evaluation comparing the performance of the MinHash and SimHash algorithms for detecting near-duplicate web pages.

Sanfoundry Global Education & Learning Series – Data Structure.
