Data Mining Questions and Answers – What Kind of Patterns can be Mined – Set 2

This set of Data Mining Multiple Choice Questions & Answers (MCQs) focuses on “What Kind of Patterns can be Mined – Set 2”.

1. Given below is the diagram showing decision tree. What would be the class of a person who is 23 years old and owns a car?
data-mining-questions-answers-questions-answers-what-kind-patterns-can-mined-2-q1

a) Class A
b) Class B
c) Class C
d) Class D
View Answer

Answer: b
Explanation: The very first rectangle checks his age i.e. 25 and places him under the category of Young and the rectangle below the Youth arrow checks whether or not the person owns a Vehicle since he owns a car he is put in Class B.

2. Which among the following methods is not used for constructing a neural network?
a) Bayesian Classification
b) Support Vector Machines
c) Stemming Algorithms
d) K-nearest neighbor Classification
View Answer

Answer: c
Explanation: Stemming Algorithm is used to remove the unwanted letters which are not relevant for extracting the information from the database. A neural network is a collection of neuron like processing units with weighted connections between the units. Bayesian Classification, Support vector Machines, K-nearest neighbor Classification are the methods used for constructing neural networks.
advertisement
advertisement

3. Which values are generally modeled by Regression Analysis?
a) Continuous Valued Functions
b) Discrete Class Labels
c) Unordered Class Labels
d) Distorted Class Labels
View Answer

Answer: a
Explanation: Classification predicts categorical (discrete, unordered) labels, regression models continuous-valued functions. That is, regression is used to predict missing or unavailable numerical data values rather than (discrete) class labels.

4. Which among the given methods are preceded by Relevance Analysis?
a) Characterization and Discrimination
b) Association and Correlation
c) Classification and Regression
d) Clustering and Outlier Analysis
View Answer

Answer: c
Explanation: Classification and Regression may need to be preceded by relevance analysis, which attempts to identify attributes that are significantly relevant to the Classification and Regression process.

5. In which phase of data segregation is the clustering used?
a) Intermediate Phase
b) First Phase
c) Last Phase
d) Second Phase
View Answer

Answer: b
Explanation: Clustering analyzes data objects without consulting class labels. In many cases, class labeled data may simply not exist at the beginning. Clustering can be used to generate class labels for a group of data.

6. On which principles are the objects clustered or grouped?
a) Maximizing intraclass similarity
b) Minimizing intraclass similarity
c) Maximizing interclass similarity
d) Maximizing intraclass dissimilarity
View Answer

Answer: a
Explanation: The objects are clustered or grouped based on the principle of maximizing intraclass similarity and minimizing interclass similarity. Objects within the cluster have high similarity to the objects which are outside the cluster or another cluster.
advertisement

7. Outliers are the objects that do comply with the general behavior of the Class/Model.
a) True
b) False
View Answer

Answer: b
Explanation: A data set may contain objects that do not comply with the general behavior or model of the data. These data objects are Outliers. Many data mining methods discard outliers as noise and exceptions, but these are interesting in some domains.

8. Can the outliers be detected from the cluster analysis?
a) Yes
b) No
View Answer

Answer: a
Explanation: Outliers may be detected using statistical tests that assume a distribution or probability model for the data, or using distance measures where objects that are remote from any other cluster are considered outliers.
advertisement

9. Can the data mining generate all interesting patterns?
a) Yes
b) No
View Answer

Answer: b
Explanation: An ideal data mining system would generate all interesting patterns and only the interesting patterns. But the current data mining systems predict many other patterns along with the interesting ones and only the interesting ones.

10. Are all of the patterns generated by data mining system interesting?
a) Yes
b) No
View Answer

Answer: a
Explanation: The answer is no, only a small fraction of the patterns potentially generated would actually be of interest to a given user. And there are methods to predict whether or not a pattern generated is interesting. These measures are called interesting measures.

11. With what probabilistic measure can the Support measure be found for association rules of the form X ⇒Y ?
a) P(X U Y)
b) P(X ꓵ Y)
c) P (X|Y)
d) P (Y|X)
View Answer

Answer: a
Explanation: An objective measure for association rules of the form X ⇒Y is rule support, representing the percentage of transactions from a transaction database that the given rule satisfies. This is taken to be the probability P (X∪Y), where X∪Y indicates that a transaction contains both X and Y.

12. With what probabilistic measure can the Confidence measure be found for association rules of the form X ⇒Y ?
a) P(X U Y)
b) P(X ꓵ Y)
c) P (X|Y)
d) P (Y|X)
View Answer

Answer: c
Explanation: One of the objective measure for association rules is Confidence, which assesses the degree of certainty of the detected association. This is taken to be the conditional probability P(Y|X), that is, the probability that a transaction containing X also contains Y.

13. What are the objective interesting measures for Classification (If Then) rules?
a) Support and Confidence
b) Coverage and Confidence
c) Support and Accuracy
d) Accuracy and Coverage
View Answer

Answer: d
Explanation: The objective interesting measures for Association rules are Support and Confidence and the objective interesting measures for Classification (IF- Then) rules are Accuracy and Coverage.

14. Which among the following is not the interesting pattern as far the System perspective of the Interesting patterns are concerned?
a) Novel
b) Easily understood by Humans
c) Valid
d) Unexpected
View Answer

Answer: d
Explanation: A pattern is interesting as per the System Perspective if it is (1) easily understood by humans, (2) valid on new or test data with some degree of certainty, (3) potentially useful, and (4) novel.

15. Which among the following is not the interesting pattern as far the user perspective of the Interesting patterns are concerned?
a) Completeness
b) Actionable
c) Valid
d) Unexpected
View Answer

Answer: c
Explanation: These measures find patterns interesting if the patterns are unexpected (contradicting a user’s belief) or offer strategic information on which the user can act. In the latter case, such patterns are referred to as actionable. Patterns that are expected can be interesting if they validate on a hypothesis.

Sanfoundry Global Education & Learning Series – Data Mining.

To practice all areas of Data Mining, here is complete set of Multiple Choice Questions and Answers.

If you find a mistake in question / option / answer, kindly take a screenshot and email to [email protected]

advertisement
advertisement
Subscribe to our Newsletters (Subject-wise). Participate in the Sanfoundry Certification contest to get free Certificate of Merit. Join our social networks below and stay updated with latest contests, videos, internships and jobs!

Youtube | Telegram | LinkedIn | Instagram | Facebook | Twitter | Pinterest
Manish Bhojasia - Founder & CTO at Sanfoundry
Manish Bhojasia, a technology veteran with 20+ years @ Cisco & Wipro, is Founder and CTO at Sanfoundry. He lives in Bangalore, and focuses on development of Linux Kernel, SAN Technologies, Advanced C, Data Structures & Alogrithms. Stay connected with him at LinkedIn.

Subscribe to his free Masterclasses at Youtube & discussions at Telegram SanfoundryClasses.