This set of Data Mining Multiple Choice Questions & Answers (MCQs) focuses on “Basic Data Mining Tasks, KDD, Issues – Set 2”.
1. Which of the following is not a motivating factor for the development of data mining tools?
a) Data rich but information poor situation
b) Data tombs
c) Dependency on domain experts in expert systems
d) Data cleaning
Answer: d
Explanation: A huge amount of data combined with an inability to extract information from it, the "data rich but information poor" situation, created the need for data mining tools. Data that is stored in databases but rarely accessed forms data tombs. Expert systems built to assist data analysis depend on domain experts to supply knowledge, which makes them prone to bias and error. All of these situations motivated the development of data mining tools. Data cleaning, on the other hand, is a preprocessing step in the data mining process, not a motivation for it.
2. After cleaning and integrating data from heterogeneous sources, the data is stored in _____
a) Flat files
b) Database
c) Data Warehouse
d) Directories
Answer: c
Explanation: Data cleaned and integrated from heterogeneous data sources is stored in a data warehouse. The data warehouse architecture provides various analytical functionalities like aggregation for increased decision support.
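As a rough illustration of the clean, integrate, and aggregate flow described above, here is a minimal Python sketch. The two sources, their field names, and the helper functions are hypothetical and exist only for this example; a real warehouse would use a dedicated ETL pipeline and storage engine.

```python
from collections import defaultdict

# Two hypothetical sources that record the same facts with different conventions.
crm_rows = [
    {"region": "north ", "sales": "120.5"},
    {"region": "South", "sales": "80"},
    {"region": "South", "sales": None},  # missing value, dropped during cleaning
]
erp_rows = [
    {"REGION": "NORTH", "AMOUNT": 30.0},
    {"REGION": "south", "AMOUNT": 20.0},
]

def clean(region, amount):
    """Return a normalized (region, amount) pair, or None if the row is unusable."""
    if region is None or amount is None:
        return None
    return region.strip().lower(), float(amount)

def integrate(crm, erp):
    """Map both source schemas onto one unified schema and drop unusable rows."""
    unified = [clean(r["region"], r["sales"]) for r in crm]
    unified += [clean(r["REGION"], r["AMOUNT"]) for r in erp]
    return [row for row in unified if row is not None]

def total_by_region(rows):
    """Aggregate sales per region, a typical decision-support query."""
    totals = defaultdict(float)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

warehouse = integrate(crm_rows, erp_rows)
totals = total_by_region(warehouse)
print(totals)
```

The aggregation runs on the unified rows only, which is the point of storing cleaned and integrated data in one place.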
3. Which of the following characteristics refers to the ability of a data mining system to generate all possible interesting patterns?
a) Completeness
b) Optimization
c) Pruning
d) Hypothesis
Answer: a
Explanation: The completeness of a data mining algorithm refers to the ability of the data mining system to generate all possible interesting patterns. In practice this may require user-specified constraints, because generating all possible patterns is often impractical and inefficient.
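The trade-off between completeness and practicality can be sketched in Python. This example assumes that "interesting" means "meets a minimum support count", which is only one possible interestingness measure, and the `frequent_itemsets` function and its `max_size` constraint are illustrative names, not part of any library.

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def frequent_itemsets(transactions, min_support, max_size=None):
    """Enumerate every itemset whose support count is >= min_support.

    Exhaustive enumeration is complete but exponential in the number of
    items; max_size is a user constraint that bounds the search.
    """
    items = sorted(set().union(*transactions))
    limit = len(items) if max_size is None else min(max_size, len(items))
    result = {}
    for size in range(1, limit + 1):
        for candidate in combinations(items, size):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result

# Complete result: every itemset that appears in at least 2 transactions.
all_patterns = frequent_itemsets(transactions, min_support=2)
# Constrained result: the user restricts the search to single items.
constrained = frequent_itemsets(transactions, min_support=2, max_size=1)
print(all_patterns)
print(constrained)
```

With the constraint the search is cheaper, but the result is no longer complete with respect to the unconstrained definition of interesting patterns.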
4. Which of the following is a subset of a data warehouse focused on a specific functional area?
a) Flat files
b) Database
c) Data mart
d) Association rules
Answer: c
Explanation: A data mart is a subset of a data warehouse and is oriented to a specific functional area or subject. A data warehouse, on the other hand, spans many functional areas and may have a more complex design than a data mart.
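The subset relationship can be shown with a very small Python sketch. The rows, the `area` field, and the `build_data_mart` helper are all hypothetical; real data marts are usually separate schemas or databases rather than filtered lists.

```python
# A toy "warehouse" covering several functional areas.
warehouse = [
    {"area": "sales", "metric": "revenue", "value": 1200},
    {"area": "hr", "metric": "headcount", "value": 45},
    {"area": "sales", "metric": "orders", "value": 310},
    {"area": "finance", "metric": "expenses", "value": 900},
]

def build_data_mart(rows, area):
    """Return only the rows that belong to one functional area."""
    return [row for row in rows if row["area"] == area]

sales_mart = build_data_mart(warehouse, "sales")
print(sales_mart)
```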
5. The algorithms which partition the data into pieces, which are further processed in parallel, are known as _____
a) Parallel and distributed data-intensive mining algorithms
b) Incremental mining
c) Spooling
d) Clustering
Answer: a
Explanation: Algorithms that first partition the data into pieces, process these pieces in parallel, and then merge the patterns from each partition are called parallel and distributed data-intensive mining algorithms. These algorithms were developed to cope with the complexity that arises from the huge size of the data.
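The partition, process, and merge steps can be sketched as follows. This is a simplified illustration that counts item occurrences rather than mining full patterns, and it uses a thread pool for brevity; CPU-bound mining would more likely use processes or a distributed framework. The function names are assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition(data, n_parts):
    """Split data into n_parts nearly equal contiguous chunks."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks

def local_counts(chunk):
    """Mine one partition: count how often each item occurs in it."""
    return Counter(item for transaction in chunk for item in transaction)

def parallel_item_counts(transactions, n_parts=2):
    """Partition the data, process the partitions in parallel, merge the results."""
    chunks = partition(transactions, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(local_counts, chunks))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return merged

transactions = [["a", "b"], ["a", "c"], ["b", "c"], ["a"], ["c"]]
counts = parallel_item_counts(transactions, n_parts=2)
print(counts)
```

Merging is exact here because counts from disjoint partitions simply add up; for other pattern types the merge step generally needs more care.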
6. Which of the following techniques is used to avoid the mining of data from scratch due to new data updates?
a) Parallel and distributed data-intensive mining algorithms
b) Incremental mining
c) Dynamic mining
d) Ad hoc data mining
Answer: b
Explanation: When new data updates arrive, the mining process would otherwise have to be executed again from scratch. Incremental data mining avoids this by modifying the previously discovered knowledge incrementally, using only the new data.
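A minimal sketch of the incremental idea, assuming the "knowledge" being maintained is simply per-item support. The `IncrementalItemCounter` class is a hypothetical name for this example; real incremental mining algorithms maintain richer summaries.

```python
from collections import Counter

class IncrementalItemCounter:
    """Keeps item support up to date without rescanning old transactions."""

    def __init__(self):
        self.counts = Counter()
        self.n_transactions = 0

    def update(self, new_transactions):
        """Fold a batch of new transactions into the existing counts."""
        for transaction in new_transactions:
            self.counts.update(set(transaction))  # count each item once per transaction
            self.n_transactions += 1

    def frequent(self, min_support):
        """Return items whose relative support is at least min_support."""
        return {
            item
            for item, count in self.counts.items()
            if count / self.n_transactions >= min_support
        }

miner = IncrementalItemCounter()
miner.update([["a", "b"], ["a"]])
first = miner.frequent(0.6)

# New data arrives: only the new batch is scanned.
miner.update([["b"], ["b", "c"]])
second = miner.frequent(0.6)
print(first, second)
```

Note that the set of frequent items changes after the update even though the earlier transactions were never read again.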
7. Data mining involves various algorithms that try to fit a model to the data.
a) True
b) False
Answer: a
Explanation: Data mining algorithms attempt to fit a model to the data based on some criterion or preference that ranks one model above another, depending on the data and the task being performed.
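To make "fitting a model based on a criterion" concrete, the sketch below fits a straight line by ordinary least squares and compares it with a constant model using the sum of squared errors. The choice of linear regression and of this criterion is an assumption made for illustration; the data points are made up.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sse(predict, xs, ys):
    """Criterion: sum of squared errors of a prediction function on the data."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))

xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

slope, intercept = fit_line(xs, ys)
mean_y = sum(ys) / len(ys)

line_error = sse(lambda x: slope * x + intercept, xs, ys)
constant_error = sse(lambda x: mean_y, xs, ys)

# The criterion prefers the model with the smaller error on this data.
print(slope, intercept, line_error, constant_error)
```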
8. Which of the following statements about data mining query language (DMQL) is not true?
a) DMQL is based on SQL
b) DMQL does not allow access to concept hierarchies
c) DMQL query can specify a condition to be satisfied by the mined data
d) DMQL query must specify the knowledge type that has to be mined
Answer: b
Explanation: Data mining query language (DMQL) is a sophisticated query language based on SQL. In addition to the basic functionalities, it provides features such as access to concept hierarchies, specification of conditions on the mined data, and specification of the knowledge type to be mined.
9. Which of the following statements about knowledge and data discovery management system (KDDMS) is false?
a) It will include data mining tools and data management tools
b) It will include data mining tools but not data management tools
c) It will provide concurrency features
d) It will provide recovery features
Answer: b
Explanation: Knowledge and data discovery management systems (KDDMS) are the upcoming data mining systems that will include data mining tools, data management tools, concurrency features, recovery features, and will also ensure data consistency.
10. Data mining involves various algorithms that do not require search functionality for searching the data based on some properly defined criteria.
a) True
b) False
Answer: b
Explanation: Data mining involves various algorithms that require a search technique for searching the data based on some preference or criteria. The criteria, which help to choose the model to fit to the data, should be properly defined.
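The role of search can be illustrated with a tiny exhaustive search over candidate models. In this sketch the models are threshold rules on a single numeric attribute and the criterion is the number of misclassified points; both are assumptions chosen to keep the example short, and the data is made up.

```python
def misclassified(threshold, points):
    """Criterion: how many points the rule 'x >= threshold means class 1' gets wrong."""
    return sum(1 for x, label in points if (x >= threshold) != bool(label))

def best_threshold(points):
    """Search every observed x value as a candidate threshold and keep the best one."""
    candidates = sorted({x for x, _ in points})
    return min(candidates, key=lambda t: misclassified(t, points))

# (attribute value, class label) pairs
points = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (2.5, 0)]

threshold = best_threshold(points)
print(threshold, misclassified(threshold, points))
```

If the criterion were poorly defined, for example if it ignored the labels, the search would have no basis for preferring one threshold over another.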
Sanfoundry Global Education & Learning Series – Data Mining.