This set of Data Mining Multiple Choice Questions & Answers (MCQs) focuses on “Data Transformation and Data Discretization”.
1. If class information is used during the discretization process, it is called _____
a) Supervised discretization
b) Unsupervised discretization
c) Clustered discretization
d) Disorganized discretization
Answer: a
Explanation: Data discretization divides the range of a continuous attribute into intervals. When the discretization process uses class information, it is called supervised discretization; otherwise, it is unsupervised.
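The difference can be illustrated with a small sketch. The helper names (`equal_width_cuts`, `best_entropy_cut`) and the sample data are made up for illustration; equal-width binning and an entropy-minimizing cut are used here only as one common example of each family, not as the only possible methods.

```python
from collections import Counter
from math import log2


def equal_width_cuts(values, k):
    """Unsupervised: cut points depend only on the attribute values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_entropy_cut(values, labels):
    """Supervised: choose the cut that minimizes the weighted class entropy."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut


# Hypothetical sample data: an attribute and a class label per record.
ages = [20, 22, 24, 26, 28, 30, 60]
buys = ["no", "no", "no", "yes", "yes", "yes", "yes"]

print(equal_width_cuts(ages, 2))     # ignores the labels
print(best_entropy_cut(ages, buys))  # uses the labels
```

On this data the unsupervised cut falls in the middle of the value range, while the supervised cut falls where the class label changes.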
2. The process of using a few cut points to split the entire attribute range recursively is also referred to as _____
a) Splitting
b) Merging
c) Bottom-up discretization
d) Approximate discretization
Answer: a
Explanation: In top-down discretization, also known as splitting, one or a few points are first chosen to split the entire attribute range. These points are called split points or cut points. The process is then applied recursively to the resulting intervals.
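A minimal sketch of top-down splitting is given below. It assumes an entropy-based criterion for choosing each cut point and a hypothetical `max_depth` stopping rule; the function name `split_points` and the sample data are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_points(pairs, max_depth=2):
    """Recursively split (value, label) pairs sorted by value.

    Returns the sorted list of cut points. Splitting stops when an
    interval is pure (one class) or max_depth is exhausted.
    """
    labels = [lab for _, lab in pairs]
    if max_depth == 0 or len(set(labels)) <= 1:
        return []
    best = None  # (weighted entropy, index of first item on the right side)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between identical values
        left, right = labels[:i], labels[i:]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return []
    i = best[1]
    cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    # Apply the same procedure to the two resulting intervals.
    return (split_points(pairs[:i], max_depth - 1)
            + [cut]
            + split_points(pairs[i:], max_depth - 1))


# Hypothetical data: class "b" occupies the middle of the value range.
data = list(zip(range(1, 10), "aaabbbaaa"))
print(split_points(data))
```

The first call picks one cut over the whole range, and each recursive call repeats the search inside one of the resulting intervals.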
3. Which of the following is true about bottom-up discretization?
a) All the values are treated as potential split points
b) Some of the values are treated as potential split points
c) Only one value is treated as a potential split point
d) Split points are not considered
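Answer: a
Explanation: In bottom-up discretization, also known as merging, all the continuous values are initially treated as potential split points. Some of them are then removed by merging neighboring values into intervals, and the process is applied recursively to the resulting intervals.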
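The merging idea can be sketched as follows. The merge rule used here (merge adjacent intervals that share the same majority class) is a deliberate simplification chosen for illustration and is not a specific published algorithm; the function name `merge_intervals` and the sample data are hypothetical.

```python
from collections import Counter


def merge_intervals(values, labels):
    """Bottom-up sketch: start with one interval per distinct value, then
    repeatedly merge adjacent intervals with the same majority class."""
    # Each interval is [low, high, Counter of class labels].
    intervals = []
    for v, lab in sorted(zip(values, labels)):
        if intervals and intervals[-1][1] == v:
            intervals[-1][2][lab] += 1
        else:
            intervals.append([v, v, Counter([lab])])

    merged = True
    while merged:
        merged = False
        for i in range(len(intervals) - 1):
            a, b = intervals[i], intervals[i + 1]
            if a[2].most_common(1)[0][0] == b[2].most_common(1)[0][0]:
                # Replace the two neighbors by their union and rescan.
                intervals[i:i + 2] = [[a[0], b[1], a[2] + b[2]]]
                merged = True
                break
    return [(lo, hi) for lo, hi, _ in intervals]


# Hypothetical data: class "b" occupies the middle of the value range.
result = merge_intervals(list(range(1, 10)), list("aaabbbaaa"))
print(result)
```

Every distinct value starts as its own interval, so every boundary between neighboring values is initially a potential split point; merging removes the boundaries that do not separate classes.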
4. To avoid the dependence of an attribute on the choice of measurement units, the data is _____
a) Subtracted
b) Normalized
c) Graphically plotted
d) Sampled
Answer: b
Explanation: A dataset may use different measurement units for different attributes, which may lead to some attributes carrying more weight than others. To avoid the dependence of attributes on the choice of measurement units, data normalization is performed.
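As a small illustration of unit independence, the sketch below normalizes the same heights expressed in centimetres and in metres. Z-score normalization is used here as one common choice of normalization (an assumption, since the question does not name a method), and the sample heights are made up.

```python
from statistics import mean, pstdev


def z_score(values):
    """Z-score normalization: subtract the mean, divide by the
    (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical heights, first in centimetres, then converted to metres.
heights_cm = [157, 165, 178, 190]
heights_m = [h / 100 for h in heights_cm]

z_cm = z_score(heights_cm)
z_m = z_score(heights_m)

# The raw numbers differ by a factor of 100, but the normalized
# values agree up to floating-point rounding.
same = all(abs(a - b) < 1e-9 for a, b in zip(z_cm, z_m))
print(same)
```

After normalization the choice of unit no longer affects the values, so it cannot change how much weight the attribute carries in a distance computation.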
5. Given that the maximum and minimum heights of the students in a class are 190 cm and 157 cm, a student with a height of 178 cm, when normalized to the range [0.0, 1.0] using min-max normalization, will have a normalized height of _____
a) Radius of the cluster
b) Centroid distance
c) Median distance
d) Mean square distance
Explanation: Using min-max normalization, a value v is mapped to v1 using the formula:
v1 = \(\frac{v - old_{min}}{old_{max} - old_{min}} (new_{max} - new_{min}) + new_{min}\)
Given \(old_{min} = 157\), \(old_{max} = 190\), \(new_{min} = 0\), \(new_{max} = 1\), v = 178:
v1 = \(\frac{178 - 157}{190 - 157} (1 - 0) + 0 = \frac{21}{33} \approx 0.636\)
Hence, using min-max normalization, v1 is approximately 0.64 (0.63 if truncated to two decimal places).
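The calculation above can be checked with a short sketch. The function name `min_max` and its parameter names are illustrative and simply mirror the symbols in the formula.

```python
def min_max(v, old_min, old_max, new_min=0.0, new_max=1.0):
    """Map v from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        raise ValueError("old_max and old_min must differ")
    return (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min


h = min_max(178, 157, 190)
print(round(h, 4))  # 0.6364
```

The same function maps the minimum height (157 cm) to 0.0 and the maximum height (190 cm) to 1.0.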
Sanfoundry Global Education & Learning Series – Data Mining.