Data Mining Questions and Answers – Data Transformation and Data Discretization

This set of Data Mining Multiple Choice Questions & Answers (MCQs) focuses on “Data Transformation and Data Discretization”.

1. If the class information is used during the discretization process, it is called _____
a) Supervised discretization
b) Unsupervised discretization
c) Clustered discretization
d) Disorganized discretization

Answer: a
Explanation: Data discretization divides the range of a continuous attribute into intervals. When the discretization process uses class information, it is known as supervised discretization.
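The difference can be illustrated with a minimal sketch. It assumes a single cut point, an equal-width rule for the unsupervised case, and a class-entropy criterion for the supervised case; the function names and the `ages`/`buys` data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def equal_width_cut(values):
    """Unsupervised: cut at the midpoint of the range; class labels are ignored."""
    return (min(values) + max(values)) / 2

def entropy_cut(values, labels):
    """Supervised: pick the boundary that minimizes the weighted class entropy."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary exists between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut

# Hypothetical data: customer ages and whether each customer bought a product.
ages = [21, 23, 25, 40, 42, 60]
buys = ["no", "no", "no", "yes", "yes", "yes"]
print(equal_width_cut(ages))    # 40.5
print(entropy_cut(ages, buys))  # 32.5
```

Here the supervised cut at 32.5 separates the two classes cleanly, while the equal-width cut at 40.5 places one "yes" customer in the lower interval.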

2. The process of using a few cut points to split the entire attribute range recursively is also referred to as _____
a) Splitting
b) Merging
c) Bottom up discretization
d) Approximate discretization

Answer: a
Explanation: In top down discretization, which is also known as splitting, a few points are chosen to split the entire attribute range. These points are known as split points or cut points. This process is recursively applied on the resulting intervals.
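A minimal sketch of the recursive splitting idea follows. The rule used to choose each cut point (the midpoint of the widest gap between neighboring values) and the `max_depth` stopping condition are assumptions made for illustration, not a criterion stated in the explanation.

```python
def split_top_down(values, max_depth=2):
    """Recursively split values at the midpoint of their widest gap.

    Returns a sorted list of cut points. The widest-gap rule and the
    max_depth stopping condition are illustrative assumptions.
    """
    values = sorted(set(values))
    if max_depth == 0 or len(values) < 2:
        return []
    # Index i marks the gap between values[i] and values[i + 1].
    i = max(range(len(values) - 1), key=lambda j: values[j + 1] - values[j])
    cut = (values[i] + values[i + 1]) / 2
    # Apply the same procedure to each resulting interval.
    left = split_top_down(values[: i + 1], max_depth - 1)
    right = split_top_down(values[i + 1 :], max_depth - 1)
    return left + [cut] + right

# Hypothetical heights in cm.
heights = [157, 158, 160, 170, 171, 172, 188, 190]
print(split_top_down(heights, max_depth=1))  # [180.0]
print(split_top_down(heights, max_depth=2))  # [165.0, 180.0, 189.0]
```

The first call produces a single cut over the whole range; the second applies the same rule once more inside each of the two resulting intervals.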

3. Which of the following is true about bottom up discretization?
a) All the values are treated as potential split points
b) Some of the values are treated as potential split points
c) Only one value is treated as a potential split point
d) Split points are not considered

Answer: a
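Explanation: In bottom up discretization, also known as merging, all the continuous values are initially treated as potential split points. Some of them are then removed by merging neighboring values into intervals, and the process is applied recursively to the resulting intervals.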
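The merging process can be sketched as follows. The merge criterion (always merge the two adjacent intervals separated by the smallest gap) and the `n_intervals` stopping condition are simplifying assumptions for illustration; practical bottom up methods typically use a statistical test on class labels instead.

```python
def merge_bottom_up(values, n_intervals):
    """Start with one interval per distinct value, then repeatedly merge
    the two adjacent intervals whose boundary gap is smallest.

    Returns (low, high) bounds for each remaining interval.
    """
    intervals = [[v] for v in sorted(set(values))]
    while len(intervals) > max(n_intervals, 1):
        # Gap between the end of interval k and the start of interval k + 1.
        j = min(range(len(intervals) - 1),
                key=lambda k: intervals[k + 1][0] - intervals[k][-1])
        # Merging removes the split point between intervals j and j + 1.
        intervals[j:j + 2] = [intervals[j] + intervals[j + 1]]
    return [(group[0], group[-1]) for group in intervals]

# Hypothetical heights in cm.
heights = [157, 158, 160, 170, 171, 172, 188, 190]
print(merge_bottom_up(heights, 3))  # [(157, 160), (170, 172), (188, 190)]
```

Every distinct value begins as its own interval, so all eight values are candidate split points before any merge takes place.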

4. To avoid the dependence of an attribute on the choice of measurement units, the data is _____
a) Subtracted
b) Normalized
c) Graphically plotted
d) Sampled

Answer: b
Explanation: The dataset may have different measurement units for different attributes. This may lead to some attributes holding more weight than others. To avoid the dependence of attributes on the choice of measurement units, data normalization is performed.
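The effect of measurement units can be seen in a small sketch. It assumes Euclidean distance as the measure in which attributes are weighted and min max scaling as the normalization method; the student data and helper names are hypothetical.

```python
def min_max(column):
    """Rescale a list of numbers to [0, 1]; assumes max(column) > min(column)."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

def distance(a, b):
    """Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical data: the same three heights in two units, plus weights.
heights_cm = [157.0, 178.0, 190.0]
heights_m = [h / 100 for h in heights_cm]
weights_kg = [50.0, 68.0, 90.0]

# Raw distance between students 0 and 1 changes with the height unit.
raw_cm = distance((heights_cm[0], weights_kg[0]), (heights_cm[1], weights_kg[1]))
raw_m = distance((heights_m[0], weights_kg[0]), (heights_m[1], weights_kg[1]))

# After min max normalization the unit no longer matters.
w = min_max(weights_kg)
h_cm, h_m = min_max(heights_cm), min_max(heights_m)
norm_cm = distance((h_cm[0], w[0]), (h_cm[1], w[1]))
norm_m = distance((h_m[0], w[0]), (h_m[1], w[1]))

print(round(raw_cm, 2), round(raw_m, 2))    # 27.66 18.0
print(round(norm_cm, 3), round(norm_m, 3))  # 0.779 0.779
```

With heights in centimeters, height dominates the raw distance; in meters, it contributes almost nothing. The normalized distances agree regardless of the unit.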

5. Given the maximum and minimum heights of the students in a class as 190 cm and 157 cm, a student with a height of 178 cm, when normalized to the range [0.0, 1.0] using min max normalization, will have a normalized height of _____
a) Radius of the cluster
b) Centroid distance
c) Median distance
d) 0.63

Answer: d
Explanation: Using min max normalization, a value v is mapped to v1 using the formula:
v1 = \(\frac{v - old_{min}}{old_{max} - old_{min}} (new_{max} - new_{min}) + new_{min}\)
Given \(old_{min}\) = 157, \(old_{max}\) = 190, \(new_{min}\) = 0, \(new_{max}\) = 1, v = 178:
v1 = [((178 - 157)/(190 - 157)) * (1 - 0)] + 0
v1 = 21/33 ≈ 0.636, or 0.63 when truncated to two decimal places
Hence, using min max normalization, v1 = 0.63
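The calculation above can be checked with a short sketch. The function name `min_max_normalize` and its parameter names are hypothetical; they simply mirror the symbols in the formula.

```python
def min_max_normalize(v, old_min, old_max, new_min=0.0, new_max=1.0):
    """Map v from [old_min, old_max] to [new_min, new_max] linearly."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    return (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

# Values from the question: heights between 157 cm and 190 cm.
print(round(min_max_normalize(178, 157, 190), 4))  # 0.6364
print(min_max_normalize(157, 157, 190))            # 0.0
print(min_max_normalize(190, 157, 190))            # 1.0
```

The minimum and maximum heights map to the ends of the new range, and 178 cm falls about 64% of the way between them.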

Sanfoundry Global Education & Learning Series – Data Mining.
