This set of Data Science Multiple Choice Questions & Answers (MCQs) focuses on “Introduction to Regression Models”.

1. Which of the following functions can replace the question mark in the figure below?

a) boxplot

b) lplot

c) levelplot

d) All of the Mentioned

Explanation: levelplot is used for plotting an “image”.

2. Point out the correct statement:

a) The mean is a measure of central tendency of the data

b) Empirical mean is related to “centering” the random variables

c) The empirical standard deviation is a measure of spread

d) All of the Mentioned

Explanation: The process of centering then scaling the data is called “normalizing” the data.

3. Which of the following implies no relationship with respect to correlation?

a) Cor(X, Y) = 1

b) Cor(X, Y) = 0

c) Cor(X, Y) = 2

d) All of the Mentioned

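Explanation: Correlation measures whether and how strongly pairs of variables are linearly related. Cor(X, Y) = 0 implies no linear relationship, and since correlation always lies between -1 and 1, Cor(X, Y) = 2 is not a possible value.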
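As a minimal sketch of how Cor(X, Y) behaves, the snippet below computes the empirical correlation by hand. The helper name `correlation` and the sample values are illustrative assumptions, not part of the question set.

```python
import math

def correlation(x, y):
    """Empirical (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Cross-products and sums of squares of the centered data
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]                     # made-up data
perfect_line = [2 * v + 1 for v in x]     # exact linear relationship
symmetric = [v ** 2 for v in x]           # depends on x, but not linearly

r_line = correlation(x, perfect_line)     # correlation of 1
r_sym = correlation(x, symmetric)         # correlation of 0
```

The second case shows why a correlation of 0 is read as "no linear relationship": the outcome is fully determined by x, yet the correlation is still 0.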

4. Normalized data are centered at ___ and have units equal to standard deviations of the original data.

a) 0

b) 5

c) 1

d) 10

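Explanation: Normalization can have a range of meanings in statistics; here it means subtracting the empirical mean and dividing by the empirical standard deviation, which gives data with empirical mean 0 and empirical standard deviation 1.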
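The centering-then-scaling step can be sketched as follows. The function name `normalize` and the height values are assumptions made for illustration; the standard deviation uses the n - 1 denominator.

```python
import math

def normalize(values):
    """Center at the empirical mean, then scale by the empirical standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Empirical standard deviation with the n - 1 denominator
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]   # made-up data
z = normalize(heights_cm)

# The normalized values should have mean 0 and standard deviation 1
z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / (len(z) - 1))
```

Each entry of `z` is then read as "how many standard deviations this observation lies from the mean".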

5. Point out the wrong statement:

a) Regression through the origin yields an equivalent slope if you center the data first

b) Normalizing variables results in the slope being the correlation

c) Least squares is not an estimation tool

d) None of the Mentioned

Explanation: Least squares is an estimation tool.
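Statements (a) and (b) can be checked numerically. The sketch below uses made-up data and hypothetical helper names (`ols_slope`, `origin_slope`); it is one way to verify the claims, not a reference implementation.

```python
import math

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return math.sqrt(sum((a - m) ** 2 for a in v) / (len(v) - 1))

def ols_slope(x, y):
    """Least-squares slope of y on x when an intercept is included."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )

def origin_slope(x, y):
    """Least-squares slope of y on x when the line is forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]            # made-up data

xc = [a - mean(x) for a in x]             # centered predictor
yc = [b - mean(y) for b in y]             # centered outcome
xn = [a / sd(x) for a in xc]              # normalized predictor
yn = [b / sd(y) for b in yc]              # normalized outcome

slope = ols_slope(x, y)
slope_centered_origin = origin_slope(xc, yc)   # statement (a)
slope_normalized = ols_slope(xn, yn)           # statement (b)
cor_xy = sum(a * b for a, b in zip(xn, yn)) / (len(x) - 1)
```

Up to floating-point error, `slope` equals `slope_centered_origin`, and `slope_normalized` equals `cor_xy`.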

6. Which of the following is correct with respect to residuals?

a) Positive residuals are above the line, negative residuals are below

b) Positive residuals are below the line, negative residuals are above

c) Positive residuals and negative residuals are below the line

d) All of the Mentioned

Explanation: Residuals can be thought of as the outcome with the linear association of the predictor removed.
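To make the sign convention concrete, the sketch below fits a least-squares line to a small made-up data set and labels each point by the sign of its residual (observed minus fitted). The helper `fit_line` is a hypothetical name.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]             # made-up data

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * a for a in x]
residuals = [b - f for b, f in zip(y, fitted)]   # observed minus fitted

# A positive residual means the observed point lies above the fitted line
positions = ["above" if r > 0 else "below" if r < 0 else "on" for r in residuals]
```

With an intercept in the model, the residuals also sum to zero up to floating-point error.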

7. Minimizing the likelihood is the same as maximizing -2 log likelihood.

a) True

b) False

Explanation: Maximizing the likelihood is the same as minimizing -2 log likelihood.
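A small numerical check of the relationship between the likelihood and -2 log likelihood, assuming (for illustration only) normally distributed data with known variance 1 and a grid of candidate means. The data values and function names are made up.

```python
import math

data = [4.8, 5.1, 5.3, 4.9, 5.4]                  # made-up observations
candidates = [4.0 + 0.1 * k for k in range(21)]   # candidate means from 4.0 to 6.0

def likelihood(mu):
    """Likelihood of the data under a normal model with mean mu and variance 1."""
    return math.prod(
        math.exp(-0.5 * (v - mu) ** 2) / math.sqrt(2 * math.pi) for v in data
    )

def minus_two_log_likelihood(mu):
    return -2 * math.log(likelihood(mu))

# Because -2 * log is a strictly decreasing transformation, both criteria
# select the same candidate
best_by_likelihood = max(candidates, key=likelihood)
best_by_m2ll = min(candidates, key=minus_two_log_likelihood)
```

Both searches pick the grid point closest to the sample mean of the data.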

8. Which of the following refers to the circumstance in which the variability of a variable is unequal across the range of values of a second variable that predicts it?

a) Heterogeneity

b) Heteroskedasticity

c) Heteroelasticity

d) None of the Mentioned

Explanation: Heteroskedasticity has serious consequences for the OLS estimator.
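One informal way to see heteroskedasticity is to compare the spread of the residuals for low and high values of the predictor. The sketch below uses a deterministic synthetic data set, constructed here as an assumption, in which the noise amplitude grows with x; it is a rough diagnostic, not a formal test.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope

def mean_square(values):
    return sum(v * v for v in values) / len(values)

x = [float(i) for i in range(1, 21)]
# Synthetic outcome: the noise has size 0.3 * x and alternates in sign,
# so the variability of y increases with x
signs = [1 if i % 2 == 0 else -1 for i in range(len(x))]
y = [2.0 * a + 0.3 * a * s for a, s in zip(x, signs)]

intercept, slope = fit_line(x, y)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

half = len(x) // 2
spread_low = mean_square(residuals[:half])    # residual spread for small x
spread_high = mean_square(residuals[half:])   # residual spread for large x
```

Here `spread_high` is clearly larger than `spread_low`, which is the unequal-variability pattern the question describes.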

9. Which of the following outcomes is the odd one out in the figure below?

a) R Squared

b) Kappa

c) RMSE

d) All of the Mentioned

Explanation: Kappa is a measure for categorical outcomes, whereas R Squared and RMSE are measures for continuous outcomes.
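The difference between the three measures can be sketched with hand-written versions. These are illustrative implementations on made-up data, using one common definition of R squared (1 minus the ratio of residual to total sum of squares) and Cohen's kappa; they are not the code of any particular library.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error for a continuous outcome."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def r_squared(actual, predicted):
    """1 - SS_res / SS_tot for a continuous outcome."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def cohen_kappa(actual, predicted):
    """Agreement beyond chance for a categorical outcome."""
    n = len(actual)
    labels = set(actual) | set(predicted)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    expected = sum(
        (actual.count(label) / n) * (predicted.count(label) / n) for label in labels
    )
    return (observed - expected) / (1 - expected)

y_true = [3.0, 5.0, 7.0, 9.0]             # made-up continuous outcome
y_pred = [2.0, 6.0, 6.0, 10.0]
cls_true = ["yes", "yes", "no", "no"]     # made-up categorical outcome
cls_pred = ["yes", "no", "no", "no"]

continuous_rmse = rmse(y_true, y_pred)
continuous_r2 = r_squared(y_true, y_pred)
categorical_kappa = cohen_kappa(cls_true, cls_pred)
```

RMSE and R squared need numeric differences between observed and predicted values, while kappa only needs to know whether the labels agree.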

10. Residuals are useful for investigating best model fit.

a) True

b) False

Explanation: Residuals are useful for investigating poor model fit.
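As a minimal illustration of residuals exposing poor fit, the sketch below fits a straight line to made-up data that actually follow a curve. The helper `fit_line` is a hypothetical name.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return mean_y - slope * mean_x, slope

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [a ** 2 for a in x]                   # made-up data following a parabola

intercept, slope = fit_line(x, y)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
# residuals -> [2.0, -1.0, -2.0, -1.0, 2.0]
```

The fitted line is flat, and the residuals trace a clear U shape instead of scattering around zero, which signals that a straight line is the wrong model for these data.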

