Data Science Questions and Answers – caret – 3

This set of Data Science Multiple Choice Questions & Answers focuses on “Caret – 3”.

1. varImp is a wrapper around the evimp function in the _______ package.
a) numpy
b) earth
c) plot
d) none of the mentioned

Answer: b
Explanation: For MARS models, varImp wraps the evimp function from the earth package, which is an implementation of Jerome Friedman’s Multivariate Adaptive Regression Splines.

2. Point out the wrong statement.
a) The trapezoidal rule is used to compute the area under the ROC curve
b) For regression, the relationship between each predictor and the outcome is evaluated
c) An argument, para, is used to pick the model fitting technique
d) All of the mentioned

Answer: c
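Explanation: The argument that picks the model fitting technique is named nonpara, not para. When nonpara = FALSE, a linear model is fit and the absolute t-value for the slope of the predictor is used; otherwise a loess smoother is fit.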
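
As an illustration of the linear-model case for regression, the sketch below computes the absolute t-value of the slope when the outcome is regressed on a single predictor. It is a plain-Python approximation of the idea, not caret's code; the function name slope_t_value and the sample data are assumptions made for this example.

```python
import math


def slope_t_value(x, y):
    """Absolute t-statistic for the slope of a simple linear regression y ~ x.

    Hypothetical helper for illustration; requires at least 3 points and a
    fit that is not perfect (a zero residual variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and residual variance on n - 2 degrees of freedom.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    std_error = math.sqrt(sigma2 / sxx)
    return abs(slope / std_error)


# Made-up data: one predictor tracks the outcome closely, the other does not.
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]
strong_predictor = [1, 2, 3, 4, 5]
weak_predictor = [3, 1, 4, 1, 5]
print(slope_t_value(strong_predictor, outcome))
print(slope_t_value(weak_predictor, outcome))
```

A predictor with a larger absolute t-value would be ranked as more important under this scheme.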

3. Which of the following curve analyses is conducted on each predictor for classification?
a) NOC
b) ROC
c) COC
d) All of the mentioned

Answer: b
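Explanation: ROC curve analysis is conducted on each predictor. For two-class problems, a series of cutoffs is applied to the predictor data to predict the class, and the area under the resulting ROC curve is used as the measure of importance.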
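
The cutoff procedure and the trapezoidal rule mentioned in question 2 can be sketched as follows. This is a minimal plain-Python illustration, not caret's implementation; the function names are hypothetical, and the sketch assumes that larger predictor values point to the positive class (labelled 1).

```python
def roc_points(values, labels):
    """Return (false positive rate, true positive rate) pairs.

    Each distinct predictor value is used as a cutoff; an observation is
    predicted positive when its value is >= the cutoff. Labels are 1/0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Walk the cutoffs from strictest to loosest so the curve moves left to right.
    for cutoff in sorted(set(values), reverse=True):
        tp = sum(1 for v, l in zip(values, labels) if v >= cutoff and l == 1)
        fp = sum(1 for v, l in zip(values, labels) if v >= cutoff and l == 0)
        points.append((fp / negatives, tp / positives))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up predictor values and class labels.
values = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(trapezoid_auc(roc_points(values, labels)))
```

An area near 1 suggests the predictor separates the two classes well on its own, while an area near 0.5 suggests it carries little information.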

4. Which of the following functions tracks the changes in model statistics?
a) varImp
b) varImpTrack
c) findTrack
d) none of the mentioned

Answer: a
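Explanation: For MARS models, varImp tracks the changes in model statistics, such as the generalized cross-validation (GCV) estimate, for each predictor.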

5. Point out the correct statement.
a) The difference between the class centroids and the overall centroid is used to measure the variable influence
b) The Bagged Trees output contains variable usage statistics
c) Boosted Trees uses a different approach from a single tree
d) None of the mentioned

Answer: a
Explanation: The larger the difference between the class centroid and the overall center of the data, the larger the separation between the classes.
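
The centroid comparison can be sketched in a few lines. This is a simplified illustration only: the function name centroid_differences and the data are assumptions, and the standardization and shrinkage steps of nearest shrunken centroids are left out.

```python
def centroid_differences(rows, classes):
    """Per-class centroid minus the overall centroid, feature by feature.

    rows is a list of equal-length numeric feature lists; classes holds the
    class label of each row. Hypothetical helper for illustration.
    """
    n_features = len(rows[0])
    overall = [sum(r[j] for r in rows) / len(rows) for j in range(n_features)]
    differences = {}
    for cls in sorted(set(classes)):
        members = [r for r, c in zip(rows, classes) if c == cls]
        centroid = [sum(r[j] for r in members) / len(members)
                    for j in range(n_features)]
        differences[cls] = [centroid[j] - overall[j] for j in range(n_features)]
    return differences


# Made-up data: feature 0 separates the classes, feature 1 is constant.
rows = [[1.0, 5.0], [2.0, 5.0], [9.0, 5.0], [10.0, 5.0]]
classes = ["a", "a", "b", "b"]
print(centroid_differences(rows, classes))
```

In this toy data the first feature has large differences for both classes and the second has none, so only the first would be treated as influential.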

6. Which of the following models includes a backwards elimination feature selection routine?
a) MCV
b) MARS
c) MCRS
d) All of the mentioned

Answer: b
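Explanation: MARS stands for Multivariate Adaptive Regression Splines. Its backwards elimination routine looks at reductions in the generalized cross-validation (GCV) estimate of error.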

7. The advantage of using a model-based approach is that it is more closely tied to the model performance.
a) True
b) False

Answer: a
Explanation: A model-based approach may be able to incorporate the correlation structure between the predictors into the importance calculation.

8. Which of the following models sums the importance over each boosting iteration?
a) Boosted trees
b) Bagged trees
c) Partial least squares
d) None of the mentioned

Answer: a
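Explanation: Boosted trees use the same approach as a single tree but sum the importances over each boosting iteration. The gbm package can be used here.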
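
Summing importance over boosting iterations can be pictured with a short sketch. The per-iteration numbers below are invented for illustration, and total_importance is a hypothetical helper, not a function from gbm or caret.

```python
from collections import defaultdict


def total_importance(per_iteration):
    """Sum each predictor's importance across all boosting iterations.

    per_iteration is a list with one dict per tree, mapping predictor name
    to that tree's importance for the predictor.
    """
    totals = defaultdict(float)
    for tree_importance in per_iteration:
        for predictor, value in tree_importance.items():
            totals[predictor] += value
    return dict(totals)


# Invented importances for three boosting iterations.
iterations = [
    {"x1": 2.0, "x2": 1.0},
    {"x1": 0.5},
    {"x2": 3.0, "x3": 1.0},
]
print(total_importance(iterations))
```

A predictor that is never used in a given tree simply contributes nothing for that iteration.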

9. Which of the following arguments controls the scaling of importance values?
a) scale
b) set
c) value
d) all of the mentioned

Answer: a
Explanation: By default, all measures of importance are scaled to have a maximum value of 100, unless the scale argument of varImp.train is set to FALSE.
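
One simple way to obtain a maximum of 100 is to divide every importance by the largest one, as in the sketch below. This is an assumption made for illustration; caret's exact rescaling may differ, and scale_to_100 is a hypothetical name.

```python
def scale_to_100(importance):
    """Rescale importances so that the largest value becomes 100.

    Assumes non-negative importances; returns zeros if all values are zero.
    """
    largest = max(importance.values())
    if largest == 0:
        return {name: 0.0 for name in importance}
    return {name: 100.0 * value / largest
            for name, value in importance.items()}


# Invented raw importances.
raw = {"a": 2.0, "b": 4.0, "c": 1.0}
print(scale_to_100(raw))
```

Scaling changes only the units of the scores; the ranking of the predictors stays the same.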

10. For most classification models, each predictor will have a separate variable importance for each class.
a) True
b) False

Answer: a
Explanation: The exceptions are classification trees, bagged trees and boosted trees.

Sanfoundry Global Education & Learning Series – Data Science.
