This set of Machine Learning Multiple Choice Questions & Answers (MCQs) focuses on “Gradient Descent for Multiple Variables”.

1. The cost function is minimized by __________

a) Linear regression

b) Polynomial regression

c) PAC learning

d) Gradient descent

Answer: d

Explanation: Gradient descent starts with random values of t_{0}, t_{1}, …, t_{n}. It then iteratively adjusts them, at a given learning rate, to reduce the cost function. Once it reaches a local minimum, it stops and outputs the final values of t_{0}, t_{1}, …, t_{n}.
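The procedure in the explanation can be sketched in a few lines of Python. This is an illustrative implementation for linear regression with the squared-error cost; the data, learning rate, and iteration count are assumptions, not part of the quiz.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent for linear regression.

    X is an (m, n) feature matrix; a column of ones is prepended
    so t[0] plays the role of t_0, the intercept."""
    m = X.shape[0]
    Xb = np.c_[np.ones(m), X]           # add intercept column
    t = np.zeros(Xb.shape[1])           # start from t_0 = t_1 = ... = 0
    for _ in range(iters):
        grad = Xb.T @ (Xb @ t - y) / m  # gradient of the squared-error cost
        t = t - alpha * grad            # step against the gradient
    return t

# Fit y = 1 + 2*x: the recovered parameters approach (1, 2).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
t = gradient_descent(X, y)
```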

2. What is the minimum number of parameters of the gradient descent algorithm?

a) 1

b) 2

c) 3

d) 4

Answer: c

Explanation: Since multivariate linear regression is being considered, there are at least two features, with corresponding parameters t_{1} and t_{2}. One more parameter, t_{0}, gives the y-intercept, so a minimum of three parameters is required.

3. What happens when the learning rate is low?

a) It always reaches the minima quickly

b) It reaches the minima very slowly

c) It overshoots the minima

d) Nothing happens

Answer: b

Explanation: If the learning rate is low, gradient descent reaches the minima very slowly. Each update changes the parameters by only a small amount, so a lot of iterations are required to reach the minima, which makes the process time-inefficient.
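The slowdown is easy to see on a toy problem. The sketch below (the function, starting point, and tolerance are illustrative assumptions) counts how many iterations gradient descent needs to minimize f(t) = t^2 at two different learning rates.

```python
# Minimize f(t) = t^2 (gradient 2t) starting from t = 10, and count
# the iterations needed to get within 1e-3 of the minimum at t = 0.
def iters_to_converge(alpha, t=10.0, tol=1e-3, cap=100000):
    for i in range(cap):
        if abs(t) < tol:
            return i
        t -= alpha * 2 * t   # gradient-descent step
    return cap

# A low learning rate needs far more iterations than a moderate one.
slow = iters_to_converge(0.001)
fast = iters_to_converge(0.1)
```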

4. When was gradient descent invented?

a) 1847

b) 1947

c) 1857

d) 1957

Answer: a

Explanation: Augustin-Louis Cauchy, a French mathematician, invented the concept of gradient descent in 1847. Since then, it has been modified a few times. The gradient descent algorithm has many different applications.

5. Gradient descent tries to _____________

a) maximize the cost function

b) minimize the cost function

c) minimize the learning rate

d) maximize the learning rate

Answer: b

Explanation: Gradient descent tries to minimize the cost function by updating the values of t_{0}, t_{1}, …, t_{n} after each iteration. The change in the values of t_{0}, t_{1}, …, t_{n} depends on the learning rate.

6. Feature scaling can be used to simplify gradient descent for multivariate linear regression.

a) True

b) False

Answer: a

Explanation: There are multiple features in multivariate linear regression and all of them have different ranges. This increases the complexity of gradient descent. So, feature scaling is used to make the ranges of each feature similar.

7. x_{1}’s range is 0 to 300. x_{2}’s range is 0 to 1000. What are the suitable ranges of x_{1} and x_{2} after mean normalization?

a) x_{1} = (x_{1} – 150)/300, x_{2} = (x_{2}-500)/1000

b) x_{1} = x_{2} – 700

c) x_{1} = x_{1} – 300, x_{2} = x_{2} – 1000

d) x_{1} = x_{1}/300, x_{2} = x_{2}/1000

Answer: a

Explanation: Mean normalization tries to make the range of each feature similar. It subtracts the mean from the value and divides it by the upper bound of the range. After updating x_{1} = (x_{1} – 150)/300 and x_{2} = (x_{2} – 500)/1000, x_{1}’s range is -0.5 to 0.5 and x_{2}’s range is -0.5 to 0.5.
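The transformation in option a can be written as a small helper. This is a sketch assuming features whose ranges are known in advance; the function name is illustrative.

```python
import numpy as np

def mean_normalize(x, lo, hi):
    """Mean-normalize a feature with range [lo, hi]: subtract the
    mid-range mean, then divide by the width of the range."""
    return (x - (lo + hi) / 2) / (hi - lo)

# x_1 in [0, 300] and x_2 in [0, 1000] both map into [-0.5, 0.5].
x1 = mean_normalize(np.array([0.0, 150.0, 300.0]), 0, 300)
x2 = mean_normalize(np.array([0.0, 500.0, 1000.0]), 0, 1000)
```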

8. x_{1}’s range is 0 to 300. x_{2}’s range is 0 to 1000. What are the suitable ranges of x_{1} and x_{2} after feature scaling?

a) x_{1} = x_{1} – 300, x_{2} = x_{2} – 1000

b) x_{1} = x_{2} – 700

c) x_{1} = x_{1}/1000, x_{2} = x_{2}/300

d) x_{1} = x_{1}/300, x_{2} = x_{2}/1000

Answer: d

Explanation: Feature scaling tries to make the range of each feature similar. After updating x_{1} = x_{1}/300 and x_{2} = x_{2}/1000, x_{1}’s range is 0 to 1 and x_{2}’s range is 0 to 1.
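The scaling in option d amounts to dividing by the upper bound of each feature's range. A minimal sketch (the helper name is an illustrative assumption):

```python
def max_scale(x, hi):
    """Divide each value by the range's upper bound, as in x_1/300
    and x_2/1000. A [0, hi] range becomes [0, 1]."""
    return [v / hi for v in x]

x1 = max_scale([0.0, 150.0, 300.0], 300)
x2 = max_scale([0.0, 500.0, 1000.0], 1000)
```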

9. On which factors does the updating of each parameter depend?

a) The number of training examples

b) Target variable

c) The learning rate and the target variable

d) The learning rate

Answer: c

Explanation: Updating each parameter depends on both the learning rate and the target variable. If the learning rate is high, the change will be larger, and vice versa. The update also depends on how close the value predicted by the hypothesis is to the value of the target variable.
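Both dependencies are visible in the standard linear-regression update rule for a single parameter t_j, sketched below (the function and sample numbers are illustrative assumptions, not from the quiz):

```python
def update_tj(t_j, alpha, preds, targets, xj):
    """One gradient-descent update of parameter t_j:
    t_j := t_j - alpha * (1/m) * sum((h(x_i) - y_i) * x_ij).
    The step scales with the learning rate alpha, and its size depends
    on how far each prediction is from the target value."""
    m = len(targets)
    grad = sum((p - y) * x for p, y, x in zip(preds, targets, xj)) / m
    return t_j - alpha * grad

# Prediction 1.0 vs target 0.0, feature value 2.0, learning rate 0.5:
new_t = update_tj(0.0, 0.5, [1.0], [0.0], [2.0])
```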

10. What is updated by gradient descent after each iteration?

a) The learning rate

b) Independent variables

c) Target variable

d) The number of training examples

Answer: b

Explanation: The gradient descent algorithm updates the values associated with all the features. It does so in order to minimize the cost function. The change in the values of the independent variables depends on the learning rate.

11. Who introduced the topic of gradient descent?

a) Vapnik

b) Augustin-Louis Cauchy

c) Chervonenkis

d) Alan Turing

Answer: b

Explanation: Cauchy invented gradient descent in 1847. Vapnik and Chervonenkis introduced the concept of VC dimension. Alan Turing is known as the father of computer science for his work in artificial intelligence, cryptanalysis, and other fields.

12. Mean normalization can be used to simplify gradient descent for multivariate linear regression.

a) True

b) False

Answer: a

Explanation: Mean normalization tries to reduce the complexity of gradient descent by scaling down the range of each feature. It subtracts the mean from the value of the independent variable and divides it by the upper limit of its range.

**Sanfoundry Global Education & Learning Series – Machine Learning**.

