Lesson 3
Linear Regression
Episode 6 - Gradient Descent in More Detail
Summary:
- Gradient descent algorithm (for two parameters), sketched in code below:
  - Choose random w_1 and w_0.
  - Repeat until convergence:
    - w_1 \leftarrow w_1 - \alpha \frac{\partial L}{\partial w_1}
    - w_0 \leftarrow w_0 - \alpha \frac{\partial L}{\partial w_0}
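A minimal Python sketch of this update loop, assuming the two partial derivatives are supplied as plain functions (the names `dL_dw1` and `dL_dw0`, the tolerance, and the iteration cap are illustrative, not from the lesson), with convergence declared once the updates become negligibly small:

```python
import random

def gradient_descent(dL_dw1, dL_dw0, alpha=0.01, tol=1e-6, max_iters=10_000):
    # Choose random w_1 and w_0.
    w1, w0 = random.random(), random.random()
    for _ in range(max_iters):
        # Evaluate both partial derivatives at the current parameters.
        g1, g0 = dL_dw1(w1, w0), dL_dw0(w1, w0)
        # Step each parameter against its gradient, scaled by the learning rate alpha.
        w1 -= alpha * g1
        w0 -= alpha * g0
        # Convergence test: stop once both updates are negligibly small.
        if abs(alpha * g1) < tol and abs(alpha * g0) < tol:
            break
    return w1, w0
```

For linear regression with the sum-of-squares loss, the two derivative functions are exactly the sums given further down in this summary.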
- Gradient/derivative: how much a function's value changes when one of its variables changes by a minuscule amount (tending towards zero).
- For linear regression (assuming sum of squares error); a worked sketch follows this list:
  - Loss function: L(\theta) = \frac{1}{2} \sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]})^2 = \frac{1}{2} \sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]})^2.
  - \frac{\partial L}{\partial w_1} = \sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]}) x^{[i]} = \sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]}) x^{[i]}.
  - \frac{\partial L}{\partial w_0} = \sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]}) = \sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]}).
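Putting the update rule and these gradients together, a small worked sketch in Python; the synthetic data, learning rate, and iteration count are illustrative assumptions rather than values from the lesson:

```python
import numpy as np

# Illustrative data: y is roughly 3x + 2 plus a little Gaussian noise (assumed for the example).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.5, size=50)

w1, w0 = 0.0, 0.0   # initial parameters (could also be random)
alpha = 2e-4        # learning rate, chosen small enough to converge for this data scale

for _ in range(20_000):
    y_hat = w1 * x + w0                 # predictions \hat{y}^{[i]}
    residuals = y_hat - y               # \hat{y}^{[i]} - y^{[i]}
    dL_dw1 = np.sum(residuals * x)      # \partial L / \partial w_1
    dL_dw0 = np.sum(residuals)          # \partial L / \partial w_0
    w1 -= alpha * dL_dw1
    w0 -= alpha * dL_dw0

print(f"w1 = {w1:.3f}, w0 = {w0:.3f}")  # should land near the true slope (3) and intercept (2)
```

Because the loss sums over all N examples, every step uses the full dataset (batch gradient descent); the learning rate has to shrink as N or the scale of x grows, otherwise the updates overshoot.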