Chapter: Lesson 3

Episode 6 - Gradient Descent in More Detail

Josiah Wang

Summary:

  • Gradient descent algorithm (for two parameters; a code sketch follows below):

    1. Choose random $w_1$ and $w_0$.
    2. Repeat until convergence:
      • $w_1 \leftarrow w_1 - \alpha \frac{\partial L}{\partial w_1}$
      • $w_0 \leftarrow w_0 - \alpha \frac{\partial L}{\partial w_0}$
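
As a concrete illustration, here is a minimal Python sketch of this update loop. The loss itself is left abstract: `dL_dw1` and `dL_dw0` are hypothetical callables standing in for whatever partial derivatives the chosen loss defines.

```python
import random

def gradient_descent(dL_dw1, dL_dw0, alpha=0.01, n_steps=1000):
    """Generic two-parameter gradient descent.

    dL_dw1, dL_dw0: functions of (w1, w0) returning the partial
    derivatives of the loss at the current parameter values.
    alpha: learning rate; n_steps stands in for "until convergence".
    """
    # 1. Choose random w1 and w0.
    w1, w0 = random.random(), random.random()
    # 2. Repeat until convergence (here: a fixed number of steps).
    for _ in range(n_steps):
        # Evaluate both gradients before updating, so the two
        # parameters are updated simultaneously.
        g1, g0 = dL_dw1(w1, w0), dL_dw0(w1, w0)
        w1 = w1 - alpha * g1  # w1 <- w1 - alpha * dL/dw1
        w0 = w0 - alpha * g0  # w0 <- w0 - alpha * dL/dw0
    return w1, w0
```
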
  • Gradient/derivative: how much the value of a function changes when one of its variables changes by a minuscule amount (tending towards zero); the numerical sketch below makes this concrete.

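To see the "tending towards zero" idea numerically, the difference quotient below gets closer to the true derivative as $h$ shrinks. The function $f(x) = x^2$ and the step sizes are illustrative assumptions:

```python
def f(x):
    return x ** 2  # example function; true derivative is f'(x) = 2x

x = 3.0
for h in (1e-1, 1e-3, 1e-6):
    approx = (f(x + h) - f(x)) / h  # difference quotient
    print(f"h={h:g}: approximate derivative = {approx:.6f}")
# The printed values approach f'(3) = 6 as h tends towards zero.
```
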
  • For linear regression (assuming sum-of-squares error; see the worked sketch below):

    • Loss function: $L(\theta) = \frac{1}{2}\sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]})^2 = \frac{1}{2}\sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]})^2$.
    • $\frac{\partial L}{\partial w_1} = \sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]})\,x^{[i]} = \sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]})\,x^{[i]}$.
    • $\frac{\partial L}{\partial w_0} = \sum_{i=1}^{N} (w_1 x^{[i]} + w_0 - y^{[i]}) = \sum_{i=1}^{N} (\hat{y}^{[i]} - y^{[i]})$.
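
Putting the algorithm and these gradients together, here is a minimal end-to-end sketch. The toy data, learning rate, and step count are made-up values for illustration:

```python
import random

# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

alpha = 0.01  # learning rate
w1, w0 = random.random(), random.random()  # random initialisation

for _ in range(5000):  # "repeat until convergence"
    preds = [w1 * x + w0 for x in xs]  # y_hat[i] = w1 * x[i] + w0
    # dL/dw1 = sum_i (y_hat[i] - y[i]) * x[i]
    dL_dw1 = sum((p - y) * x for p, y, x in zip(preds, ys, xs))
    # dL/dw0 = sum_i (y_hat[i] - y[i])
    dL_dw0 = sum(p - y for p, y in zip(preds, ys))
    w1 -= alpha * dL_dw1
    w0 -= alpha * dL_dw0

print(w1, w0)  # should approach the true values 2 and 1
```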