This is not an error, but something that confused me and may confuse others. In the week 2 assignment, section “3 - Linear Regression using Gradient Descent” presents equation 1, which makes sense given the earlier caveat that “Division by 2 is taken just for scaling purposes.” So far so good…
On first read, the partial derivatives in equation 2 made no sense to me - where were they coming from? And why does the second one not have a second x? I eventually figured out the derivations, but this took time, probably because a) I’m new to partial derivatives and b) the summation distracted me considerably. Explicitly stating the following might help others:
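- the (mX + b - Y)^2 part of pdEwrtM is d/dm[(mX + b - Y)(mX + b - Y)] = (mX + b - Y)(X) + (X)(mX + b - Y) = 2X(mX + b - Y)
- the (mX + b - Y)^2 part of pdEwrtB is d/db[(mX + b - Y)(mX + b - Y)] = (mX + b - Y)(1) + (1)(mX + b - Y) = 2(mX + b - Y)
- the extra X in the first result is there because d(mX + b - Y)/dm = X, whereas d(mX + b - Y)/db = 1; in both cases the leading 2 cancels against the division by 2 in equation 1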
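For anyone who wants to convince themselves numerically, here is a rough sketch that compares the two derivatives against a finite-difference estimate. It assumes equation 1 has the form E = (1/(2n)) * sum((mX + b - Y)^2); the function names `E`, `dEdm`, `dEdb` and the sample data are my own and not taken from the assignment.

```python
import numpy as np

def E(m, b, X, Y):
    # Assumed form of equation 1: sum of squared errors divided by 2n.
    return np.sum((m * X + b - Y) ** 2) / (2 * len(X))

def dEdm(m, b, X, Y):
    # Product rule gives 2X(mX + b - Y); the 2 cancels with the 1/2.
    return np.sum((m * X + b - Y) * X) / len(X)

def dEdb(m, b, X, Y):
    # Product rule gives 2(mX + b - Y); the 2 cancels with the 1/2.
    return np.sum(m * X + b - Y) / len(X)

# Made-up data, purely for illustration.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.0, 4.1, 5.9, 8.2])
m, b, h = 0.5, 0.1, 1e-5

# Central differences: nudge one parameter at a time and hold the other fixed.
num_dm = (E(m + h, b, X, Y) - E(m - h, b, X, Y)) / (2 * h)
num_db = (E(m, b + h, X, Y) - E(m, b - h, X, Y)) / (2 * h)

print(np.isclose(num_dm, dEdm(m, b, X, Y)))
print(np.isclose(num_db, dEdb(m, b, X, Y)))
```

If the analytic formulas are right, both comparisons should agree. Dropping the `* X` from `dEdm` makes the first check fail, which is a quick way to see why the two derivatives differ.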
hth,
Jeremy