Gradient Descent for Linear Regression formula

I need some detailed information on how this becomes this. In particular, how does the 2 get cancelled according to the rules of calculus?

It is just a basic mathematical step: the 2 in the numerator cancels with the 2 in the denominator.

This might be useful.

Besides the 2 getting cancelled, the derivation uses two rules from calculus:

Power Rule (as mentioned by TMosh): Differentiating a function of the form f(x) = x^n, where n is a constant, results in f’(x) = nx^(n-1). For example, if f(x) = x^3, then f’(x) = 3x^2.

Chain Rule: When differentiating a composition of functions, such as f(g(x)), the chain rule is used. It states that the derivative of f(g(x)) is f’(g(x)) * g’(x). In this case the differentiation is with respect to w, and the inner function contains the term wx^(i), whose derivative with respect to w is x^(i). That is why the result ends up multiplied by x^(i).
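
Putting the two rules together, here is a sketch of the full step. It assumes the cost is the usual squared-error cost with a 1/(2m) factor and that the model is f(x) = wx + b; that notation is my assumption, since the formulas themselves are not written out in this thread.

```latex
% Assumed cost function: squared error with a 1/(2m) factor
J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( w x^{(i)} + b - y^{(i)} \right)^2

% Power rule on the outer square brings down a 2;
% chain rule multiplies by the derivative of the inner function w.r.t. w
\frac{\partial J}{\partial w}
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( w x^{(i)} + b - y^{(i)} \right)
    \cdot \frac{\partial}{\partial w} \left( w x^{(i)} + b - y^{(i)} \right)

% The inner derivative is x^{(i)}, because b and y^{(i)} do not depend on w
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( w x^{(i)} + b - y^{(i)} \right) x^{(i)}

% The 2 from the power rule cancels with the 2 in 1/(2m)
  = \frac{1}{m} \sum_{i=1}^{m} \left( w x^{(i)} + b - y^{(i)} \right) x^{(i)}
```

The derivative with respect to b works the same way, except the inner derivative is 1 instead of x^(i), so there is no extra x^(i) factor at the end.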
