“Gradient descent will automatically take smaller steps”: does this apply only to linear regression, or also to other, more complicated regressions?
After some research, I found out that gradient descent can be used for optimisation problems other than linear regression. You will also see in the later videos that it is used for classification as well as for neural networks. The step size still matters, though: if the learning rate is too large, the update can overshoot the minimum and even diverge, and if it is too small, the algorithm will converge very slowly.
Additionally, you may observe that in gradient descent, parameter updates occur as follows:
w_new = w_old − lr ⋅ gradient
Here, the gradient is simply the derivative of the cost function. As we progress towards the minimum, the derivative shrinks towards zero, so the steps get smaller and smaller even though the learning rate stays fixed, and that is exactly what lets us converge to that point.
This behaviour is not exclusive to linear regression: the same update rule, with the same automatically shrinking steps, applies when gradient descent is used to train neural networks and many other machine learning models.
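If it helps, here is a minimal sketch (not from the course materials) that shows the effect numerically. The cost function J(w) = (w − 3)^2 and the starting point are made-up choices just for illustration; notice that the printed step size shrinks on every iteration even though lr never changes.

```python
# Illustrative only: J(w) = (w - 3)^2, so dJ/dw = 2 * (w - 3).
def gradient(w):
    return 2 * (w - 3)

w = 10.0   # arbitrary starting point
lr = 0.1   # fixed learning rate

for i in range(10):
    grad = gradient(w)
    step = lr * grad          # step size = learning rate * gradient
    w = w - step              # w_new = w_old - lr * gradient
    print(f"iter {i}: w = {w:.4f}, gradient = {grad:.4f}, step = {step:.4f}")
```

Running this, the gradient (and therefore the step) roughly shrinks by a constant factor each iteration as w approaches 3, which is the "automatically smaller steps" behaviour the lecture refers to.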
Thank you guys, that’s really helpful.