There is barely any difference between the gradient-descent cost function and the mean squared error, and OLS also minimizes the sum of squared errors, just analytically (by setting the derivatives to zero). So, am I correct to guess that the parameters obtained by both methods will be the same?

Hi @Raj_Pandey1, I think they are not two different problems but the same one. We can approach it in different ways: in the lecture videos we used gradient descent, while packages like sklearn or scipy use other approaches, such as SVD-based solvers, to get to the answer.

Since they are the same problem, any well-set-up approach to solving it should reach the same answer (model parameters), or approximately the same. For example, we can’t expect gradient descent to always reach exactly the optimal set of parameters to unlimited precision, because it runs for a finite number of iterations with a finite learning rate, so it only approaches the optimum.
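For example (a quick sketch on made-up data, not code from the course), fitting the same line with a closed-form least-squares solve and with a hand-rolled gradient descent loop gives essentially the same parameters:

```python
import numpy as np

# Synthetic data: y = 3x + 2 plus a little noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, size=100)

# Closed-form OLS: solve the least-squares problem directly
A = np.column_stack([x, np.ones_like(x)])        # design matrix [x, 1]
w_ols, b_ols = np.linalg.lstsq(A, y, rcond=None)[0]

# Gradient descent on the mean squared error cost
w, b = 0.0, 0.0
alpha = 0.02                                      # learning rate
for _ in range(20000):
    err = w * x + b - y
    w -= alpha * np.mean(err * x)                 # dJ/dw
    b -= alpha * np.mean(err)                     # dJ/db

print(w_ols, b_ols)   # roughly 3.0 and 2.0
print(w, b)           # approximately the same values
```

Run long enough with a reasonable learning rate, the gradient descent estimates agree with the closed-form solution to several decimal places; the only difference is numerical, not in the problem being solved.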

Raymond

Thanks for the answer! To add to that, gradient descent seems easier to implement; however, it may not give an exact result.

I actually had another, slightly unrelated doubt. In Course 1 Week 2 Lab 3 (reshaping features), the `run_gradient_descent()` function is used. This function hasn’t been defined anywhere in the previous labs (I did recheck; hopefully I haven’t missed it). I assumed that it works similarly to the `gradient_descent()` function built for single-feature regression, but that function requires initial values for the weights and b, and those aren’t passed in here. Is there any way we can access the source of this function to see what’s going on in it?

Sorry for the trouble!!

Thanks in advance

Hello Raj,

Exactly!

Sure. If you check the first code cell, you will see that it is imported from `lab_utils_multi`. You can then find it by clicking, in the Jupyter notebook’s menu bar, “File” > “Open” > “lab_utils_multi.py”, and searching for `run_gradient_descent`.
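Alternatively (a small sketch, not from the lab itself), Python’s built-in `inspect` module can print a function’s source directly in a notebook cell. Here `run_gradient_descent` is a hypothetical local stand-in just so the snippet runs on its own; in the lab you would import the real one from `lab_utils_multi` instead:

```python
import inspect

# Hypothetical stand-in so this snippet is self-contained;
# in the lab notebook you would instead write:
#   from lab_utils_multi import run_gradient_descent
def run_gradient_descent(X_train, y_train, iters=1000, alpha=1e-6):
    """Placeholder with the kind of signature such a helper might have."""
    pass

# Print the full source code of the function, wherever it was defined
print(inspect.getsource(run_gradient_descent))
```

This works for any pure-Python function you have imported, which is handy when you want to check things like how the initial weights and b are set inside the helper.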

Cheers,

Raymond

Thanks for the quick response. Feels amazing to have issues resolved this fast. Have a great day

Thank you Raj, you have a great day too!