Questions about week 1 content

Hello, I’ve got a couple of questions from the week 1 lectures and would appreciate it if you could help me with them.

  1. I know from previous statistics courses that linear regression can be solved with other methods, such as maximum likelihood estimation. Here we learned the gradient descent algorithm to find the best fit. Now I am wondering how I should approach problems in general. How do I know which method is most suitable for my data?
  2. I don’t quite understand what is meant by “convergence” in the context of the gradient descent algorithm. How is “convergence” formulated mathematically? In the final lab we used 10,000 iterations to make sure the model parameters converge. I was thinking of using a while loop to repeat the algorithm until “w” and “b” converge, but I am not sure what the condition should be.
  3. In one of the lectures it was mentioned that one issue with the gradient descent algorithm is that we might end up at a local minimum instead of the global minimum, depending on our initialization (for cost functions other than the squared error). But I did not understand what the solution to this issue is.

Thank you very much for your time!

Gradient descent has a computational advantage when the data set is large or has many features, because each update only needs the gradients, which are cheap to compute and scale well. Closed-form statistical methods require computing lots of means and sums of squares, and with many features that amounts to solving a linear system, which gets expensive.
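
To make the "cheap gradients" point concrete, here is a minimal sketch (not the exact lab code) of the gradient computation for the squared-error cost with multiple features - each update is just a couple of vectorized passes over the data:

```python
import numpy as np

def compute_gradients(X, y, w, b):
    """Gradients of the squared-error cost for linear regression.

    X: (m, n) feature matrix, y: (m,) targets, w: (n,) weights, b: scalar bias.
    """
    m = X.shape[0]
    errors = X @ w + b - y        # predictions minus targets, shape (m,)
    dj_dw = (X.T @ errors) / m    # partial derivatives w.r.t. each weight
    dj_db = errors.sum() / m      # partial derivative w.r.t. the bias
    return dj_dw, dj_db
```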

Convergence means the algorithm has effectively reached the minimum: further iterations no longer change w and b (or reduce the cost) by more than a negligible amount. A common mathematical formulation is to stop when the decrease in cost between iterations falls below a small tolerance, or when the updates to w and b become negligibly small. The method used in Week 1 (a fixed number of iterations) is presented for simplicity - there are better methods discussed later in the course.
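
On the while-loop idea from question 2, here is a rough sketch of that stopping condition (the learning rate `alpha`, the tolerance `tol`, and the safety cap on iterations are illustrative values, not the lab's):

```python
import numpy as np

def gradient_descent(X, y, w, b, alpha=0.01, tol=1e-7, max_iters=100_000):
    """Run gradient descent until the cost stops improving by more than `tol`."""
    m = X.shape[0]

    def cost(w, b):
        return np.mean((X @ w + b - y) ** 2) / 2

    prev_cost = cost(w, b)
    for _ in range(max_iters):              # safety cap instead of a bare while loop
        errors = X @ w + b - y
        w = w - alpha * (X.T @ errors) / m  # gradient step for the weights
        b = b - alpha * errors.sum() / m    # gradient step for the bias
        new_cost = cost(w, b)
        if abs(prev_cost - new_cost) < tol: # convergence: cost barely changes any more
            break
        prev_cost = new_cost
    return w, b
```

The iteration cap is there because, with a learning rate that is too large, the cost may never settle, and a bare while loop would then run forever.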

If a cost function is convex, then there are no local minima, so whatever minimum gradient descent reaches is the global one. That’s the case for the squared-error cost function. If you have a non-convex cost function (such as for a neural network), then you can run gradient descent multiple times, using different random initial weight values, and keep the solution that gives the lowest cost.
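
As a rough illustration of the multiple-restarts idea (linear regression itself doesn’t need it, since its cost is convex; this just reuses the `gradient_descent` sketch above on toy data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))               # toy data purely for illustration
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0

best_w, best_b, best_cost = None, None, np.inf
for restart in range(10):                   # number of restarts is arbitrary here
    w0 = rng.normal(size=X.shape[1])        # different random initial weights each run
    b0 = rng.normal()
    w, b = gradient_descent(X, y, w0, b0)   # the routine sketched above
    c = np.mean((X @ w + b - y) ** 2) / 2
    if c < best_cost:                       # keep the run that reaches the lowest cost
        best_w, best_b, best_cost = w, b, c
```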
