In the video “Understanding Mini-batch Gradient Descent”, Prof. Andrew said that “stochastic gradient descent won’t ever converge, it’ll always just kind of oscillate and wander around the region of the minimum.” But in the Week 2 assignment we used it as an optimization method.
I’ve attached a screenshot from the video where Andrew shows that it won’t converge, along with a screenshot from the assignment.
In the lecture video, as you pointed out, Andrew says, “stochastic gradient descent won’t ever converge, it’ll always just kind of oscillate and wander around the region of the minimum”, and he continues right after that with, “But it won’t ever just head to the minimum and stay there”.
I guess what the assignment means by “reach convergence” is that SGD may not necessarily give you THE minimum value, but it will get you to a good minimum value.
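To make that concrete, here is a minimal NumPy sketch (not the assignment’s code; the toy data, learning rate, and variable names are all my own assumptions) of plain SGD on a 1-D least-squares problem. Because each update uses a single noisy example, the parameter keeps hovering around the minimizer instead of settling exactly on it:

```python
import numpy as np

# Toy 1-D linear regression: y = 3*x + noise, so the best w is about 3.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 3.0 * x + 0.1 * rng.normal(size=1000)

w = 0.0    # parameter we optimize
lr = 0.05  # fixed learning rate (assumed value)

for epoch in range(5):
    for i in rng.permutation(len(x)):
        grad = 2 * (w * x[i] - y[i]) * x[i]  # gradient on a single example
        w -= lr * grad
    # w wanders around 3 but never settles exactly on it
    print(f"epoch {epoch}: w = {w:.4f}")
```

If you print the per-step values instead of the per-epoch ones, you can see the “oscillate and wander” behaviour from the lecture even more clearly.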
Somewhere in C2, you’ll go through an assignment where you visualise the results produced by each of these techniques (it could be the lab in question, I don’t quite remember).
Based on the task at hand and the results you get from each technique, you can try different ones and select whichever you are satisfied with.
I think Andrew also talks about using SGD together with other mechanisms (such as learning-rate decay) to get even better results.
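For example, here is a rough sketch of the same toy problem with learning-rate decay added on top of SGD. The 1/(1 + decay_rate * epoch) schedule is the one described in the learning-rate-decay video, but the specific numbers and names here are assumptions, not the assignment’s code. As the learning rate shrinks, the oscillation shrinks with it, so the iterates end up much closer to the minimum:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 3.0 * x + 0.1 * rng.normal(size=1000)

w = 0.0
lr0 = 0.05   # initial learning rate (assumed value)
decay = 0.3  # decay rate (assumed value)

for epoch in range(10):
    lr = lr0 / (1 + decay * epoch)  # learning-rate decay schedule
    for i in rng.permutation(len(x)):
        grad = 2 * (w * x[i] - y[i]) * x[i]
        w -= lr * grad
    # the wandering around w = 3 gets tighter as lr decays
    print(f"epoch {epoch}: lr = {lr:.4f}, w = {w:.4f}")
```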