The answer options for one quiz question suggested that "mini-batch gradient descent" and "getting more training data" could help find parameter values that give a small cost function value.
First, I found this question very tricky. Another question indicated that mini-batch gradient descent is not necessarily faster than batch gradient descent, and I concluded that, computationally speaking, both approaches have the same time complexity: without memory limits they take the same time, and with memory limits, only batch gradient descent might be infeasible. The problem with mini-batch is that it can lead to a worse cost J after one mini-batch is processed, and even after a whole epoch is processed. So I don't think it is a strong option for getting a small value of the cost J.
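To make that point concrete, here is a minimal sketch on a toy linear-regression problem (all names and numbers are hypothetical, just for illustration) that runs mini-batch gradient descent and tracks the full-dataset cost after every mini-batch update:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y = 2x + noise
X = rng.normal(size=200)
y = 2.0 * X + 0.5 * rng.normal(size=200)

def full_cost(w):
    # Mean squared error over the whole training set
    return np.mean((w * X - y) ** 2)

w, lr, batch = 0.0, 0.05, 20
costs = [full_cost(w)]
for epoch in range(5):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch):
        b = idx[start:start + batch]
        # Gradient estimated from the mini-batch only
        grad = 2.0 * np.mean((w * X[b] - y[b]) * X[b])
        w -= lr * grad
        costs.append(full_cost(w))

print(costs[0], costs[-1])
```

Because each gradient is estimated from a random mini-batch, you will typically see some individual updates where the full cost briefly rises, even though the overall trend across epochs is downward.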
Second, the question stated that "gradient descent in a deep network is taking excessively long to find". How can getting more training data help to get a small value for the cost J? More data in a batch means more time to process that batch, doesn't it?
Yes, some of the quiz questions get into pretty subtle distinctions and are sometimes worded in a pretty complicated way.
Generally speaking, we are supposed to treat answers to quiz questions the same way that we treat the programming assignments, in that we don’t give out the correct answers on a public thread, but we can talk about the issues in general.
It is true that minibatch is not guaranteed to work better than batch in every possible case, but generally it does work better in the sense that you update the parameters after each minibatch. That sword has two edges, of course: you get more frequent updates, but you also get more statistical noise in the updates because you are randomly sampling the minibatches. But minibatch gradient descent is the default way people do things these days, meaning that it definitely is worth a try.
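As a back-of-the-envelope illustration of the update-frequency point (the dataset and batch sizes below are hypothetical):

```python
m = 50_000          # training examples (hypothetical)
batch_size = 64

# Parameter updates per epoch
batch_updates = 1                          # full-batch: one update per epoch
minibatch_updates = -(-m // batch_size)    # ceiling division: one per mini-batch

print(batch_updates, minibatch_updates)
```

So in one pass over the data, mini-batch gradient descent gets hundreds of parameter updates where full-batch gets one, at the price of each update being based on a noisy gradient estimate.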
I agree that adding more data is also not likely to speed up convergence.
I have not taken this quiz in a couple of years and will have to go back and try again to see if I can find the question you are asking about. But (again) just speaking generally, my first question would be whether we are doing one of the more sophisticated algorithms like RMSprop, Momentum or Adam. If convergence is slow, that's the first thing I would think of to try. If even that doesn't work, then maybe the problem is that the model we've chosen just isn't up to the task, meaning it just has too much bias. So the next thing to try would be a more powerful and expressive model (more layers, more neurons per layer and maybe different activation functions).
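For reference, here is a minimal numpy sketch of the Adam update rule on a toy objective (the hyperparameter defaults are the commonly cited ones; `adam_step` and the objective are illustrative, not from any course assignment):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias correction (t starts at 1)."""
    m = b1 * m + (1 - b1) * grad            # first-moment (momentum) estimate
    v = b2 * v + (1 - b2) * grad ** 2       # second-moment (RMS) estimate
    m_hat = m / (1 - b1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)               # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta**2, whose gradient is 2*theta
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)

print(theta)
```

The combination of momentum (the m term) and per-parameter step scaling (the v term) is exactly why these optimizers are the usual first remedy when plain gradient descent converges slowly.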
If you want to go into more specifics with a particular quiz question, we’ll need to switch to using DMs to have a private conversation about it. To start a DM thread, you click the other person’s name or avatar and then click “Message”.
The other general thing to say is that sometimes it takes some experimentation with the quiz questions to figure out the correct answer. There is no penalty for taking the quiz as many times as you want. The only hassle is that they limit us to 3 submissions per 8 hour period. At least that’s the way it was the last time I tried.
Of course the other wrinkle here is that you’ll notice that you don’t always get the exact same questions in the same order. They have a pool of questions that they select from to keep things interesting. 