Quiz week 1: difference between one epoch and one iteration

{moderator edit - quiz answers removed}

I could not understand the difference between option 1 and option 3. Why is option 1 incorrect?
Isn't training for one epoch the same amount of work as one iteration over all the samples?

The terminology is that an “epoch” is one complete pass through the full training set. So if you are doing normal (full-batch) gradient descent, one epoch is a single iteration, meaning one parameter update. If you are doing mini-batch, then you have another ‘inner’ loop over the minibatches to cover one full epoch. Thus there is more overhead in doing one epoch with minibatch. Of course that is only part of the story: the whole point of minibatch is that you update the parameters after every minibatch, so they get updated multiple times per epoch. So you may end up being able to get the same level of convergence with fewer epochs using minibatch. Meaning that (when it works) it may actually be more efficient overall.

That’s why option 1 is incorrect. I’m not supposed to just give you the answer. But after seeing the above reasoning, it should help analyze the other two options.
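
If it helps to see the counting, here is a minimal sketch of the loop structure described above. The function names, the sample count of 1000, and the batch size of 64 are all made up for illustration and are not from the course materials; the gradient step itself is left as a comment.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of parameter updates (iterations) in one full pass over the data."""
    return math.ceil(num_samples / batch_size)


def run_epochs(num_samples, batch_size, num_epochs):
    """Walk the epoch/minibatch loops and count parameter updates."""
    updates = 0
    for _ in range(num_epochs):  # outer loop: one epoch per pass
        for start in range(0, num_samples, batch_size):  # inner loop: minibatches
            # A real implementation would compute gradients on
            # samples[start:start + batch_size] and update the parameters here.
            updates += 1
    return updates


m = 1000  # hypothetical training set size

# Full-batch gradient descent: the batch is the whole training set,
# so one epoch is exactly one iteration.
print(run_epochs(m, batch_size=m, num_epochs=1))   # 1

# Mini-batch: one epoch now contains several iterations.
print(run_epochs(m, batch_size=64, num_epochs=1))  # 16
```

Both calls touch every sample once per epoch, but the mini-batch run pays the per-iteration overhead 16 times and also gets 16 parameter updates out of that single pass.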


Thanks, you mention good points.