I want to understand what happens when we train on m training examples: does forward propagation followed by backward propagation happen only once, or does this cycle keep repeating until we get the desired w and b for one particular training example, before we move on to the second training example and do the same?

Also, does one epoch mean training on all m training examples once?

Forward propagation followed by backward propagation happens `number of iterations` times, on all examples at once (not one by one).

Yes…

Can you explain in detail?

This was all explained in the lectures, but here’s my summary:

Here in Course 1, we do “full batch” gradient descent. That means we do a number of iterations of the following process:

1. Compute forward propagation on all training samples with the current weights. This is done in a vectorized way for efficiency.
2. Do backward propagation on all samples to compute the gradients, which are averaged over all the samples.
3. Apply the computed gradients to update the weights.
4. Go to 1) again and repeat for the full number of iterations.
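The loop above can be sketched for the logistic regression case from Course 1. This is an illustrative sketch, not the course’s exact assignment code; the function names and defaults (`train`, `learning_rate=0.5`) are my own choices. The course convention of `X` with shape `(n_features, m)` and `Y` with shape `(1, m)` is assumed.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train(X, Y, num_iterations=5000, learning_rate=0.5):
    """Full-batch gradient descent for logistic regression (sketch).

    X: shape (n_features, m); Y: shape (1, m).
    """
    n, m = X.shape
    w = np.zeros((n, 1))
    b = 0.0
    for _ in range(num_iterations):
        # 1) Forward propagation on ALL m samples at once (vectorized)
        A = sigmoid(w.T @ X + b)      # shape (1, m)
        # 2) Backward propagation: gradients averaged over all m samples
        dZ = A - Y                    # shape (1, m)
        dw = (X @ dZ.T) / m           # shape (n, 1)
        db = np.sum(dZ) / m
        # 3) Update the parameters with the averaged gradients
        w -= learning_rate * dw
        b -= learning_rate * db
        # 4) Loop back to step 1 for the next iteration
    return w, b
```

Note that each pass through the loop body touches every one of the m samples; there is no per-sample inner loop anywhere.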

Steps 1) to 3) are called one “epoch” of training. Later in Course 2 we will learn a more sophisticated technique called “minibatch gradient descent” where we break up the full m training samples into “minibatches” and iterate through those in each “epoch”.
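To preview the minibatch idea, here is one way the splitting step could look. This is my own sketch, not the Course 2 code; the helper name `make_minibatches` and the column-major `(n_features, m)` layout are assumptions.

```python
import numpy as np

def make_minibatches(X, Y, batch_size, seed=0):
    """Shuffle the m samples, then split them into minibatches.

    X: shape (n_features, m); Y: shape (1, m). The last minibatch may
    be smaller when m is not divisible by batch_size.
    """
    m = X.shape[1]
    rng = np.random.default_rng(seed)
    perm = rng.permutation(m)           # random reordering of the m columns
    X_shuf, Y_shuf = X[:, perm], Y[:, perm]
    return [(X_shuf[:, k:k + batch_size], Y_shuf[:, k:k + batch_size])
            for k in range(0, m, batch_size)]
```

One “epoch” then means one pass through all the minibatches, with a forward/backward/update step on each minibatch rather than on the full set.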

If this still doesn’t make sense to you, my suggestion would be to watch the lectures again with what I said above in mind. Prof Ng said everything I said above in the lectures, other than the “minibatch” issue. He’ll discuss that in Course 2.
