Hi,
I have two small questions about the model implementation in "# 7 - Learning Rate Decay and Scheduling" in the Week 2, Exercise 2 problem. These aren't graded exercises, but I was confused about why the code is implemented this way.
The model's average cost is calculated as cost_total / m. Wouldn't this only be correct for stochastic gradient descent? If cost_total is only updated once per minibatch, shouldn't the denominator for cost_avg be the number of minibatches, i.e. (m / mini_batch_size)?
cost_avg = cost_total / m
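To make my confusion concrete, here's a toy numeric sketch of the arithmetic I have in mind (my own example, not the notebook's code; the two conventions for compute_cost shown below are assumptions):

import numpy as np

m = 8                              # total training examples
mini_batch_size = 2                # so there are m / mini_batch_size = 4 minibatches
losses = np.arange(1.0, m + 1.0)   # per-example losses, purely for illustration
minibatches = losses.reshape(-1, mini_batch_size)

# If compute_cost returned the MEAN over a minibatch, cost_total would
# accumulate one average per minibatch, and the overall average would
# need the number of minibatches in the denominator:
cost_total_from_means = sum(batch.mean() for batch in minibatches)
print(cost_total_from_means / (m / mini_batch_size))   # 4.5, the true mean

# If compute_cost instead returned the SUM over a minibatch,
# then dividing by m recovers the per-example average:
cost_total_from_sums = sum(batch.sum() for batch in minibatches)
print(cost_total_from_sums / m)                        # 4.5 as well

So my question boils down to which of these two conventions compute_cost follows here.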
Second, why is the backward propagation function called with minibatch_X instead of a3, the output of the forward propagation function? I believe a3 is what was passed in earlier exercises.
for minibatch in minibatches:

    # Select a minibatch
    (minibatch_X, minibatch_Y) = minibatch

    # Forward propagation
    a3, caches = forward_propagation(minibatch_X, parameters)

    # Compute cost and add to the cost total
    cost_total += compute_cost(a3, minibatch_Y)

    # Backward propagation
    grads = backward_propagation(minibatch_X, minibatch_Y, caches)
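For context, here's roughly what I imagine backward_propagation doing internally (a hand-rolled sketch with an assumed cache layout, assumed ReLU hidden layers, and a sigmoid/softmax output; not the actual notebook code), which shows where I'd expect a3 versus minibatch_X to enter:

import numpy as np

def backward_propagation_sketch(X, Y, caches):
    # Assumed cache layout: pre-activations, activations, and weights
    # saved by forward prop. The real notebook may order these differently.
    (z1, a1, W1, z2, a2, W2, z3, a3, W3) = caches
    m = X.shape[1]

    # a3 is used here, but it comes out of `caches`, not as a direct argument:
    dz3 = a3 - Y                  # sigmoid/softmax + cross-entropy shortcut
    dW3 = (1. / m) * dz3 @ a2.T

    da2 = W3.T @ dz3
    dz2 = da2 * (a2 > 0)          # ReLU derivative (assumption)
    dW2 = (1. / m) * dz2 @ a1.T

    da1 = W2.T @ dz2
    dz1 = da1 * (a1 > 0)
    # X appears only here: the first layer's weight gradient needs the raw inputs.
    dW1 = (1. / m) * dz1 @ X.T
    return dW1, dW2, dW3

If that mental model is right, then X really is needed for dW1 while a3 travels inside caches, so I'd just like to confirm that's the reason for the signature.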
Thanks