Week 1 Gradient Checking - Ex4 gradient_check_n

I’ve been struggling with this for 5-7 hours now. I suspect the problem is something very fundamental in the way I’m writing this, because I wrote the code exactly as per the instructions given above the code snippet.

Inside the for-loop, I have this set of lines in two places, once for + and once for -. Replace “delta” with “plus” or “minus” accordingly (and `+=` with `-=` for the minus case):

    theta_delta = np.copy(parameters_values)
    theta_delta[i] += epsilon
    J_delta[i], _ = forward_propagation_n(X, Y, vector_to_dictionary(theta_delta))

Output:
difference is: 0.2850931567761623
numerator: 2.3225406048733928
denominator: 8.146602433873603

I print the values of gradapprox[i] and see that the difference from grad[i] is relatively big for i = 20-23 and i = 35-38, which correspond to the ‘b1’ and ‘W2’ parameters respectively. Printing the Js reveals that the difference between J_plus[i] and J_minus[i] is on the order of 1e-7 to 1e-8 throughout the loop, printed to about 8 decimal places like gradapprox and grad.

I get gradapprox by simple subtraction of J_plus[i] and J_minus[i] and then divide by 2 * epsilon, as the description says.
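
Putting it together, this is roughly what my loop looks like (a sketch assembled from the lines above, assuming the notebook’s helpers forward_propagation_n and vector_to_dictionary and its variables parameters_values, num_parameters, and epsilon are already in scope):

    import numpy as np

    J_plus = np.zeros((num_parameters, 1))
    J_minus = np.zeros((num_parameters, 1))
    gradapprox = np.zeros((num_parameters, 1))

    for i in range(num_parameters):
        # nudge parameter i up by epsilon and evaluate the cost
        theta_plus = np.copy(parameters_values)
        theta_plus[i] += epsilon
        J_plus[i], _ = forward_propagation_n(X, Y, vector_to_dictionary(theta_plus))

        # nudge parameter i down by epsilon and evaluate the cost
        theta_minus = np.copy(parameters_values)
        theta_minus[i] -= epsilon
        J_minus[i], _ = forward_propagation_n(X, Y, vector_to_dictionary(theta_minus))

        # two-sided (centered) difference approximation of the gradient
        gradapprox[i] = (J_plus[i] - J_minus[i]) / (2 * epsilon)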

And in the validation code below the code snippet there are these lines:

    cost, cache = forward_propagation_n(X, Y, parameters)
    gradients = backward_propagation_n(X, Y, cache)
    difference = gradient_check_n(parameters, gradients, X, Y, 1e-7, True)
    expected_values = [0.2850931567761623, 1.1890913024229996e-07]

Is the expected value for difference 0.2850931567761623 or 1.1890913024229996e-07?

I ask because I already got the answer of 0.2850931567761623.

It depends on whether you have fixed the intentional and obvious bugs that they put into the back propagation logic. If you don’t fix the bugs, you get the 0.28509xxxx answer, meaning that your gradient checking logic has correctly detected an error in back prop. If you fix the bugs, you get the 1.xxxx e-7 answer, indicating that back propagation is now correct.
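
For context, the reported difference is the standard relative error between grad and gradapprox, which matches the numerator and denominator printed above. A minimal sketch, assuming grad and gradapprox are NumPy arrays of the same shape:

    import numpy as np

    # Relative error between the back-prop gradient and the numerical approximation.
    numerator = np.linalg.norm(grad - gradapprox)                    # ||grad - gradapprox||_2
    denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)  # ||grad||_2 + ||gradapprox||_2
    difference = numerator / denominator

    # With the intentional bugs still in back prop this comes out around 0.285;
    # with them fixed it drops to roughly 1.19e-7, which is why expected_values
    # lists both numbers.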

Thanks for your help. Resolved
