I’m kinda stuck on this one … I looked at the tests but they didn’t provide any insight. Interestingly, the output for iteration 1 is correct but 2 tests are failing. Any ideas?
Here are my results for that test cell:
Cost after iteration 1: 0.6926114346158595
Cost after first iteration: 0.693049735659989
Cost after iteration 1: 0.6915746967050506
Cost after iteration 1: 0.6915746967050506
Cost after iteration 1: 0.6915746967050506
Cost after iteration 2: 0.6524135179683452
All tests passed.
Notice that the only one that agrees with your results is the second line. A difference in the 4th decimal place is not a rounding error: it’s a real mistake of some sort. So what is different about that case? It is the only case in which you are seeing the cost before you’ve actually done any updates to the parameters. In all the other cases, you’re seeing the result after the second iteration (iteration 0 is the first iteration in Python, right?). So there must be something wrong with how you are handling the backpropagation and parameter updates in your two_layer_model logic. Note that you can assume all the subroutines are correct here, so the problem is in how you are calling them. Are you sure you’re not hard-coding anything, e.g. the learning rate?
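To make that concrete, here’s a minimal self-contained sketch of the shape of such a loop. This is not the notebook’s actual code: the helper math is inlined and every name here is illustrative. The two things to notice are where the cost is computed relative to the update (so “iteration 0” only exercises forward prop on the initial parameters) and that learning_rate is passed through rather than hard-coded:

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def two_layer_model(X, Y, params, learning_rate=0.0075, num_iterations=3):
    W1, b1 = params["W1"], params["b1"]
    W2, b2 = params["W2"], params["b2"]
    m = X.shape[1]
    for i in range(num_iterations):
        # Forward pass: LINEAR -> RELU (hidden layer), then LINEAR -> SIGMOID (output layer).
        Z1 = W1 @ X + b1
        A1 = np.maximum(0, Z1)
        Z2 = W2 @ A1 + b2
        A2 = sigmoid(Z2)
        # Cross-entropy cost. At i == 0 this is computed BEFORE any update,
        # which is why the iteration-0 cost only tests forward propagation.
        cost = -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
        # Backward pass, in reverse layer order: sigmoid layer first, relu layer second.
        dZ2 = A2 - Y                          # sigmoid + cross-entropy shortcut
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.mean(axis=1, keepdims=True)
        dZ1 = (W2.T @ dZ2) * (Z1 > 0)         # relu gradient gate
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.mean(axis=1, keepdims=True)
        # Update with the learning_rate argument; hard-coding a rate here is a classic bug.
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2
        print(f"Cost after iteration {i}: {cost}")
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```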
Hi, thanks for the response. It turned out I had the relu and sigmoid in the wrong order in the backprop function, which doesn’t actually make sense to me. Yes, I know backprop does all of the steps in reverse order; that makes sense. But I would have figured that the last step in backprop would be the sigmoid, not the relu that produces the dA0 output.
You’re right that sigmoid is for the output layer, but because everything runs in reverse in backprop, you process the output layer first, right? Then you work your way backwards through the “hidden” layers, although of course there are only two layers total here.
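Here’s a tiny self-contained numpy illustration of that ordering (toy shapes and illustrative names only, not the assignment’s code): the sigmoid/output-layer gradient is the first backward step, and the relu/hidden-layer gradient is the last one, which is exactly what leaves you with dA0 at the end:

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

# Toy shapes purely to make the ordering concrete.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 5))                    # 4 features, 5 examples
Y = rng.integers(0, 2, size=(1, 5))
W1, b1 = rng.normal(size=(3, 4)), np.zeros((3, 1))
W2, b2 = rng.normal(size=(1, 3)), np.zeros((1, 1))

# Forward pass runs relu first, sigmoid last ...
Z1 = W1 @ X + b1
A1 = np.maximum(0, Z1)                         # relu, hidden layer
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)                               # sigmoid, output layer

# ... so the backward pass reverses that: sigmoid layer FIRST, relu layer LAST.
dZ2 = A2 - Y                                   # output (sigmoid) layer gradient, step 1
dA1 = W2.T @ dZ2                               # flows back into the hidden layer
dZ1 = dA1 * (Z1 > 0)                           # hidden (relu) layer gradient, step 2
dA0 = W1.T @ dZ1                               # the final dA0 output of backprop
print(dA0.shape)                               # (4, 5): gradient w.r.t. the inputs
```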