The assertion on A2.shape fails when forward_propagation is invoked in nn_model. I have assigned the recommended value to the n_y variable in the layer_sizes function; its tests pass and the grader gives 100% for this function’s implementation.
What other information can I provide to get input on this?
Please show us the exception trace from the failing assertion. It might also help to add a print statement in your logic to show the shape of A2. A minimal sketch, assuming forward_propagation returns (A2, cache) as in the notebook:
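```python
# Sketch only: exact placement depends on your code, and it assumes
# forward_propagation returns (A2, cache) as in the exercise notebook.
A2, cache = forward_propagation(X, parameters)
print("A2.shape =", A2.shape)  # expect (n_y, m), i.e. a first dimension of 1
```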
Just because your layer_sizes function is correct doesn’t mean you called it correctly later. I assume all your functions before nn_model passed their tests, so if you fail the test for nn_model, the incorrect logic must be in nn_model itself.
Notice that your A2.shape value is inconsistent: sometimes it is 2 x 2000 and sometimes it is 1 x 2000. I instrumented my code and I only see 1 x 2000.
So that is the thing to investigate: what causes it to be 2 x 2000 in some cases?
I improved my instrumentation so that it prints A2.shape only once and then double-checks that it never changes (a sketch of that check appears after the output below). With that logic added, here’s what I see when I run the test cell for nn_model:
```
A2.shape (1, 2000)
A2.shape (1, 5)
All tests passed
```
So notice that there are two different test cases with different numbers of samples, but the first dimension of A2 is always 1.
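In case it helps, here’s roughly what that instrumentation looks like. Treat it as a sketch: the flag and function name are my own inventions, the flag gets reset to None at the top of nn_model, and the checker is called right after forward_propagation inside the training loop:

```python
# Illustrative "print once, then verify" instrumentation.
_seen_A2_shape = None  # reset this to None at the start of nn_model

def track_A2_shape(A2):
    global _seen_A2_shape
    if _seen_A2_shape is None:
        _seen_A2_shape = A2.shape
        print("A2.shape", A2.shape)
    elif A2.shape != _seen_A2_shape:
        raise AssertionError(
            f"A2.shape changed from {_seen_A2_shape} to {A2.shape}")
```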
@paulinpaloalto I just updated the image in the comment above. Note that after the first iteration of training the model, the shape of the weights at layer two of the network changes from 1 to 2.
Here’s a theory: maybe your update_parameters logic is incorrect and is changing the shape of W2 (or perhaps b2), which would result in the shape of A2 changing.
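For comparison, a correct update step is a pure gradient-descent assignment, which cannot change any shape. Here’s a minimal sketch, assuming the parameters and grads dictionary keys used in the notebook:

```python
def update_parameters(parameters, grads, learning_rate=1.2):
    # Each rule is W := W - alpha * dW. Every gradient has the same shape
    # as its parameter, so nothing here can alter a shape; if a shape does
    # change during training, look at how the gradients or parameters are
    # being computed or stored.
    W1 = parameters["W1"] - learning_rate * grads["dW1"]
    b1 = parameters["b1"] - learning_rate * grads["db1"]
    W2 = parameters["W2"] - learning_rate * grads["dW2"]
    b2 = parameters["b2"] - learning_rate * grads["db2"]
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```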
Related to this, there does not seem to be a canonical implementation of the log loss cost function. Which variant is expected here?
It doesn’t matter as long as you correctly express what the math formula for the cost specifies. You can use dot products or you can do elementwise multiply followed by addition, if that’s what you are asking.
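For instance, both of the following compute the same cross-entropy cost J = -(1/m) Σ [y log(a) + (1 - y) log(1 - a)]. The shapes (A2 and Y as 1 x m arrays) follow the exercise, but treat this as a sketch rather than the official solution:

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost for (1, m) arrays A2 (predictions) and Y (labels)."""
    m = Y.shape[1]

    # Variant 1: elementwise multiply, then sum over all entries.
    logprobs = np.multiply(Y, np.log(A2)) + np.multiply(1 - Y, np.log(1 - A2))
    cost = -np.sum(logprobs) / m

    # Variant 2: dot products; (1, m) @ (m, 1) yields a 1 x 1 array.
    cost_dot = -(np.dot(Y, np.log(A2).T) + np.dot(1 - Y, np.log(1 - A2).T)) / m
    assert np.isclose(cost, float(np.squeeze(cost_dot)))

    return cost
```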
To close the loop on the public thread: it turned out to be a bug in update_parameters where the b2 value was incorrectly initialized, which is why the dimensions were wrong. All fixed now!
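For anyone who lands here with the same symptom: b2 must keep the shape (n_y, 1) for the entire life of the model. A sketch of the usual initialization (the 0.01 scaling follows the convention in this exercise):

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    # b2 is (n_y, 1); with n_y = 1 this keeps the first dimension of A2
    # equal to 1 on every iteration, which is what the failing test checks.
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```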