Hi,
I am trying to implement my code for forward propagation, but I am getting this error. The code does generate values for A2, but they are not the expected output.
A2 = [[0.49993903 0.50002176 0.50002679]]
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-21-e41f1de7d764> in <module>
3 print("A2 = " + str(A2))
4
----> 5 forward_propagation_test(forward_propagation)
~/work/release/W3A1/public_tests.py in forward_propagation_test(target)
107 assert output[1]["Z2"].shape == expected_Z2.shape, f"Wrong shape for cache['Z2']."
108
--> 109 assert np.allclose(output[0], expected_A2), "Wrong values for A2"
110 assert np.allclose(output[1]["Z1"], expected_Z1), "Wrong values for cache['Z1']"
111 assert np.allclose(output[1]["A1"], expected_A1), "Wrong values for cache['A1']"
AssertionError: Wrong values for A2
Could you kindly suggest what's going wrong with my logic?
Your output is of the proper shape, so I would look at your expressions for Z1, A1, Z2, and A2. The test "threw" the exception while checking for the expected (correct) A2 values. Since that is the final layer, a bad expression for any of those that I listed above could be the cause. Check the Exercise 4 instructions and carefully translate the mathematics of the various outputs into the proper Python formulations.
I think I have properly translated the mathematical calculations into Python code. Since I can't post my code here, I'll describe what I did.
For Z1 I calculated the dot product of W1 and X and added b1 to it.
For A1 I used np.tanh(Z1).
For Z2 I calculated the dot product of W2 and A1 and added b2 to it.
For A2 I just calculated the sigmoid of Z2.
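For anyone reading along later, here is roughly what that description corresponds to in NumPy. This is only a sketch under stated assumptions (a parameters dictionary keyed by "W1", "b1", "W2", "b2", and a locally defined sigmoid helper), not the assignment's graded code:

```python
import numpy as np

def sigmoid(z):
    # Elementwise logistic function; the notebook imports its own version.
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation_sketch(X, parameters):
    # Unpack the parameters dictionary (key names assumed here).
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]

    # Layer 1: linear step, then tanh activation.
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)

    # Layer 2: linear step, then sigmoid activation.
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache
```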
Wow. So even your Z1 value is already different, which throws everything else off. Yet the formula is quite straightforward, and what you described sounds correct. Here is the mathematical version of the formula:
Z1 = W1 \cdot X + b1
So how could that go wrong in Python? Are you sure you used np.dot for the multiplication? If you had tried np.multiply or the * operator instead, it would have thrown a dimension mismatch, since W1 is 4 x 2 and X is 2 x 3.
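To see the difference concretely, you can try both operations on arrays with the same shapes (random placeholder values here, not the actual test-case data):

```python
import numpy as np

W1 = np.random.randn(4, 2)   # same shape as the test case's W1
X = np.random.randn(2, 3)    # same shape as the test case's X

print(np.dot(W1, X).shape)   # (4, 3): the matrix product works

try:
    W1 * X                   # elementwise multiply: shapes (4, 2) and (2, 3) don't broadcast
except ValueError as err:
    print("Elementwise * fails:", err)
```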
I tried introducing a couple of mistakes, like not adding b1 at all or multiplying by b1 instead, and neither gives the same bad values that you show. Maybe somehow the test case code is different. Try printing W1, X, and b1. Here’s what I see:
I’m not really sure how, but I restarted the kernel, deleted all the code, and wrote it again, and this time it worked. Yes, I used np.dot(A, B) to calculate the dot product, and all the earlier test cases ran smoothly without any error. Luckily the error is resolved now.
Thanks for your support!
Great! The issue probably was that you had fixed the code, but forgot to do “Shift-Enter” to compile the new code. Just typing new code and then calling the function again runs the old code.
I am running into the same error. Moreover, I have the same values for W1, X, Z1, and b1 as paulinpaloalto. Even though I share the same values, I fail to understand why b1 is not a zero vector if we previously initialized it to zeros.
This is just the unit test case for the forward_propagation function, which is one step of the training iterations (or of what you would do in “inference” mode), so it just takes whatever parameter values are passed in as arguments. When we actually implement the full nn_model, we start from the initial W and b values, which are zero for the biases, but that is only on the first iteration, right? Then the update-parameters logic will push the b values in the direction of a better solution.
Think about writing a test case for the forward_propagation function: if you used zero values for the bias values, then the test case would not be able to detect the bug of leaving out the bias values, right? So that wouldn’t be a very good test case.
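Here is a tiny illustration of that point, using arbitrary made-up arrays (not the actual test-case values):

```python
import numpy as np

W = np.random.randn(4, 2)
X = np.random.randn(2, 3)
b_zero = np.zeros((4, 1))
b_nonzero = np.random.randn(4, 1)

# With a zero bias, a buggy implementation that forgets "+ b" is indistinguishable:
print(np.allclose(np.dot(W, X) + b_zero, np.dot(W, X)))      # True
# With a non-zero bias, the missing term shows up immediately:
print(np.allclose(np.dot(W, X) + b_nonzero, np.dot(W, X)))   # False (almost surely)
```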
There are two tests for the forward_propagation function. The first one is returned by forward_propagation_test_case. You can find that function by clicking “File → Open” and examining the appropriate file. Check the “import” cell early in the notebook to see the file names.
Here’s what I see in that test case:
X is 2 x 3
W1 is 4 x 2, b1 is 4 x 1
W2 is 1 x 4, b2 is 1 x 1
So with those dimensions, the layer one linear activation is:
Z1 = W1 \cdot X + b1
So that will be 4 x 2 dot 2 x 3 which gives 4 x 3 as the shape of Z1. Then A1 is the same shape, right?
Then for the second layer we have:
Z2 = W2 \cdot A1 + b2
There the dot product should be 1 x 4 dot 4 x 3 which gives 1 x 3 as the shape for Z2 and A2, right?
So it looks like your code goes off the rails by using W1 instead of W2 in the second layer formula.
If that’s not it, then you need to examine things more carefully based on the “dimensional analysis” I gave above.
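If it helps, here is that dimensional analysis written out as a quick shape check, using random placeholder arrays with the shapes from the test case:

```python
import numpy as np

# Placeholder arrays with the test case's shapes (values are arbitrary).
X = np.random.randn(2, 3)
W1, b1 = np.random.randn(4, 2), np.random.randn(4, 1)
W2, b2 = np.random.randn(1, 4), np.random.randn(1, 1)

Z1 = np.dot(W1, X) + b1        # (4, 2) dot (2, 3) -> (4, 3)
A1 = np.tanh(Z1)               # same shape as Z1
Z2 = np.dot(W2, A1) + b2       # (1, 4) dot (4, 3) -> (1, 3)
print(Z1.shape, A1.shape, Z2.shape)   # (4, 3) (4, 3) (1, 3)

# Accidentally using W1 in the second layer is exactly the kind of
# mistake this shape bookkeeping catches:
try:
    np.dot(W1, A1)             # (4, 2) dot (4, 3): inner dimensions don't match
except ValueError as err:
    print("Shape mismatch:", err)
```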
Update: I checked the other test case, which is in the function forward_propagation_test and it actually has the same dimensions as shown above. My personal opinion is that it’s a bad idea to have two tests with the same shapes, because it limits the ability to catch hard-coding errors, but they didn’t ask my opinion.
Oh man, that was it… I was retrieving W2 = parameters["W1"] instead of parameters["W2"]… can’t believe how pesky a little oversight like that can be. Yes, W2 is definitely 1 x 4, as the printout of the parameters clearly shows.
All programmers feel your pain. A single character wrong can ruin everything. But the key is knowing how to debug a problem like that: that’s part of the job, right? Anytime you get a shape mismatch, the first step is to figure out what the shapes should be. That technique I showed there is called “Dimensional Analysis”. Here’s another example of how to do that for the equivalent test in Week 4.
Awesome, I’ll start examining the tests more thoroughly.
On a similar topic, how can we work out what the expected dimensions of the product of two matrices will be?
For example, on this slide, Andrew very quickly calculated that the W[1] matrix would need to be (3, 2) in Z = W[1] \cdot x, where Z has shape (3, 1) and x has shape (2, 1).
If what you’re saying is that you don’t understand how matrix multiplication works, then we’re in trouble here. This course assumes you have solid knowledge of basic linear algebra. You don’t need to know what an eigenvalue is, but you need to be very comfortable with all the basic algebraic operations on vectors and matrices.
If you have a 3 x 2 matrix and you multiply it by a 2 x 1 matrix, the result will be 3 x 1, right?
If you don’t understand how that works, you need to take a Linear Algebra course first. The one on Khan Academy is a good place to start.
In the video, as I just described, it is a 3 x 2 matrix dot 2 x 1 matrix. That gives 3 x 1, right? By the same rules that 3 x 2 dot 2 x 4 gives 3 x 4, right? The “inner dimensions” are cancelled and the outer dimensions remain.
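The same rule is easy to confirm in NumPy; only the shapes matter here, so the values are random:

```python
import numpy as np

W = np.random.randn(3, 2)    # like W[1] on the slide
x = np.random.randn(2, 1)    # like the input x
print(np.dot(W, x).shape)    # (3, 1): the inner 2's cancel, the outer 3 and 1 remain

A = np.random.randn(3, 2)
B = np.random.randn(2, 4)
print(np.dot(A, B).shape)    # (3, 4)
```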