W3_A1_Forward propagation_AssertionError: Wrong values for A2

Hi,
I am trying to implement my code for forward propagation, but I am getting this error. The code generates values for A2, yet it still doesn’t produce the required output.

A2 = [[0.49993903 0.50002176 0.50002679]]
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-21-e41f1de7d764> in <module>
      3 print("A2 = " + str(A2))
      4 
----> 5 forward_propagation_test(forward_propagation)

~/work/release/W3A1/public_tests.py in forward_propagation_test(target)
    107     assert output[1]["Z2"].shape == expected_Z2.shape, f"Wrong shape for cache['Z2']."
    108 
--> 109     assert np.allclose(output[0], expected_A2), "Wrong values for A2"
    110     assert np.allclose(output[1]["Z1"], expected_Z1), "Wrong values for cache['Z1']"
    111     assert np.allclose(output[1]["A1"], expected_A1), "Wrong values for cache['A1']"

AssertionError: Wrong values for A2

Could you please suggest what’s going wrong with my logic?

Your output is of the proper shape, so I would look at your expressions for Z1, A1, Z2, and A2. The test “threw” the exception when checking for the expected (correct) A2 values. Since that is the final layer, a bad expression for any of the quantities listed above could cause it. Check the Exercise 4 instructions, and carefully translate the mathematics of the various outputs into the proper Python formulations.

I think I have properly translated the mathematical calculations into Python code. Since I can’t post my code here, I would like to describe what I did.
For Z1 I calculated the dot product of W1 and X and added b1 to it.
For A1 I used np.tanh(Z1).
For Z2 I calculated the dot product of W2 and A1 and added b2 to it.
For A2 I just calculated the sigmoid of Z2.

That sounds correct in terms of code, although W1 and X are matrices, not vectors, right?
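In generic numpy terms, that recipe corresponds to something like the sketch below. This is not the graded notebook code, just a minimal sketch that assumes the usual parameter-dictionary layout (“W1”, “b1”, “W2”, “b2”) and defines its own sigmoid helper:

import numpy as np

def sigmoid(z):
    # logistic sigmoid; the notebook provides its own helper, this one is only for the sketch
    return 1 / (1 + np.exp(-z))

def forward_propagation_sketch(X, parameters):
    # parameter names are assumed to match the notebook's dictionary
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]

    Z1 = np.dot(W1, X) + b1    # (4, 2) dot (2, 3) + (4, 1) -> (4, 3)
    A1 = np.tanh(Z1)           # hidden-layer activation, same shape as Z1
    Z2 = np.dot(W2, A1) + b2   # (1, 4) dot (4, 3) + (1, 1) -> (1, 3)
    A2 = sigmoid(Z2)           # output-layer activation

    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache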

I added a cell after the test cell (which is not modifiable) and printed all the values from the returned cache:

Z1 = cache["Z1"]
A1 = cache["A1"]
Z2 = cache["Z2"]
A2 = cache["A2"]
print(f"Z1 = {Z1}")
print(f"A1 = {A1}")
print(f"Z2 = {Z2}")
print(f"A2 = {A2}")

When I run that, here’s what I get:

Z1 = [[ 1.7386459   1.74687437  1.74830797]
 [-0.81350569 -0.73394355 -0.78767559]
 [ 0.29893918  0.32272601  0.34788465]
 [-0.2278403  -0.2632236  -0.22336567]]
A1 = [[ 0.9400694   0.94101876  0.94118266]
 [-0.67151964 -0.62547205 -0.65709025]
 [ 0.29034152  0.31196971  0.33449821]
 [-0.22397799 -0.25730819 -0.2197236 ]]
Z2 = [[-1.30737426 -1.30844761 -1.30717618]]
A2 = [[0.21292656 0.21274673 0.21295976]]

Are any of your Z1, A1 and Z2 values different than that?

I checked those values that were calculated and I found this:

Z1 = [[ 0.04585435 -0.02677315  0.03969249]
 [-0.00111639 -0.00135123  0.01054208]
 [ 0.03021782 -0.01645445  0.01932617]
 [ 0.00661832 -0.00654221  0.02111497]]
A1 = [[ 0.04582223 -0.02676676  0.03967166]
 [-0.00111639 -0.00135123  0.01054169]
 [ 0.03020863 -0.01645297  0.01932377]
 [ 0.00661822 -0.00654212  0.02111183]]
Z2 = [[-2.43884785e-04  8.70208881e-05  1.07176486e-04]]
A2 = [[0.49993903 0.50002176 0.50002679]]

They are different from the cache values you showed. My previous test cases passed without any error.

Wow. Even your Z1 value is already different, so that throws everything else off. The formula is quite straightforward, and what you described sounds correct. Here is the mathematical version of the formula:

Z1 = W1 \cdot X + b1

So how could that go wrong in Python? Are you sure you used np.dot for the multiply? If you had tried to use np.multiply or *, it would have thrown a dimension mismatch error, since W1 is 4 x 2 and X is 2 x 3.
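A quick way to see the difference is with throwaway arrays of the same shapes as the test case (random values, purely for illustration):

import numpy as np

W1 = np.random.randn(4, 2)   # same shapes as in the test case
X = np.random.randn(2, 3)
b1 = np.random.randn(4, 1)

Z1 = np.dot(W1, X) + b1      # matrix product: (4, 2) dot (2, 3) -> (4, 3)
print(Z1.shape)              # (4, 3)

# Elementwise multiplication fails, because (4, 2) and (2, 3) cannot be broadcast:
# W1 * X  raises "ValueError: operands could not be broadcast together"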

I tried introducing a couple of mistakes, like not adding b1 at all or multiplying by b1 instead of adding it, and neither gives me the same bad values that you show. Maybe somehow the test case code is different. Try printing W1, X, and b1. Here’s what I see:

W1 = [[-0.00416758 -0.00056267]
 [-0.02136196  0.01640271]
 [-0.01793436 -0.00841747]
 [ 0.00502881 -0.01245288]]
X = [[ 1.62434536 -0.61175641 -0.52817175]
 [-1.07296862  0.86540763 -2.3015387 ]]
b1 = [[ 1.74481176]
 [-0.7612069 ]
 [ 0.3190391 ]
 [-0.24937038]]
Z1 = [[ 1.7386459   1.74687437  1.74830797]
 [-0.81350569 -0.73394355 -0.78767559]
 [ 0.29893918  0.32272601  0.34788465]
 [-0.2278403  -0.2632236  -0.22336567]]

I’m not really sure why it works now, but I restarted the kernel, deleted all the code, and wrote it again. This time it worked. Yes, I used np.dot(A, B) to calculate the dot product, and all the earlier test cases ran smoothly without any error. Luckily the error is resolved now.
Thanks for your support!

Great! The issue probably was that you had fixed the code, but forgot to do “Shift-Enter” to run the cell with the new code. Just typing new code and then calling the function again runs the old version.


Ran into the same problem, and I am getting the same error output too. I found out that b is initialized to a zero vector of the same dimensions.

Update:
I solved the error by getting a fresh copy of the notebook and writing the same code again.

I am running into the same error. Moreover, I have the same values for W1, X, Z1, and b1 as paulinpaloalto. Even though I share the same values, I fail to understand why b1 is not a zero vector if we previously initialized it to zeros.

This is just the unit test case for the forward_propagation function, which is one step in the training iterations (or what you would do in “inference” mode), so it just takes whatever parameter values are passed in as arguments. When we really implement the full nn_model, we start from the initial W and b values, which would be zero for the bias values, but that is only on the first iteration, right? Then the update-parameters logic will push the b values in the direction of a better solution.

Think about writing a test case for the forward_propagation function: if you used zero values for the bias values, then the test case would not be able to detect the bug of leaving out the bias values, right? So that wouldn’t be a very good test case. :nerd_face:
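Here is a tiny illustration of that point, with made-up arrays just to show the idea: if the bias is zero, a buggy line that forgets b1 computes exactly the same Z1 as the correct line, so only a nonzero bias can expose the bug.

import numpy as np

np.random.seed(1)
W1 = np.random.randn(4, 2)
X = np.random.randn(2, 3)

buggy_Z1 = np.dot(W1, X)    # forgot to add the bias

b1_zero = np.zeros((4, 1))
print(np.allclose(np.dot(W1, X) + b1_zero, buggy_Z1))     # True: the bug slips through

b1_nonzero = np.random.randn(4, 1)
print(np.allclose(np.dot(W1, X) + b1_nonzero, buggy_Z1))  # False: the bug is caught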

I’m having a similar problem with this exercise too… when I calculate Z2 as the dot product of W2 and A1, it doesn’t work because they have incompatible shapes:

W2 shape is (4, 2)
A1 shape is (4, 3)

which throws this error…

I tried using W2.T which gets rid of the error, but the tests still fail. Any suggestions?

There are two tests for the forward_propagation function. The first one is returned by forward_propagation_test_case. You can find that function by clicking “File → Open” and examining the appropriate file. Check the “import” cell early in the notebook to see the file names.

Here’s what I see in that test case:

X is 2 x 3
W1 is 4 x 2, b1 is 4 x 1
W2 is 1 x 4, b2 is 1 x 1

So with those dimensions, the layer one linear activation is:

Z1 = W1 \cdot X + b1

So that will be 4 x 2 dot 2 x 3 which gives 4 x 3 as the shape of Z1. Then A1 is the same shape, right?

Then for the second layer we have:

Z2 = W2 \cdot A1 + b2

There the dot product should be 1 x 4 dot 4 x 3 which gives 1 x 3 as the shape for Z2 and A2, right?

So it looks like your code goes off the rails by using W1 instead of W2 in the second layer formula.

If that’s not it, then you need to examine things more carefully based on the “dimensional analysis” I gave above.
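If it helps, here is the same dimensional analysis done directly in numpy with dummy arrays of the test-case shapes (the values don’t matter, only the shapes):

import numpy as np

X = np.zeros((2, 3))
W1, b1 = np.zeros((4, 2)), np.zeros((4, 1))
W2, b2 = np.zeros((1, 4)), np.zeros((1, 1))

Z1 = np.dot(W1, X) + b1    # (4, 2) dot (2, 3) -> (4, 3)
A1 = np.tanh(Z1)           # (4, 3)
Z2 = np.dot(W2, A1) + b2   # (1, 4) dot (4, 3) -> (1, 3)
print(Z1.shape, A1.shape, Z2.shape)   # (4, 3) (4, 3) (1, 3)

# Using W1 by mistake in layer two fails, since (4, 2) dot (4, 3) is undefined:
# np.dot(W1, A1)  raises "ValueError: shapes (4,2) and (4,3) not aligned"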

Update: I checked the other test case, which is in the function forward_propagation_test and it actually has the same dimensions as shown above. My personal opinion is that it’s a bad idea to have two tests with the same shapes, because it limits the ability to catch hard-coding errors, but they didn’t ask my opinion.

Oh man, that was it… I was retrieving W2 = parameters[“W1”] instead of “W2”… I can’t believe how pesky a little oversight like that can be. Yes, W2 is definitely 1 x 4, as printing the parameters clearly shows.

All programmers feel your pain. A single character wrong can ruin everything. But the key is knowing how to debug a problem like that: that’s part of the job, right? Anytime you get a shape mismatch, the first step is to figure out what the shapes should be. That technique I showed there is called “Dimensional Analysis”. Here’s another example of how to do that for the equivalent test in Week 4.

Awesome, I’ll start examining the tests more thoroughly.

On a similar topic, how can we calculate what the expected dimensions of two multiplied matrices will be?

For example, on this slide, Andrew very quickly calculated that the W[1] matrix would need to be (3, 2) in Z = W[1] * x, where Z’s shape is (3, 1) and x’s shape is (2, 1).

This seems to violate linear algebra rules…aren’t (3,1) and (2,1) actually incompatible?

If what you’re saying is that you don’t understand how matrix multiplication works, then we’re in trouble here. This course assumes you have solid knowledge of basic linear algebra. You don’t need to know what an eigenvalue is, but you need to be very comfortable with all the basic algebraic operations on vectors and matrices.

If you have a 3 x 2 matrix and you multiply it by a 2 x 1 matrix, the result will be 3 x 1, right?

If you don’t understand how that works, you need to take a Linear Algebra course first. The one on Khan Academy is a good place to start.

I understand that the expected dimensions should be the outer values of the two matrices; for example, (3,2) X (2,4) should result in a (3,4) matrix.

But that’s not the case in the video…is there a special rule for 1s (besides matching for copying compatibility)?

In the video, as I just described, it is a 3 x 2 matrix dot 2 x 1 matrix. That gives 3 x 1, right? By the same rules that 3 x 2 dot 2 x 4 gives 3 x 4, right? The “inner dimensions” are cancelled and the outer dimensions remain.
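Concretely, with throwaway matrices of those shapes (random values, just to show the shape rule):

import numpy as np

W = np.random.randn(3, 2)    # 3 x 2
x = np.random.randn(2, 1)    # 2 x 1
print(np.dot(W, x).shape)    # (3, 1): inner dimensions cancel, outer dimensions remain

A = np.random.randn(3, 2)
B = np.random.randn(2, 4)
print(np.dot(A, B).shape)    # (3, 4)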

In the video it’s a (3,1) matrix dot a (2,1)… it makes sense your way, but in the video the missing value is the W shape, (3,2), not the (3,1).