Hello there. I hope you are doing great. I am having an issue with the Week 3 programming assignment, Exercise 4: Forward Propagation. Prior to this function, I initialized the parameters (W1, b1, W2, and b2) correctly and those tests passed. I am now using these values to compute Z1 = W1X + b1 exactly as given in the formula, and yet I am getting an assertion error. Please, I need your help so that I can continue with the other questions, because I am really stuck here.
Here is the error message I am receiving. I tried to debug it, but it is not working, and I don't really know where the error is.
AssertionError                            Traceback (most recent call last)
in
      1 t_X, parameters = forward_propagation_test_case()
----> 2 A2, cache = forward_propagation(t_X, parameters)
      3 print("A2 = " + str(A2))
      4
      5 forward_propagation_test(forward_propagation)

in forward_propagation(X, parameters)
     41     # YOUR CODE ENDS HERE
     42
---> 43     assert(A2.shape == (1, X.shape[1]))
     44
     45     cache = {"Z1": Z1,

AssertionError:
When you have a shape mismatch, the first step is to find out what the wrong shape is. So what shape is your A2 value? Try this:
print(f"A2.shape = {A2.shape}")
right before the assertion that fails. Note that you can also look at the shapes of the parameters that are generated by the test case and then do the “dimensional analysis” to figure out what the shapes should be at each layer. Why does your output end up the wrong shape? Note that the “linear activation” portion of forward propagation at each layer involves a “dot product”, not an “elementwise multiply”, right?
Oooooh yeah… it is working now. Thank you very much! So that means every time we perform the linear activation we use a dot product at each layer (in this case, for calculating both Z1 and Z2)?
Yes, at each layer the linear step is a dot product followed by adding the bias. It is important to recognize the notational conventions that Prof Ng uses: when he means "elementwise" multiply, he always and only uses "*" as the operator. If he means a dot product, he just writes the operands adjacent to each other without any explicit operator.
So when he writes it this way:
Z^{[l]} = W^{[l]}A^{[l-1]} + b^{[l]}
The first operation there is a dot product. I think it would be clearer to write it this way:
Z^{[l]} = W^{[l]} \cdot A^{[l-1]} + b^{[l]}
But Prof Ng is the boss and he didn’t ask my opinion, so we just have to understand his notation.
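As a quick, stand-alone illustration of the two operators in numpy (a toy example, not taken from the assignment):

import numpy as np

A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[10., 20.],
              [30., 40.]])

print(A * B)         # elementwise multiply ("*" in Prof Ng's notation)
# [[ 10.  40.]
#  [ 90. 160.]]

print(np.dot(A, B))  # matrix (dot) product (operands written adjacent, e.g. W A)
# [[ 70. 100.]
#  [150. 220.]]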
Here’s another thread from a while back that covers this in a bit more detail.
Yeah, sure. I completely understand it now. Thank you!