I have been working on the basis that the main objective here is to wrangle the dimensions of the various arrays so that A2 is computed from the required inputs and also matches the expected values quoted in the assignment.
I therefore printed out values for the shapes of W1, b1, W2, b2 and X to keep the problem in focus.
I then printed out the shapes of Z1, A1 and Z2 and, because A2 does not have a shape, simply printed A2 itself.
Even though my final values are not correct (yet), they do appear to satisfy the conditions of the assert. Please see below.
What do you mean “A2 does not have a shape”? That would be a problem, right? That must mean that somehow A2 is not the correct data type (a numpy array). That would be the first thing to investigate. What type is it:
print(f"type(A2) = {type(A2)}")
Next question: how did it get that way?
Also note that what you printed has only one set of square brackets. The “expected value” shown has two sets. That may seem trivial, but it’s not: it means that either your variable is some other data type (e.g. a python list or tuple) or that it is a 1D numpy array, whereas the expected value is a 2D array.
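To make the bracket point concrete, here is a minimal sketch with made-up values (the names as_list, rank1 and two_d are purely illustrative and do not come from the assignment). It compares a python list, a 1D numpy array and a 2D numpy array:

```python
import numpy as np

as_list = [0.1, 0.2, 0.3]              # plain python list: no .shape attribute
rank1 = np.array([0.1, 0.2, 0.3])      # 1D numpy array
two_d = np.array([[0.1, 0.2, 0.3]])    # 2D numpy array with a single row

print(type(as_list), hasattr(as_list, "shape"))
print(type(rank1), rank1.shape)
print(type(two_d), two_d.shape)
print(rank1)   # prints with one set of square brackets
print(two_d)   # prints with two sets of square brackets
```

Printing the type and the shape together usually narrows down which of these cases you are in.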
Thank you for your great questions, which pointed me back to Professor Ng’s week 2 lesson, ‘A Note on Python/Numpy Vectors’. Armed with the .reshape function I dealt with the rank 1 array, and my printouts of Z1, A1, Z2 and A2 now at least seem to all have the right dimensions.
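For anyone following along, a small sketch of the rank 1 behaviour that lesson describes, using an arbitrary 4-element array (the names a, col and row are illustrative only):

```python
import numpy as np

a = np.arange(4.0)          # rank 1 array, shape (4,)
col = a.reshape(4, 1)       # column vector, shape (4, 1)
row = a.reshape(1, 4)       # row vector, shape (1, 4)

print(a.shape, a.T.shape)       # transposing a rank 1 array changes nothing
print(col.shape, col.T.shape)   # a true 2D column transposes into a row
print(row.shape)
```

Note that reshape only works when the total number of elements stays the same: 4 elements can become (4, 1) or (1, 4), but nothing else with a different element count.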
My values for A2 are incorrect, and I am getting an error telling me that Z1 is the wrong shape. It is a (4, 2) array, which appears to mean that, while the values for the dimensions may be correct, something else is not correct (the data?).
I wasn’t struck by a sudden epiphany, which would have been ideal, so I just started trying some variations in the Z1 line of code to see if I could keep the Z1 values the same but change the A2 values. In doing this I noticed something that I don’t understand: if I use “W1[0].reshape(4, 1)” in my Z1 line of code I get:
ValueError: cannot reshape array of size 2 into shape (4,1)
Z1 is a (4, 2) array. So I tried “W1[1].reshape(4, 1)”. Once again I get:
ValueError: cannot reshape array of size 2 into shape (4,1)
Why is Z1 giving a value of 2 in both dimensions? It is not a (2, 2) shape.
Is the answer to this question necessary to correcting my code, or is it a detour?
I am using ‘*’, rather than np.dot. I have tried using transpose (.T) to wrangle the data and I have had some success that way, but also some surprising results such as the one above.
Does this relate back to issues using rank 1 arrays?
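On the reshape error specifically, here is a hedged sketch of what is probably going on, assuming W1 has shape (4, 2) as in the test case quoted in this thread (the values below are made up). The "size 2" in the message refers to W1[0], not to Z1: indexing with [0] selects a single row, which is a rank 1 array with only 2 elements.

```python
import numpy as np

W1 = np.arange(8.0).reshape(4, 2)   # illustrative values; only the (4, 2) shape matters

row0 = W1[0]                 # indexing the first axis selects one row
print(row0.shape)            # a rank 1 array with 2 elements
print(row0.size, W1.size)    # 2 elements in the row, 8 in the whole matrix

try:
    row0.reshape(4, 1)       # 2 elements cannot fill 4 * 1 = 4 slots
except ValueError as err:
    message = str(err)
    print(message)

col0 = W1[:, 0].reshape(4, 1)   # a column has 4 elements, so this reshape works
print(col0.shape)
```

W1[1] is also a row of 2 elements, which would explain why both attempts produce the same error.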
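You make the following two statements back to back: that your printouts of Z1, A1, Z2 and A2 “seem to all have the right dimensions”, and that “I am getting an error telling me that Z1 is the wrong shape”.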
So which is it? Is your Z1 the right shape or is it not? Or maybe the better question is to ask: do you know how to figure out what the shape should be?
It seems like the right thing to do here is to take a few deep cleansing breaths and then go back and look at the formulas with a calm mind. Here is the formula for Z1:
Z^{[1]} = W^{[1]} \cdot X + b^{[1]}
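A note on the dot in that formula: it denotes a matrix product, which in numpy is np.dot (or the @ operator), not the elementwise ‘*’. A minimal sketch with random arrays, where the names W and X and the shapes (4, 2) and (2, 3) are assumptions chosen to match the test case in this thread:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 2))
X = rng.standard_normal((2, 3))

product = np.dot(W, X)       # matrix product: (4, 2) dot (2, 3) gives (4, 3)
print(product.shape)

try:
    W * X                    # elementwise: (4, 2) and (2, 3) cannot be broadcast together
    elementwise_ok = True
except ValueError as err:
    elementwise_ok = False
    print(err)
```

Transposing or reshaping until ‘*’ stops raising an error can produce an array of some shape, but it will not be computing the matrix product the formula asks for.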
Now look at the test case. What are the dimensions of the objects? Here’s what I see by examining public_tests.py, although you could also just add some print statements to your code to see the shapes:
X is 2 x 3
W1 is 4 x 2
b1 is 4 x 1
W2 is 1 x 4
b2 is 1 x 1
Ok, so what shape should Z1 be? It is the result of this dot product: 4 x 2 dotted with 2 x 3, which should give a 4 x 3 result, right? Note that adding b1, which is 4 x 1, will not change the shape (it is broadcast across the 3 columns), and applying tanh does not change the shape either, since the activations are applied elementwise. So A1 will also be 4 x 3.
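The same first layer analysis can be checked in code. This is only a sketch with random values standing in for the real test data; the shapes are the ones listed above, and the assertions encode the rule that an (n, k) matrix dotted with a (k, m) matrix gives an (n, m) result:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 3))    # random stand-ins; only the shapes matter here
W1 = rng.standard_normal((4, 2))
b1 = rng.standard_normal((4, 1))

Z1 = np.dot(W1, X) + b1   # (4, 2) dot (2, 3) gives (4, 3); b1 broadcasts across the columns
A1 = np.tanh(Z1)          # elementwise, so the shape is unchanged

assert Z1.shape == (W1.shape[0], X.shape[1])
assert A1.shape == Z1.shape
print(Z1.shape, A1.shape)
```

Adding assertions like these next to each line of your own code makes a dimension mismatch show up at the exact step where it first occurs.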
What I did there is called “dimensional analysis” and I find it the best first step in debugging any kind of dimension mismatch problem. What do you get when you apply that same method to the second layer?