W4 A1 | Ex 5 | Wrong shape and output

That’s great news that you have solved it! You had me worried there for a while :sweat_smile:. When you get stuck in a rabbit hole like that, one strategy is just to think about something completely different for a while to clear your mind. Go for a walk around the block or get something to eat. Then come back and look at the problem again with “fresh eyes”. It’s really easy to get stuck by just looking at the problem the wrong way, so you need a way to break that cycle. Just a thought :nerd_face:

12 Likes

Thank you @parrotox and @paulinpaloalto! This thread helped me correct my mistake :star_struck:
I was passing l and A_prev to the last layer :frowning:. That last advice is great! Sometimes the problem is less complicated than we think.

5 Likes

This poses a question: since capital L is len(parameters) // 2, which returns an integer, that works great for an odd number of layers. But will that always be the case? Will we ever get an even number of layers? If we did, the loop would do everything, and when it drops out, using capital L or l+1 for the final sigmoid function would give an error. Should that really be
if len(parameters) % 2 > 0:
    L = len(parameters) // 2
else:
    L = len(parameters) // 2 - 1
I got it to work thanks to this thread, but I was struggling with how we arrive at the right index for the final sigmoid function.
Other than that, I am loving these assignments in that you have to figure out what is going on, and I am getting much better at using the print statement!

1 Like

Hi @yodester, parameters is a Python dictionary containing the W’s and the b’s for each layer. So if there are two layers (for example), there are 4 dictionary entries (or “key-value pairs”): W1, b1, W2, b2. So len(parameters) = 4 in this case. Dividing by 2 always gives the proper number of layers, since there is a W and a b for each layer.
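For example, here is a minimal sketch of such a dictionary for a hypothetical 2-layer network (the layer sizes are made up purely for illustration):

import numpy as np

# Hypothetical sizes: 3 inputs -> 4 hidden units -> 1 output unit
parameters = {
    "W1": np.random.randn(4, 3), "b1": np.zeros((4, 1)),
    "W2": np.random.randn(1, 4), "b2": np.zeros((1, 1)),
}
L = len(parameters) // 2   # 4 dictionary entries // 2 = 2 layers
print(L)                   # prints 2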

3 Likes

My question revolves around even numbers of layers. As you say, if there are 2 layers, L = 2.
The loop runs from 1 to L, doing layers 1 & 2, then it drops out and does the last layer's sigmoid function. For an odd number of layers, there is one layer left and it would get the sigmoid function. If there were an even number, it would either never get the sigmoid or it would redo the last layer.

1 Like

There is no difference in the behavior for even or odd numbers of layers. Remember that indexing in Python is “0-based”. Try this and watch what happens:

for ii in range(1,5):
    print("ii = " + str(ii))
for jj in range(1,4):
    print("jj = " + str(jj))
print("After loop jj = " + str(jj))

Python is an interactive language. You don’t have to wonder how something works: you can just try it and see.
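For reference, that snippet prints the following; the stop value of range is excluded, and the loop variable keeps its final value after the loop ends:

ii = 1
ii = 2
ii = 3
ii = 4
jj = 1
jj = 2
jj = 3
After loop jj = 3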

3 Likes

Thank you for the reply. As I am a slow learner and a beginner to Python, I got stuck at this point; this was really helpful.

3 Likes

Thank you parrotox and paulinpaloalto. This discussion thread certainly enhanced my understanding, and I was able to sort out my problem easily. Thanks.

3 Likes

I had the same problem with the layer values and solved it with an if/else, but the code still doesn’t pass the test.

[screenshot of print output omitted]

1 Like

I added similar print statements to my code and here’s what I get for that test cell:

l = 1
A1.shape = (4, 4)
l = 2
A2.shape = (3, 4)
l = 3
A3.shape = (1, 4)
AL = [[0.03921668 0.70498921 0.19734387 0.04728177]]
l = 1
A1.shape = (4, 4)
l = 2
A2.shape = (3, 4)
l = 3
A3.shape = (1, 4)
l = 1
A1.shape = (4, 4)
l = 2
A2.shape = (3, 4)
l = 3
A3.shape = (1, 4)
l = 1
A1.shape = (4, 4)
l = 2
A2.shape = (3, 4)
l = 3
A3.shape = (1, 4)
All tests passed.

1 Like

Actually, your values for the shapes look correct, other than the fact that you are not printing the shape for A3 (but it’s clearly correct from your output).

But if you actually look at the test case, what they are comparing against is the “cache” output values, not the direct A values. So that means the problem is in your handling of the caches.

I added these print statements to the end of my L_model_forward function:

    print(f"type(caches) = {type(caches)}")
    print(f"len(caches) = {len(caches)}")
    print(f"type(caches[-1]) = {type(caches[-1])}")
    print(f"len(caches[-1]) = {len(caches[-1])}")

And here’s what I get running that same test cell:

type(caches) = <class 'list'>
len(caches) = 3
type(caches[-1]) = <class 'tuple'>
len(caches[-1]) = 2

What do you see if you try that? E.g. maybe you forgot to append the cache entry for the output layer.
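To make that concrete, here is a toy sketch (with placeholder values rather than the real cache contents) of the structure those prints are checking for: one cache appended per layer, each cache being a 2-tuple:

# Toy illustration only: the real caches hold the layer's actual intermediate values
caches = []
for l in range(1, 4):                      # pretend there are 3 layers
    linear_cache = ("A_prev", "W", "b")    # placeholder for the linear cache
    activation_cache = "Z"                 # placeholder for the activation cache
    caches.append((linear_cache, activation_cache))

print(f"type(caches) = {type(caches)}")          # <class 'list'>
print(f"len(caches) = {len(caches)}")            # 3
print(f"type(caches[-1]) = {type(caches[-1])}")  # <class 'tuple'>
print(f"len(caches[-1]) = {len(caches[-1])}")    # 2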

2 Likes

Thank you, I printed the cache variables and they were different. I went back a few steps and now it works.

1 Like

It’s great news that you found the solution! Thanks for confirming.

1 Like

Thank you @paulinpaloalto @parrotox, I had the same issue and this post finally helped me solve it after a couple of days.

1 Like

The value of l in parameters['W' + str(l)] is not updated after the loop, so it would index the wrong layer. Update it manually or use parameters['W' + str(L)] for the output layer instead.
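A quick way to see that in isolation (with L = 3, as in the test case):

L = 3
for l in range(1, L):
    print("inside the loop, l =", l)           # prints 1 and 2
print("after the loop, l =", l)                # still 2, so 'W' + str(l) gives 'W2'
print("the output layer needs", "W" + str(L))  # 'W3'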

1 Like

Just to save others’ time, try this link: Python Tryit Editor v1.0

You will understand how the for i in range(1, L) loop works; it took me more than an hour just because of this number :slight_smile:

1 Like

ValueError: shapes (1,3) and (4,4) not aligned: 3 (dim 1) != 4 (dim 0)

I get this error in Exercise 5 - L_model_forward.
I don’t know what I’m doing wrong.

1 Like

Your best debugging strategy here is to have a careful look at the “dimensional analysis” that I gave earlier on this thread.

Looking at the dimensions in that error message, note that 1 x 3 is the shape of W3, but 4 x 4 is the shape of A1. So how could those two end up in a dot product? Note that A2 should be 3 x 4, which would have worked.
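Here is a minimal numpy illustration of that alignment rule, using zero arrays with the shapes from the error message:

import numpy as np

W3 = np.zeros((1, 3))   # shape of W3 from the error message
A1 = np.zeros((4, 4))   # shape of A1
A2 = np.zeros((3, 4))   # the shape A2 should have

print(np.dot(W3, A2).shape)   # (1, 4): the inner dimensions 3 and 3 line up
np.dot(W3, A1)                # ValueError: shapes (1,3) and (4,4) not aligned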

1 Like

I’m sorry, I do not know what else to do. Can you point me to something?

1 Like

Did you read the post that I linked? Once you understand what that is telling you, now you know what should be happening at each layer as you go through the forward propagation process. Start by adding print statements to show the layer number and the shapes of the W^{[l]} and A^{[l-1]} values each time through the loop and then for the output layer. Where do things go off the rails? My guess is that the logic in the loop is correct and then it fails for the output layer. What is different in that case? What is the value of A_prev when you fall out of the loop over the hidden layers?
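For example, a line like the following inside the loop (and an analogous one using str(L) just before the output-layer call) would show exactly where the shapes stop matching. It assumes the usual template variable names (parameters, A_prev, and the loop index l):

print(f"layer {l}: W.shape = {parameters['W' + str(l)].shape}, A_prev.shape = {A_prev.shape}")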

1 Like