Week 4, Exercise 5 - L_model_forward()

Hi, I am getting the ValueError: shapes are not aligned, even though my code had already passed all the tests for linear_forward. Here’s a screenshot of the error:


Any help will be appreciated :slight_smile:

3 Likes

Hi there, are you sure that the first input to the linear function is A_prev?
As far as I remember, the first one was “W”, the second one was A_prev, and finally b.

1 Like

Yeah… the definition of the linear function is linear_forward(A, W, b)…

1 Like

The problem is solved now. During sigmoid activation, I had to send ‘A’ as an argument, not ‘A_prev’.
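
For reference, the corrected sigmoid step ends up looking roughly like this (a sketch only, using the assignment’s names, where A is the output of the last loop iteration):

AL, cache = linear_activation_forward(A, parameters['W' + str(L)], parameters['b' + str(L)], "sigmoid")
caches.append(cache)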

2 Likes

Which function do you pass it to? I’m confused because for linear_activation_forward (which includes linear_forward and the activation function), don’t we still pass A_prev?

1 Like

Just to save others’ time: try this link and you will understand how the for l in range(1, L) loop works.
It took me more than an hour just because of this number :slight_smile: Python Tryit Editor v1.0 (https://www.w3schools.com/python/trypython.asp?filename=demo_for_range2)

2 Likes

Hi,

I’m still quite confused about this part of the assignment. I would appreciate it if someone could talk me through it. I’m not sure why parameters[“W” + str(L)] was used to look up the value of W, etc.

Thank you.

1 Like

parameters is a dictionary whose keys are “W1”, “W2”, … (and “b1”, “b2”, …). Hence, parameters[“W” + str(L)] is used to look up the weights of layer L.
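
As a small illustration (a made-up 2-layer example, not the assignment code), the keys are just strings built by concatenation:

import numpy as np

# Hypothetical parameters dictionary for a 2-layer network (shapes invented for this example)
parameters = {
    "W1": np.random.randn(4, 3), "b1": np.zeros((4, 1)),
    "W2": np.random.randn(1, 4), "b2": np.zeros((1, 1)),
}

L = len(parameters) // 2                 # 4 entries, one W and one b per layer, so L = 2
print("W" + str(L))                      # "W2" -- concatenation builds the dictionary key
print(parameters["W" + str(L)].shape)    # (1, 4), the weights of the last layer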

1 Like

Hi,

Thank you for replying. I was wondering why ’ + str(L)’ was used.

1 Like

I am really confused about why we use a for-loop that goes from 1 to L.

We are building a model that computes LINEAR->RELU for layers 1 to L-1 and then calculates LINEAR->SIGMOID outside the loop.

# L-1 iterations of relu
for l in range(1, L-1):
    A, cache = linear_activation_forward(A_prev, parameters['W' + str(l)], parameters['b' + str(l)], "relu")
    caches.append(cache)

# last layer
AL, cache = linear_activation_forward(A, parameters['W' + str(L-1)], parameters['b' + str(L-1)], "sigmoid")
caches.append(cache)

I tested it, and when I use a for-loop that iterates from 1 to L, the result is what the test expects, but I wonder if it is correct.
If we make a for-loop from 1 to L, aren’t we applying LINEAR->RELU->LINEAR->SIGMOID to the last layer? In fact, wouldn’t we end up with caches of length L+1?

2 Likes

I wonder if some of the confusion described in the posts above, about looping over L or L-1 or L+1 layers and whether a sigmoid or ReLU activation should be applied at a given layer, is due to the slightly non-intuitive nature of the Python range command, which is used to define the index values for the loop:

for l in range(1, L):

Originally I read this to mean that the range command would produce the series of values (1, 2, 3). However, this didn’t match my understanding that ReLU is applied to only the first two layers of the network and not to the third layer, which uses a sigmoid activation function.

I used the link posted by @Maitha_Shehab_Khanji above to test out the range command in real time, which really helped me to identify the issues I was having.
https://www.w3schools.com/python/trypython.asp?filename=demo_for_range2

Back to our example where we have three layers (L = 3), i.e. two ReLU layers and then a single sigmoid layer. Using the link above to test out the code, we get:

range(3) = 0,1,2
range(1,3) = 1,2

Reviewing the syntax for the range command shows:
https://docs.python.org/3/library/stdtypes.html#range

Syntax: range(start, stop[, step])

The key thing we find out, if we read a bit further down the help page for the range command, is that range never produces the stop value. The last integer it produces has the value (stop - 1), i.e. 2. And since a start value has been specified, the range command outputs the values 1, 2. All good. We can now see that the for loop will happily loop over just the first two layers of the network, as we would expect.
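
To make that concrete, here is a tiny illustration for L = 3 (just a sketch, not assignment code) of which layer indices the loop visits and which layer is left for the sigmoid step:

L = 3                      # e.g. two ReLU layers plus one sigmoid output layer
for l in range(1, L):      # visits l = 1 and l = 2 only
    print("layer", l, "-> ReLU")
print("layer", L, "-> sigmoid (handled outside the loop)")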

Hopefully this helps someone else.

2 Likes

I was having the same thought, but the for loop from 1 to L runs from 1 to L-1, not to L.
Concretely, if you run this code on its own:

for l in range(1, 5):
    print(l)

the output is:

1
2
3
4

There is no 5.

1 Like

A is also the output variable from linear_activation_forward, and each iteration of the loop begins by updating A_prev to it. This means that after the last iteration of the loop, A_prev is not updated again, so you should use the output A directly when calculating AL in the sigmoid section of the code…
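
In other words, the body of the loop looks roughly like this (a sketch using the assignment’s variable names, not the full graded code):

for l in range(1, L):
    A_prev = A                       # the previous layer's output becomes this layer's input
    A, cache = linear_activation_forward(A_prev,
                                         parameters['W' + str(l)],
                                         parameters['b' + str(l)],
                                         activation="relu")
    caches.append(cache)
# After the loop, A holds the activations of layer L-1 and is fed directly into the sigmoid step.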

1 Like

I’m still stuck on this assignment. I keep getting a ValueError, something to do with the shapes.

1 Like

Hello Aminu Musa,

Welcome to the community.

For a matrix product, the inner dimensions always need to agree: a (3,4) matrix cannot be multiplied by another (3,4) matrix, because 4 ≠ 3. Please check which arrays you are passing so that the shapes line up. Thanks.
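
To see what the error message is complaining about, here is a toy example (shapes made up purely for illustration):

import numpy as np

W = np.random.randn(3, 4)        # a layer with 4 inputs and 3 units
A_prev = np.random.randn(4, 5)   # 4 features from the previous layer, 5 examples
b = np.zeros((3, 1))

Z = np.dot(W, A_prev) + b        # inner dimensions (4 and 4) agree, so Z has shape (3, 5)
print(Z.shape)

# np.dot(W, np.random.randn(3, 4)) would instead raise
# "ValueError: shapes (3,4) and (3,4) not aligned: 4 (dim 1) != 3 (dim 0)"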

1 Like

I got the correct answer to this problem after struggling through and with the help of this thread. However, I think my understanding is still a bit shaky. Hoping someone can clarify for me.

I understand the relu portion. However, my question is on the sigmoid portion of the code.

The sigmoid is applied to the last layer of the neural network, right? So that would be layer L. Shouldn’t we still be passing in A_prev? Or do we use A, since it was the most recent activation created by the loop and it resides in memory? And does the parameters argument passed into the function contain W and b for layers 1 to L (all layers of the network)? Thanks in advance.

1 Like

Hi @Matt_Samelson ,

The purpose of training a network is to find a set of weights and biases where the cost is at its minimum. You can take a look at the code of the L_model_forward() function to see how the network traverses the different layers, and how the weights and biases are used.

From menu bar at the top of the notebook, click:
file->open->dnn_app_utils_v3.py

2 Likes

Assignment 4 Exercise 5 Walkthrough

Overview

When the function is called:

Caches: Stores a list of caches from linear_activation_forward().

A: Holds the input data X.

L: Represents the number of layers in the neural network (not counting the input layer). The length of parameters is divided by two because it contains one weight matrix and one bias vector per layer.

For Loop Explanation

The for loop runs from 1 to L-1 (that is, for l in range(1, L)).

The output layer is not included because it uses a different activation function (sigmoid) compared to the ReLU activation function used in other layers.

First Iteration

A_prev is assigned the value of A, which holds the input data X.

linear_activation_forward is called with A_prev and other parameters to compute the linear and ReLU activation for the first layer.

The returned A and cache are stored in their respective variables.

The cache is appended to the list of caches.

Subsequent Iterations

A from the previous layer is assigned to A_prev.

linear_activation_forward is called with A_prev and other parameters to compute the layer’s activations.

The returned A and cache are stored in their respective variables.

These steps repeat until the last-but-one layer is executed.

Output Layer

The for loop does not handle the output layer since it requires a sigmoid activation function.

For the output layer, the activations from the last hidden layer (stored in A) are used directly.

linear_activation_forward is called with A instead of A_prev.
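
Putting the walkthrough together, the skeleton of the function ends up looking roughly like this (a sketch that assumes the assignment’s linear_activation_forward helper; double-check it against your own notebook rather than treating it as the official solution):

def L_model_forward(X, parameters):
    caches = []
    A = X
    L = len(parameters) // 2              # one W and one b per layer

    # [LINEAR -> RELU] for layers 1 .. L-1
    for l in range(1, L):
        A_prev = A
        A, cache = linear_activation_forward(A_prev,
                                             parameters['W' + str(l)],
                                             parameters['b' + str(l)],
                                             activation="relu")
        caches.append(cache)

    # LINEAR -> SIGMOID for the output layer L, using A (not A_prev)
    AL, cache = linear_activation_forward(A,
                                          parameters['W' + str(L)],
                                          parameters['b' + str(L)],
                                          activation="sigmoid")
    caches.append(cache)

    return AL, caches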

1 Like

We have to define it before we call it later, but I ran into an issue with this assignment as well.

For anyone who encounters the same problem: make sure that you write the LINEAR -> SIGMOID step outside the for loop and apply it only to the last layer, with A, not A_prev.

1 Like