dZ for sigmoid in linear_activation_backward

Oleksandra_Sopova · October 27, 2022, 10:22am

I managed to implement the dZ for ReLU correctly.

However, I am struggling with getting dZ for sigmoid right.

To compute dZ, I am using the formula:

dZ = dA * g'(Z)

and

g'(Z) for sigmoid = Z*(1-Z)

So I am doing:

dZ = dA*activation_cache*(1-activation_cache)
and plug the result into linear_backward().

However, I am not getting the correct answer, so I must be missing something. What am I missing?

Also, I am quite confused what exactly is stored in linear_cache, activation_cache.

Is this correct:
linear_cache = Aprev, W, b
activation_cache = Z

Mubsi · October 27, 2022, 1:11pm

Hi @Oleksandra_Sopova,

It would help the mentors, and save time, if you also mentioned the week number and assignment name in your post.

Mubsi · October 27, 2022, 1:16pm

As for your query: you don’t have to rewrite these computations yourself. The functions have been provided. If you read the description, it tells you almost exactly how to implement it:

Mubsi · October 27, 2022, 1:19pm

You can read the doc strings of the functions which mention what every parameter is.

For example, this is the one for Ex 8. The docstring is in the red rectangle:

Hope this all helps,
Mubsi

Oleksandra_Sopova · October 27, 2022, 7:32pm

Yes, I read that. I just did not know what “activation_cache” consists of. To figure that out, I had to scroll back and forth and make assumptions based on other functions. We are supposed to use already implemented functions, like ‘sigmoid’ and ‘relu’, which return a cache, but it is not very clear what that cache holds, since we did not implement them ourselves:

> ```python
> def linear_activation_forward(A_prev, W, b, activation):
>
>     if activation == "sigmoid":
>         ...
>         A, activation_cache = sigmoid(Z)
>
>     elif activation == "relu":
>         ...
>         A, activation_cache = relu(Z)
> ```
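
For illustration, here is a minimal sketch of how the two caches could be assembled. It assumes the provided `sigmoid` and `relu` helpers cache `Z`, which is the guess made earlier in this thread and not something confirmed here. The helper bodies below are hypothetical stand-ins, not the course's actual file.

```python
import numpy as np

def sigmoid(Z):
    # Hypothetical stand-in for the provided helper: returns A and caches Z.
    A = 1.0 / (1.0 + np.exp(-Z))
    return A, Z

def relu(Z):
    # Hypothetical stand-in for the provided helper: returns A and caches Z.
    A = np.maximum(0, Z)
    return A, Z

def linear_activation_forward(A_prev, W, b, activation):
    # Linear step; its inputs are what the backward linear step needs later.
    Z = W @ A_prev + b
    linear_cache = (A_prev, W, b)

    if activation == "sigmoid":
        A, activation_cache = sigmoid(Z)
    elif activation == "relu":
        A, activation_cache = relu(Z)
    else:
        raise ValueError(f"unknown activation: {activation}")

    # Under the assumption above, activation_cache is Z, not A.
    return A, (linear_cache, activation_cache)
```

If that assumption holds, `activation_cache` contains the pre-activation `Z`, so the sigmoid output has to be recomputed from it before it can be used in a derivative.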

paulinpaloalto · October 27, 2022, 7:49pm

Sorry, but your formula for g'(Z) is wrong. If we have:

A = sigmoid(Z)

Then

g'(Z) = A * (1 - A)

But you don’t really need that in order to compute dZ^{[L]}. They gave you the shortcut that comes from combining that with the formula for dA^{[L]}:

dZ^{[L]} = A^{[L]} - Y
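
A quick numeric check of the above, as a sketch. It assumes the cost is binary cross-entropy, so that dA^{[L]} = -(Y/A - (1-Y)/(1-A)), and that the cached value is Z. The function name `sigmoid_backward_sketch` is made up for this example.

```python
import numpy as np

def sigmoid_backward_sketch(dA, Z):
    # Recompute A from the cached Z, then apply g'(Z) = A * (1 - A).
    A = 1.0 / (1.0 + np.exp(-Z))
    return dA * A * (1.0 - A)

Z = np.array([[-1.5, 0.3, 2.0]])
Y = np.array([[0.0, 1.0, 1.0]])
A = 1.0 / (1.0 + np.exp(-Z))

# Derivative of the binary cross-entropy cost w.r.t. A (assumed cost).
dAL = -(Y / A - (1.0 - Y) / (1.0 - A))

dZ_chain = sigmoid_backward_sketch(dAL, Z)  # chain rule
dZ_shortcut = A - Y                         # shortcut
dZ_wrong = dAL * Z * (1.0 - Z)              # Z * (1 - Z) used by mistake

print(np.allclose(dZ_chain, dZ_shortcut))
print(np.allclose(dZ_wrong, dZ_shortcut))
```

The chain-rule result agrees with A - Y, while plugging Z directly into Z * (1 - Z) does not.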

Here’s another recent thread that may be worth a look.

paulinpaloalto · October 27, 2022, 7:54pm

You can examine those provided functions by clicking “File → Open” and then opening the appropriate “dot py” file. You can deduce the name by examining the “import” block, which is the first code block in the notebook. If this is news to you, perhaps it would be worth taking a look at the DLS FAQ Thread, since this is one of the topics covered there.
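
As an alternative to File → Open, the source of an imported function can also be printed from a notebook cell with the standard-library `inspect` module. This is only a sketch: `show_source` is a made-up helper name, and `json.dumps` stands in for the course helper, whose module is listed in the notebook's import block.

```python
import inspect
import json

def show_source(func):
    # Return the source text of a pure-Python function as a string.
    return inspect.getsource(func)

# json.dumps is a stand-in; pass the imported helper function instead.
source_text = show_source(json.dumps)
print(source_text)
```

This only works for functions written in Python whose source file is available, which should be the case for a “dot py” file sitting next to the notebook.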
