dZ for sigmoid in linear_activation_backward

I managed to implement the dZ for ReLU correctly.

However, I am struggling with getting dZ for sigmoid right.

To compute dZ, I am using the formula:

dZ = dA * g'(Z)

and

g'(Z) for sigmoid = Z*(1-Z)

So I am doing:

dZ = dA*activation_cache*(1-activation_cache)
and plug the result into linear_backward().

However, I am not getting the correct answer, so I must be missing something. What am I missing?

Also, I am quite confused what exactly is stored in linear_cache, activation_cache.

Is this correct:
linear_cache = Aprev, W, b
activation_cache = Z

Hi @Oleksandra_Sopova,

It would help the mentors, and save time, if you also mentioned the week number and assignment name in your post.

As for your query: you don’t have to rewrite the values. The functions have been provided. If you read the description, it tells you almost exactly how to implement it.

You can read the doc strings of the functions which mention what every parameter is.

For example, this is the one for Ex 8. The docstring is in the red rectangle:

Hope this all helps,
Mubsi

Yes, I read that. I just did not know what “activation_cache” consists of. To figure that out, I had to scroll back and forth and make assumptions based on other functions. Since we are supposed to use already-implemented functions like ‘sigmoid’ and ‘relu’, which return a cache, it is not very clear what that cache holds, because we did not implement them ourselves:

> ```python
> def linear_activation_forward(A_prev, W, b, activation):
>
>     if activation == "sigmoid":
>         ...
>         A, activation_cache = sigmoid(Z)
>
>     elif activation == "relu":
>         ...
>         A, activation_cache = relu(Z)
> ```

Sorry, but your formula for g'(Z) is wrong. If we have:

A = sigmoid(Z)

Then

g'(Z) = A * (1 - A)

But you don’t really need that in order to compute dZ^{[L]}. They gave you the shortcut that comes from combining that derivative with the formula for dA^{[L]}:

dZ^{[L]} = A^{[L]} - Y
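
For completeness, here is a short check of where that shortcut comes from. It assumes the cost is binary cross-entropy, so that dA^{[L]} = -(Y / A^{[L]} - (1 - Y) / (1 - A^{[L]})) elementwise. That is an assumption about the assignment, so confirm it against the notebook:

```latex
\begin{aligned}
dZ^{[L]} &= dA^{[L]} * g'(Z^{[L]}) \\
         &= -\left(\frac{Y}{A^{[L]}} - \frac{1 - Y}{1 - A^{[L]}}\right) * A^{[L]}\,(1 - A^{[L]}) \\
         &= -\left(Y\,(1 - A^{[L]}) - (1 - Y)\,A^{[L]}\right) \\
         &= A^{[L]} - Y
\end{aligned}
```

Note that this only applies to the output layer L; for a sigmoid anywhere else you would still need dA * A * (1 - A).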

Here’s another recent thread that may be worth a look.

You can examine those provided functions by clicking “File → Open” and then opening the appropriate “dot py” file. You can deduce the name by examining the “import” block, which is the first code block in the notebook. If this is news to you, perhaps it would be worth taking a look at the DLS FAQ Thread, since this is one of the topics covered there.
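
To tie this back to the original question, here is a minimal NumPy sketch of the difference between the two formulas. It assumes the provided `sigmoid` returns `Z` itself as the cache, which is worth confirming in the .py file. The `sigmoid` below is a stand-in written for illustration, and `sigmoid_dZ` is a hypothetical name, not one of the assignment's functions:

```python
import numpy as np

def sigmoid(Z):
    """Stand-in for the provided helper (assumption: it caches Z)."""
    A = 1.0 / (1.0 + np.exp(-Z))
    cache = Z  # assumption: the activation cache is just Z
    return A, cache

def sigmoid_dZ(dA, activation_cache):
    """Hypothetical helper: dZ = dA * g'(Z), where g'(Z) = A * (1 - A)."""
    Z = activation_cache
    A, _ = sigmoid(Z)  # recompute A from the cached Z
    return dA * A * (1.0 - A)

Z = np.array([[-1.0, 0.0, 2.0]])
dA = np.array([[0.5, -1.0, 0.25]])

dZ_correct = sigmoid_dZ(dA, Z)
# The formula from the question: uses Z where it should use A = sigmoid(Z).
dZ_wrong = dA * Z * (1.0 - Z)

# Check the output-layer shortcut dZ = A - Y, assuming a cross-entropy cost.
Y = np.array([[0.0, 1.0, 1.0]])
A, _ = sigmoid(Z)
dAL = -(np.divide(Y, A) - np.divide(1 - Y, 1 - A))
shortcut_matches = np.allclose(sigmoid_dZ(dAL, Z), A - Y)
```

The key point is that the cache holds Z, not A, so plugging `activation_cache` straight into `x * (1 - x)` evaluates the derivative at the wrong quantity.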