Doubt in a question from Key Concepts on Deep Neural Networks

I was giving this quiz as part of Week 4 of Course 1 of Deep Learning Specialization and I came across this True/False question stating the following:

> If L is the number of layers in a neural network, then dZ^{[L]} = A^{[L]} - Y.

I marked this as False because it implicitly assumes that the activation function of the last layer is sigmoid. Since no other information is given about the neural network, I don't think we can make that assumption.

Please correct me if I’m wrong.

Hi @arcchitjain

The statement is True in the context of this course. It relies on the standard pairing of a sigmoid output layer with binary cross-entropy loss (or a softmax output layer with categorical cross-entropy loss). In both cases, the gradient of the cost with respect to the final pre-activation simplifies to dZ^{[L]} = A^{[L]} - Y, where A^{[L]} is the output of the final layer and Y is the true label.
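To see this concretely, here is a minimal NumPy sketch (my own illustration, not from the course materials) that checks the claim for the sigmoid + binary cross-entropy case: it compares the analytic expression A - Y against a finite-difference estimate of dL/dZ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(a, y):
    # Binary cross-entropy loss, element-wise
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

rng = np.random.default_rng(0)
z = rng.normal(size=5)                         # pre-activations Z^[L]
y = rng.integers(0, 2, size=5).astype(float)   # true labels Y

a = sigmoid(z)                                 # activations A^[L]
analytic = a - y                               # the claimed dZ^[L] = A^[L] - Y

# Finite-difference check of dL/dZ
eps = 1e-6
numeric = (bce(sigmoid(z + eps), y) - bce(sigmoid(z - eps), y)) / (2 * eps)

# The two gradients agree up to numerical precision
print(np.max(np.abs(analytic - numeric)))
```

The printed maximum discrepancy is on the order of floating-point noise, which is why the quiz treats dZ^{[L]} = A^{[L]} - Y as a given for this loss/activation pairing.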

Hope it helps! Feel free to ask if you need further assistance.

Thanks for clarifying this


You’re welcome! Happy to help :raised_hands: