The point is that sigmoid_backward and relu_backward are calculating the formula that Raymond shows: the general chain-rule step dZ = dA * g'(Z). Remember that these functions are intended to be general: it is perfectly possible to use sigmoid in the hidden layers as well, although in these assignments we just happen not to.
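To make that concrete, here is a minimal sketch of what such backward helpers typically compute. The cache layout (Z saved during forward propagation) is an assumption on my part; the course's actual utility code may differ in detail:

```python
import numpy as np

def sigmoid_backward(dA, cache):
    """Compute dZ = dA * g'(Z) for the sigmoid activation.

    cache is assumed to hold the Z saved during forward prop
    (a sketch, not necessarily the course's exact code).
    """
    Z = cache
    s = 1.0 / (1.0 + np.exp(-Z))   # sigmoid(Z)
    return dA * s * (1 - s)        # sigmoid'(Z) = s * (1 - s)

def relu_backward(dA, cache):
    """Compute dZ = dA * g'(Z) for the ReLU activation."""
    Z = cache
    dZ = np.array(dA, copy=True)   # relu'(Z) = 1 where Z > 0
    dZ[Z <= 0] = 0                 # and 0 elsewhere
    return dZ
```

Note that both functions implement the same pattern: multiply the incoming gradient dA elementwise by the derivative of the layer's activation function.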
The formula you show, dZ = AL - Y, is a special case: it applies only at the output layer, and it arises because the derivative of sigmoid has already been folded into the derivative of the cross-entropy cost. In general, sigmoid is the activation only at the output layer here. See the derivation of that on the famous thread from Eddy.
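For reference, here is a condensed version of that derivation for a single sample, assuming the binary cross-entropy loss with sigmoid output a = sigma(z):

```latex
% Binary cross-entropy loss with sigmoid output a = \sigma(z)
\mathcal{L}(a, y) = -\,y \log a - (1 - y)\log(1 - a)

% Derivative of the loss with respect to a
\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a}

% Sigmoid derivative: \sigma'(z) = a(1 - a); apply the chain rule
\frac{\partial \mathcal{L}}{\partial z}
  = \left(-\frac{y}{a} + \frac{1 - y}{1 - a}\right) a(1 - a)
  = -\,y(1 - a) + (1 - y)\,a
  = a - y
```

Vectorized over the whole output layer and all samples, that last line is exactly dZ = AL - Y, which is why the assignment can skip the separate sigmoid_backward step there.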