Derivative of Relu in output layer

Eeeek! Sorry, I wasn’t thinking hard enough when I wrote the first response. Your no_relu_backward is not correct. Remember that what we are implementing there is:

dZ = dA * g'(Z)

Meaning that we’re not just returning the derivative of the activation function: we’re returning dA multiplied by that derivative. So the fact that the derivative is 1 does not mean that the return value is 1, right? It means dZ is equal to dA.
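To make that concrete, here’s a minimal sketch of what I mean, assuming your no_relu_backward mirrors the (dA, cache) signature of the relu_backward function in the original code:

```python
import numpy as np

def no_relu_backward(dA, cache):
    """Backward pass for the identity (no-op) activation.

    Implements dZ = dA * g'(Z). Since g(Z) = Z, g'(Z) = 1 everywhere,
    so dZ is simply dA -- not an array of ones.
    """
    Z = cache  # Z isn't needed for the identity case, but keeps the interface consistent
    dZ = np.array(dA, copy=True)  # copy so we never hand back a reference to the caller's dA
    assert dZ.shape == Z.shape
    return dZ
```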

Actually while you’re at it, I’d feel more comfortable if you did the assignment of

A = Z

in no_relu with a method that produces a separate copy. The way you implemented it, A ends up being another reference to the same global object. I can’t think of a case in which the return value A is going to get modified, so it’s probably no harm done. But it does introduce some risk of unpleasant surprises later. There’s a reason they did that np.array(..., copy = True) in the original code you are copying. Please see this post for more information about how object references work.
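Something like this is what I have in mind, assuming no_relu returns (A, cache) the same way the original relu function does:

```python
import numpy as np

def no_relu(Z):
    """Identity 'activation': A = g(Z) = Z.

    np.array(..., copy=True) gives A its own storage, so it is a separate
    array rather than another reference to the same object as Z.
    """
    A = np.array(Z, copy=True)
    cache = Z  # cached for the backward pass, matching the other activations
    assert A.shape == Z.shape
    return A, cache
```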
