Derivative of Relu in output layer

paulinpaloalto · November 24, 2022, 5:33pm

Yes, that was what I was recommending. You can try LeakyReLU as well, but I think it’s worth trying just omitting the output activation function altogether.

If you implement LeakyReLU, here’s one way to code its derivative:

import numpy as np

def leakyreluprime(Z, slope=0.05):
    # Elementwise derivative of LeakyReLU: 1 where Z > 0, slope elsewhere
    G = np.where(Z > 0, 1, slope)
    return G

Of course that is implemented as a separate function call. If you build it “in situ” by analogy to the way relu_backward works, you’re doing two things at once:

dZ = dA * g'(Z)

But the same idea can be adapted …
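To make the "two things at once" point concrete, here is a minimal sketch of a combined backward step for LeakyReLU. The function name, its `(dA, Z, slope)` signature and the sample values are assumptions for illustration, not the course's actual helper:

```python
import numpy as np

def leaky_relu_backward(dA, Z, slope=0.05):
    """Hypothetical helper: returns dZ = dA * g'(Z) for LeakyReLU."""
    # g'(Z) is 1 where Z > 0 and `slope` everywhere else
    g_prime = np.where(Z > 0, 1.0, slope)
    # The elementwise product allocates a new array, so dA is not modified
    return dA * g_prime

# Illustrative values only
Z = np.array([[-2.0, 0.5], [3.0, -1.0]])
dA = np.array([[1.0, 2.0], [3.0, 4.0]])
dZ = leaky_relu_backward(dA, Z)
```

Because this version multiplies rather than overwriting entries in place, it never touches the array that was passed in as dA.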

One point to emphasize here is that if you just duplicate the code in relu_backward to make leaky_relu_backward, be sure to understand the importance of the way they implemented this line:

dZ = np.array(dA, copy=True)

If you “short-circuit” that by eliminating the “copy” there:

dZ = dA

that is a disaster, because you’re about to overwrite some of the values in dZ. Python passes a reference to the array object into the function, and a plain assignment just binds a second name to that same object, so without the copy dZ and dA are one and the same array. Overwriting entries of dZ then also modifies the caller’s dA. See this post and this later reply on that thread.
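The aliasing problem is easy to reproduce in a few lines. The sketch below uses two hypothetical ReLU-style backward functions (names and signatures are made up for the demonstration) and assumes the copy is made with np.array(dA, copy=True):

```python
import numpy as np

def backward_no_copy(dA, Z):
    # Hypothetical "short-circuited" version: dZ is just another name for dA
    dZ = dA
    dZ[Z <= 0] = 0  # in-place write, so the caller's dA is changed too
    return dZ

def backward_with_copy(dA, Z):
    # Hypothetical safe version: dZ is an independent copy of dA
    dZ = np.array(dA, copy=True)
    dZ[Z <= 0] = 0  # only the copy is changed
    return dZ

Z = np.array([-1.0, 2.0, -3.0])

dA_safe = np.array([10.0, 20.0, 30.0])
dZ_safe = backward_with_copy(dA_safe, Z)

dA_aliased = np.array([10.0, 20.0, 30.0])
dZ_aliased = backward_no_copy(dA_aliased, Z)

print("with copy, dA afterwards:   ", dA_safe)
print("without copy, dA afterwards:", dA_aliased)
print("same object without copy?   ", dZ_aliased is dA_aliased)
```

Both functions return the same dZ, which is why the bug is easy to miss: the damage shows up only later, when something else reads the corrupted dA.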
