Activation function: ReLU

Sorry, I don’t understand your question.

Maybe it would help if you posted a screen capture image from one of the lectures or labs.

Hi!
`l1[z1 < 0] = 0` applies the ReLU derivative, which is a step function, as part of the gradient multiplication.
What I mean is that during the calculation of gradients with the chain rule, we arrive at a point where we have to compute:

\begin{align}
\text{Gradient} \times \text{Deriv(activation)} &= W_2^T(\hat y - y) * \mathrm{ReLU}'(Z_1) \\
&= \begin{cases}
W_2^T(\hat y - y) \cdot 1, & \text{if } Z_1 > 0 \\
W_2^T(\hat y - y) \cdot 0, & \text{if } Z_1 < 0
\end{cases}
\end{align}

In Python we can implement the above condition either with a step function (e.g. `np.heaviside(z1, 0)` in NumPy) or, more directly, by zeroing the masked entries: `l1[z1 < 0] = 0`.
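Here is a minimal NumPy sketch of that backward step. It assumes a two-layer network with hidden pre-activation `z1`, hidden activation `a1 = ReLU(z1)`, and a linear output; the variable names, shapes, and random data are illustrative, not the course's actual code.

```python
import numpy as np

# Illustrative sketch, not the assignment code: names, shapes, and data are made up.
rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 4, 3, 2, 5          # input, hidden, output sizes; batch size

W1 = rng.standard_normal((n_h, n_x))
W2 = rng.standard_normal((n_y, n_h))
X = rng.standard_normal((n_x, m))
y = rng.standard_normal((n_y, m))

# Forward pass
z1 = W1 @ X                             # hidden pre-activation, shape (n_h, m)
a1 = np.maximum(0, z1)                  # ReLU
y_hat = W2 @ a1                         # linear output, shape (n_y, m)

# Backward pass: gradient flowing back into the hidden layer
l1 = W2.T @ (y_hat - y)                 # W2^T (y_hat - y), shape (n_h, m)

# Apply the ReLU derivative: zero the gradient wherever z1 < 0.
# This single line is the step function described above.
l1[z1 < 0] = 0

# Equivalent one-liner using an explicit 0/1 mask:
# l1 = (W2.T @ (y_hat - y)) * (z1 > 0)
```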


Also note that we are taking the gradient of ReLU here, not applying ReLU itself. From your question, it sounds like you may have thought we had to apply ReLU again at this step.
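For concreteness, here is a small sketch (my own illustration, not from the course materials) contrasting ReLU with its derivative:

```python
import numpy as np

def relu(z):
    """Forward activation: max(0, z)."""
    return np.maximum(0, z)

def relu_grad(z):
    """Derivative of ReLU: 1 where z > 0, else 0 (a step function)."""
    return (z > 0).astype(z.dtype)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(z))        # [0.   0.   0.   1.5  3. ]  -- values pass through where z > 0
print(relu_grad(z))   # [0.   0.   0.   1.   1. ]  -- gradient mask used in backprop
```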

Oh, this makes sense now, thank you.