Course 2 Week 4 Assignment ex 4 Backpropagation through ReLU using hidden vector h

VINOD_BAJAJ · October 27, 2022, 10:17am

This question is about the implementation of back_prop in the CBOW model.

Why do I receive an error when I use “h” for backpropagation through the ReLU layer in the following step?

Apply relu to l1 step
l1 = None

I thought backpropagation through the ReLU in this step could be done with:
l1[h<0] = 0

Instead, I received an error from the unit test “w4_unittest.test_back_prop(back_prop)”.

arvyzukai · October 27, 2022, 11:31am

Because you are indexing l1 on the condition h<0, which is not what you are asked to do (# Apply relu to l1). One of the correct ways of implementing ReLU is with numpy.maximum, like: np.maximum(0, l1)

Cheers

P.S. You could also condition on l1<0, as in l1[l1<0] = 0.0, and this would also be correct.
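For anyone comparing the two forms mentioned in this reply, here is a minimal sketch. The values in l1 are made up for illustration and are not taken from the assignment. Note that np.maximum returns a new array, so its result has to be assigned, while boolean-mask assignment modifies the array in place.

```python
import numpy as np

# Made-up column vector standing in for l1 (not the assignment's data).
l1 = np.array([[0.5], [-1.2], [0.0], [3.0]])

# Form 1: np.maximum returns a NEW array; l1 itself is left untouched.
relu_a = np.maximum(0, l1)

# Form 2: boolean-mask assignment works in place, so copy first
# if the original values are still needed.
relu_b = l1.copy()
relu_b[relu_b < 0] = 0.0

# Both forms zero out the negative entries and keep the rest.
print(np.array_equal(relu_a, relu_b))
```

In the assignment's back_prop, either result would then be stored back into l1.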

VINOD_BAJAJ · October 27, 2022, 11:50am

Thanks for the reply.

Okay, going by the instruction, I agree that applying ReLU works the way you explained.

On the other hand, the step that we are talking about is meant to pass the gradient backward through the ReLU activation, isn’t it?
l1 represents the gradient of the loss “L” with respect to the hidden activation “h” of the first layer, i.e., dL/dh.

This step is meant to calculate the gradient of the loss with respect to z1, i.e., dL/dz1,
where h = relu(z1) and z1 = W1 x + b1.
In that case, one can look at the hidden vector “h” to decide which gradient entries to set to 0.
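To make the comparison concrete, here is a small sketch with made-up numbers (z1 and l1 below are illustrative, not the assignment's values). It assumes the standard chain rule dL/dz1 = dL/dh * 1[z1 > 0]. One detail worth noticing: since h = relu(z1) is never negative, the mask h<0 selects nothing, so l1[h<0] = 0 leaves l1 unchanged. A mask built from h would need h<=0 (or equivalently z1<=0). The result of that masking generally differs from relu(l1), which is the form the reply above says the assignment asks for.

```python
import numpy as np

# Made-up pre-activations and upstream gradient (illustrative only).
z1 = np.array([[-1.0], [2.0], [0.5], [-3.0]])
h = np.maximum(0, z1)                           # h = relu(z1), so h >= 0
l1 = np.array([[0.3], [-0.7], [0.2], [-0.1]])   # stand-in for dL/dh

# Attempt from the question: h < 0 is never true, so nothing is zeroed.
masked_h_lt0 = l1.copy()
masked_h_lt0[h < 0] = 0

# Chain-rule form (assumption stated above): zero entries where z1 <= 0.
chain_rule = l1 * (z1 > 0)

# Form described in the reply: ReLU applied to l1 itself.
relu_of_l1 = np.maximum(0, l1)

print(np.array_equal(masked_h_lt0, l1))         # mask was empty
print(np.array_equal(chain_rule, relu_of_l1))   # the two forms disagree here
```

With these numbers the chain-rule form keeps the negative entry -0.7 (because z1 is positive there) and drops 0.3 (because z1 is negative there), while relu(l1) does the opposite, which would explain why a unit test built around relu(l1) rejects an h-based mask.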
