Hi everyone
I have a question about the c2w4 assignment, specifically Exercise 4. The image above the code for Exercise 4 shows that we should obtain the step function of z1, but I think the code does not compute step(z1) correctly, because l1 is not equal to z1. I have tried computing step(z1) using z1 = w1.x + b1, as in the image, and I have also tried the step function as it is implemented in the code. Either way, I get an error saying the values for grad_b1 are incorrect. I have been trying to solve this for a few days now. Could someone please help me with this problem?
Hi @vanooshe
Here is my explanation of the diagram:
Note that:
- step(z1) means 0 values where z1<0, and 1’s everywhere else;
- the final l1 value is the initial l1 multiplied element-wise (not a dot product) by step(z1); the diagram could be clearer on that. This is implemented for you: `l1[z1 < 0] = 0` is equivalent to element-wise multiplication by 1s and 0s. The code comment `# use "l1" to compute gradients below` suggests using this array in the further calculations, and it is the biggest part of them;
- also note that h here is not equal to l1 or z1, since h here is equal to relu(z1);
- also note that you don’t have to take the dot product with $1^{T}_{m}$; you can use `np.sum(?, axis=?)` instead, since these are equivalent.
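As a toy illustration of the two equivalences in the list above, here is a small NumPy sketch. It is not the assignment code: the array names, the shapes, the random values, and the choice of axis are all made up for this example, and the right axis in your own code depends on how your arrays are laid out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy shapes: N rows (hidden units), m columns (examples).
N, m = 4, 3
z1 = rng.standard_normal((N, m))   # stand-in for the pre-activation z1
l1 = rng.standard_normal((N, m))   # stand-in for l1 before masking

# step(z1): 0 where z1 < 0, 1 everywhere else.
step_z1 = (z1 >= 0).astype(l1.dtype)

# Route 1: element-wise multiplication by the 0/1 array.
l1_multiplied = l1 * step_z1

# Route 2: in-place masking, as in the line quoted above.
l1_masked = l1.copy()
l1_masked[z1 < 0] = 0

same_mask = np.allclose(l1_multiplied, l1_masked)

# Multiplying by a column of m ones adds up each row,
# which is what np.sum does along axis 1 in this layout.
ones_m = np.ones((m, 1))
via_dot = l1_masked @ ones_m                        # shape (N, 1)
via_sum = np.sum(l1_masked, axis=1, keepdims=True)  # shape (N, 1)

same_sum = np.allclose(via_dot, via_sum)

print(same_mask, same_sum)
```

Both checks should agree. Note that the mask comes from the sign of z1, not from the sign of the array being masked, so negative entries survive wherever z1 is non-negative.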
And since you commented on this thread I assume you saw the calculations that you can check your solution against.
Cheers
Thank you @arvyzukai. I was really confused, but your clear explanation solved my problem. I really appreciate your help.
Hey @vanooshe! I am getting the same issue here. Did you ever manage to solve it? I think there is one unit test where the negative values of grad_b1 and grad_b2 are becoming [0.] in the output.