# C2_W4 - UNQC4 Step Function clarification

Hi,

I’m struggling with the definition of the step function in CNQ4. I have been over the other posts in the forum, and while they’ve provided some insight (such as the step function already being defined in the given code), I’m confused about one particular point.

```python
# if z1 < 0, then l1 = 0
# otherwise l1 = l1
# (this is already implemented for you)

l1[z1 < 0] = 0  # use "l1" to compute gradients below
```

I am assuming the last line, `l1[z1 < 0] = 0`, implements the step function. From my understanding, this line modifies the values in l1, replacing them with 0 wherever z1 < 0. z1 itself remains unmodified.

In that case, is step(z1) referring to the updated l1, or to z1 (which hasn’t changed)? When the comment in the code suggests using ‘l1’ to compute the gradients below, I’m not sure whether it means in lieu of the original definition of l1 (W2^T(Yhat-Y)), or as a substitute for step(z1).

I could be misunderstanding the step function and which values are changing, as I’m a novice Python user, so I would appreciate any clarification. I have tried calculating grad_W1 with every combination of z1, l1, W2.T(Yhat-Y), etc., but I’m not able to get it to work.

Hi @fae.13, great question!

The step function is a type of activation function used in neural networks. It’s a simple function that outputs either 0 or 1 based on the input. In this context, `z1` is typically the input to the step function, and `l1` is the output after applying the step function.

The line `l1[z1 < 0] = 0` is implementing the step function. It goes through each element of `z1` and checks whether it is less than 0; if so, the corresponding element of `l1` is set to 0. This effectively applies the step function elementwise, with `l1` storing the result.
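To see the masking behavior concretely, here is a small standalone sketch with made-up values (the numbers are purely illustrative, not from the assignment):

```python
import numpy as np

# Hypothetical values, just to illustrate boolean-mask assignment.
z1 = np.array([[-2.0, 0.5],
               [ 1.5, -0.3]])
l1 = np.array([[10.0, 20.0],
               [30.0, 40.0]])

# Zero out l1 wherever the corresponding entry of z1 is negative.
l1[z1 < 0] = 0

# l1 is now [[0., 20.], [30., 0.]]
# z1 itself is untouched: its values were only read, never written.
```

Note that `z1 < 0` builds a boolean array of the same shape, and the assignment writes only into `l1` at the `True` positions.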

Regarding your question about `step(z1)` referring to `l1` or `z1`, in this context, `step(z1)` is the operation you’re performing and the result of this operation is stored in `l1`. So, `l1` is the output of `step(z1)`. The original `z1` remains unchanged; you’re just using its values to determine what `l1` should be.

When the code comments mention using `l1` to compute gradients, it means that you should use the updated `l1` values (after applying the step function) in your gradient calculations. This is because the gradients depend on the outputs of the activation function (in this case, the step function).

So, in summary:

• `l1[z1 < 0] = 0` is applying the step function to `z1`, storing the result in `l1`.
• `step(z1)` refers to the operation being performed, with `l1` being the result.
• For gradient computations, use the updated `l1` which now contains the outputs of the step function applied to `z1`.
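Putting the pieces together, here is a minimal backward-pass sketch of the idea. This is not the assignment's code: the shapes, the variable names `x` and `m`, and the simplified forward pass (no biases, plain ReLU and softmax) are all assumptions made for illustration; only the `l1` lines mirror the thread above.

```python
import numpy as np

np.random.seed(0)
m = 4                       # batch size (assumed)
x = np.random.rand(3, m)    # input batch (assumed shape)
W1 = np.random.rand(5, 3)   # layer-1 weights (assumed shape)
W2 = np.random.rand(2, 5)   # layer-2 weights (assumed shape)
y = np.eye(2)[:, np.random.randint(0, 2, m)]  # one-hot targets (assumed)

# Simplified forward pass.
z1 = np.dot(W1, x)
h = np.maximum(0, z1)       # ReLU activation
z2 = np.dot(W2, h)
yhat = np.exp(z2) / np.exp(z2).sum(axis=0, keepdims=True)  # softmax

# Backward pass for W1.
l1 = np.dot(W2.T, yhat - y)   # the original definition: W2^T (Yhat - Y)
l1[z1 < 0] = 0                # after this line, l1 IS W2^T(Yhat-Y) * step(z1)
grad_W1 = np.dot(l1, x.T) / m # use the masked l1 directly; no extra step(z1) factor
```

The key point is the last two lines: once the mask has been applied, `l1` already contains the step-function factor, so the gradient formula uses `l1` alone.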

I hope this helps!

This was VERY helpful! I was interpreting `l1` and `step(z1)` as TWO different factors… I now see they are to be treated as just one (i.e., the updated `l1`). I removed the extra `z1` term and voilà, all tests pass. Thank you!


For me too, many thanks!