Unable to understand the equations for calculating grad_b1 and grad_b2 in the back_prop function

Can anyone explain the equations for grad_b1 and grad_b2 from the lecture? In particular, what is meant by step(Z1) and the matrix 1_m?


Hi @Amit_Gairola1,

grad_b1 and grad_b2 are the gradients of the cost w.r.t. the biases, i.e. \frac{\partial J_{batch}}{\partial b_1} and \frac{\partial J_{batch}}{\partial b_2} respectively. The step function is needed for backward propagation through the ReLU non-linearity: it outputs one for every positive element of Z1 and zero for all other elements. 1_m is a row vector of m elements, all equal to 1. As you can see on the slide, the product A \cdot 1^\top_m is equivalent to summing the elements of each row of the matrix A.
