Emojifier-V1 has the following gradient descent code in the function model():

# Compute gradients

dz = a - Y_oh[i]

dW += np.dot(dz.reshape(n_y,1), avg.reshape(1, n_h))

db += dz

How are the gradients derived?

Thanks in advance,


The derivation of backpropagation requires calculus, which is beyond the scope of these courses. There is an optional section in the first assignment of C5 W1 (Building a RNN Step by Step) that shows the formulas for the general case of backpropagation through an RNN or LSTM network. Please have a look at the general case there and perhaps that will “map” to what we are doing here.
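
For reference, here is a sketch of the standard softmax + cross-entropy derivation that matches those three lines (using the assignment's names: $z = W \cdot \text{avg} + b$, $a = \mathrm{softmax}(z)$, and $y = $ `Y_oh[i]`, the one-hot label):

```latex
% Forward pass for one example:
%   z = W * avg + b,   a = softmax(z)
% Cross-entropy loss with one-hot label y:
%   L = -\sum_k y_k \log a_k
%
% A well-known result is that the softmax and cross-entropy
% derivatives combine to a simple form:
\frac{\partial L}{\partial z} = a - y
% which is exactly   dz = a - Y_oh[i].
%
% Then by the chain rule, since z = W * avg + b:
\frac{\partial L}{\partial W} = (a - y)\,\text{avg}^{\top}
% i.e.  dW += np.dot(dz.reshape(n_y, 1), avg.reshape(1, n_h))
\frac{\partial L}{\partial b} = a - y
% i.e.  db += dz
```

So the nice cancellation in $\partial L / \partial z = a - y$ is why the code never computes the softmax Jacobian explicitly; the `+=` accumulates these per-example gradients over the training set.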