I am on the last lecture of Week 3, about enforcing the gradient penalty. Please give some intuition on how we would do backprop with the gradient penalty term.
The gradient penalty is computed wrt the input, which is a randomly interpolated image. I have no idea how to compute its derivative.
Hey, I am not sure exactly what would answer your query, but I will try to share what I know.
Backprop with the gradient penalty term is no different from backprop in a simple logistic regression model with L2/L1 regularization. In both cases we add a regularization term to the loss function and take the derivative wrt the weights we want to update. So, if we want to update the weights of the generator, we take the derivative of its loss wrt the weights of g, and if we want to update the weights of the critic, we take the derivative of its loss wrt the weights of c. Note that the gradient penalty appears only in the critic's loss, so it only affects the update of c.
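To make the analogy concrete, here is a rough sketch of the critic's loss with the penalty added as a regularizer. The symbols \lambda (penalty weight), \epsilon (random mixing coefficient), \alpha (learning rate) and \theta_c (critic weights) are my own notation, and the sign convention is the usual WGAN-GP one, which I am assuming matches the lecture:

$$
\hat{x} = \epsilon\, x + (1 - \epsilon)\, g(z), \qquad
L_c = \mathbb{E}\big[c(g(z))\big] - \mathbb{E}\big[c(x)\big] + \lambda\, \mathbb{E}\Big[\big(\lVert \nabla_{\hat{x}}\, c(\hat{x}) \rVert_2 - 1\big)^2\Big]
$$

$$
\theta_c \leftarrow \theta_c - \alpha\, \nabla_{\theta_c} L_c
$$

The last term plays the same role as an L2 term in logistic regression: it is added to the loss, and the whole sum is differentiated wrt \theta_c.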
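As to the second question, I hope you have understood how to calculate x_hat, i.e., by interpolating between a real and a generated image. The penalty is then handled in two steps. First, we differentiate the critic's output c(x_hat) wrt x_hat itself; the norm of that gradient goes into the penalty term. The other terms of the loss play no part in this step, since they are not functions of x_hat. Second, the penalty is now just another function of the critic's weights, so during backprop it is differentiated wrt those weights. This means differentiating through the first gradient computation (a second-order derivative), which the autograd framework handles for us.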
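In case it helps, here is a minimal PyTorch sketch of those steps. It is not the course's code: the tiny `critic`, the random `real` and `fake` tensors, and the penalty weight `c_lambda = 10.0` are all placeholders I made up for illustration. The key part is `create_graph=True`, which keeps the gradient wrt x_hat in the graph so that `loss.backward()` can differentiate the penalty wrt the critic's weights.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical tiny critic on 4-dimensional "images" (a stand-in, not the course model).
critic = nn.Sequential(nn.Linear(4, 8), nn.LeakyReLU(0.2), nn.Linear(8, 1))

real = torch.randn(16, 4)  # placeholder batch of real samples
fake = torch.randn(16, 4)  # placeholder for detached generator output


def gradient_penalty(critic, real, fake):
    # One random mixing coefficient per sample.
    eps = torch.rand(real.size(0), 1)
    # x_hat must track gradients so we can differentiate c(x_hat) wrt x_hat.
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(x_hat)
    # Gradient of the critic's scores wrt x_hat; create_graph=True keeps this
    # computation differentiable so the penalty can be backpropagated later.
    (grads,) = torch.autograd.grad(
        outputs=scores,
        inputs=x_hat,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )
    grad_norm = grads.reshape(grads.size(0), -1).norm(2, dim=1)
    # Penalize the distance of each per-sample gradient norm from 1.
    return ((grad_norm - 1) ** 2).mean()


c_lambda = 10.0  # assumed penalty weight
gp = gradient_penalty(critic, real, fake)
loss = critic(fake).mean() - critic(real).mean() + c_lambda * gp

critic.zero_grad()
loss.backward()  # fills .grad on the critic's weights, penalty included
```

After `loss.backward()`, an optimizer step on the critic's parameters would apply the update, exactly as with any other regularized loss.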
Hope this helps