Why are we adding the regularization term to the W-Loss function?
I understand that the critic wants to maximize W-Loss and that we enforce 1-Lipschitz (1-L) continuity by adding a regularization term, which gets bigger if the norm of the critic's gradient is bigger than 1.
But if the norm of the critic's gradient is bigger than 1, doesn't that term actually increase the loss function, thus benefiting the critic? If we want to penalize it, shouldn't we subtract it instead?
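For concreteness, this is the critic objective I have in mind, written as a minimization the way the WGAN-GP paper (Gulrajani et al.) states it, with λ the penalty weight and x̂ the interpolated samples (my notation may differ from the lecture):

$$
L = \mathbb{E}_{\tilde{x}}\big[C(\tilde{x})\big] - \mathbb{E}_{x}\big[C(x)\big] + \lambda\,\mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} C(\hat{x}) \rVert_2 - 1\big)^2\Big]
$$

where $\tilde{x} = G(z)$ are generated samples. My question is basically whether the "+" in front of the penalty only makes sense in this minimization form, or whether it also belongs in the maximization view I described above.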