One of the goals of this assignment is to apply gradient clipping in a model built almost from scratch.
How would one apply gradient clipping in practice, e.g. in a plain vanilla fully connected neural network?
tf.clip_by_value | TensorFlow Core v2.7.0
tf.clip_by_norm | TensorFlow Core v2.7.0
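
For reference, here is a rough pure-Python sketch of what those two ops compute, as far as I understand the docs. The helper functions below are my own illustrations that only borrow the TensorFlow names. They work on flat lists of floats and are not the library implementations.

```python
import math

def clip_by_value(values, clip_min, clip_max):
    """Clamp every element independently into [clip_min, clip_max]."""
    return [min(max(v, clip_min), clip_max) for v in values]

def clip_by_norm(values, clip_norm):
    """Rescale the whole vector so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= clip_norm:
        # Already small enough: leave the gradient unchanged.
        return list(values)
    # Same direction, shorter length.
    return [v * clip_norm / norm for v in values]

grad = [3.0, -4.0]  # L2 norm is 5
by_value = clip_by_value(grad, -1.0, 1.0)  # each element clamped to [-1, 1]
by_norm = clip_by_norm(grad, 1.0)          # direction kept, norm scaled to 1
```

The difference that matters in practice: clipping by value can change the direction of the gradient, while clipping by norm only shortens it.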
Do you call this clipping on the optimizer before you compile the model?
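
A minimal sketch of the two places where clipping could go, assuming TensorFlow 2.x with Keras. The layer sizes, learning rate, clip threshold, and the `train_step` helper are made up for illustration. Option 1 passes `clipnorm` (or `clipvalue`) to the optimizer before `compile`, so `fit` clips for you. Option 2 clips by hand in a custom loop between computing and applying the gradients.

```python
import tensorflow as tf

# A small fully connected network (sizes are arbitrary, for illustration only).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Option 1: let the optimizer clip. Keras optimizers accept `clipnorm`
# (L2 norm of each gradient tensor) or `clipvalue` (element-wise) when
# they are constructed, i.e. before model.compile().
clipping_optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, clipnorm=1.0)
model.compile(optimizer=clipping_optimizer, loss="mse")
# model.fit(x, y, ...) would now apply clipped gradients on every step.

# Option 2: clip by hand in a custom training step, using an optimizer
# without clipping arguments so the gradients are not clipped twice.
plain_optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(model, optimizer, x, y, clip_norm=1.0):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = tf.reduce_mean(tf.square(predictions - y))
    grads = tape.gradient(loss, model.trainable_variables)
    # Clip each gradient tensor to an L2 norm of at most clip_norm.
    clipped = [tf.clip_by_norm(g, clip_norm) for g in grads]
    optimizer.apply_gradients(zip(clipped, model.trainable_variables))
    return loss, clipped
```

In a real assignment you would pick one of the two options rather than both. `tf.clip_by_global_norm` is another choice for Option 2 if the combined norm across all gradients should be limited instead of each tensor separately.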