Professor Ng shows us two ways to apply gradients to the variables:
- tape.gradient together with optimizer.apply_gradients (sketched below)
- optimizer.minimize()
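
Roughly, the first pattern looks like this (a toy sketch with a made-up loss and variable, not the notebook code):

```python
import tensorflow as tf

# Toy setup: one trainable variable and an SGD optimizer.
w = tf.Variable(3.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Method 1: record the forward pass on a GradientTape,
# then compute the gradients and apply them explicitly.
with tf.GradientTape() as tape:
    loss = w * w
grads = tape.gradient(loss, [w])
optimizer.apply_gradients(zip(grads, [w]))
```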
In the programming exercise of Week 3, the default code uses the first method, and I tried to convert it to the second. However, it always throws an error saying that no gradients were provided.
default code:
my code:
this is the error:
Can anyone show me how to use optimizer.minimize correctly?
Thanks
This is an example:

```python
import tensorflow as tf

# Create an optimizer with the desired parameters.
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
var1, var2 = tf.Variable(1.0), tf.Variable(2.0)
# `loss` is a callable that takes no argument and returns the value
# to minimize.
loss = lambda: 3 * var1 * var1 + 2 * var2 * var2
# Call minimize to update the list of variables.
opt.minimize(loss, var_list=[var1, var2])
```
This is taken from the TensorFlow docs, and that's your source page whenever you're dealing with TensorFlow:
https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Optimizer
Thanks for your reply. I understand that optimizer.minimize can only take a callable, which is why I tried to wrap compute_total_loss into a function “cost_fn”.
In your example, the loss can be expressed directly in terms of each trainable variable. In our programming exercise, however, the loss computation involves forward_propagation and compute_total_loss. Since opt.minimize cannot take a loss function with arguments, I cannot directly put compute_total_loss(Z3, tf.transpose(minibatch_Y)) into opt.minimize.
Can you show me how to express the loss for opt.minimize in this case?
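
Would the right approach be to move the whole forward pass inside the zero-argument callable, so that minimize can trace it and reach the trainable variables? A sketch of what I mean, assuming the notebook's forward_propagation(X, parameters) signature and that minibatch_X, minibatch_Y, parameters, and trainable_variables are already in scope:

```python
import tensorflow as tf

# Sketch only: everything the loss depends on is recomputed inside
# the callable, so minimize() can trace a gradient path to the variables.
# Assumes the notebook's forward_propagation and compute_total_loss, plus
# minibatch_X, minibatch_Y, parameters, and
# trainable_variables (e.g. [W1, b1, W2, b2, W3, b3]) are in scope.
def cost_fn():
    Z3 = forward_propagation(tf.transpose(minibatch_X), parameters)
    return compute_total_loss(Z3, tf.transpose(minibatch_Y))

optimizer.minimize(cost_fn, var_list=trainable_variables)
```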