I am on the last lab of week three, 3.3 - Train the Model. It does not seem to ask us to write anything, just run the lab, but when I ran it, it kept failing on the apply-optimizer code, saying "No gradients provided for any variable: ['Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0']." We are not changing the code, and there is a previous step, `grads = tape.gradient(minibatch_total_loss, trainable_variables)`, which is supposed to provide the gradients. I also attached the original code, which didn't ask us to add anything to explicitly create gradients. What can I do?
Hello, @biz2024,
From your post, I understand that you passed all the previous tests and didn't expect any error from this provided code. However, the provided code calls functions that you wrote, and if any of them doesn't work perfectly, the provided code will crash. There are two possibilities here:
- Your functions are all correct, but some provided code was somehow altered unintentionally. In this case, you can get a fresh copy and retry. This post shows you how.
- Some of your functions were not perfect, and unfortunately the tests were not able to detect them. In this case, we need to understand the origin of the error and try to sort it out ourselves.
Origin of error
This error happens when there are no valid gradient values for any of the variables, which you can verify by adding a `print` like the one below. With the error, it should show a list of six `None` values; in the normal case, they should not be `None`.
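For reference, here is a minimal sketch of that check, placed right after the provided gradient step (the variable names are the ones quoted from the notebook; treat this as an illustration, not the exact lab code):

```python
grads = tape.gradient(minibatch_total_loss, trainable_variables)
print(grads)  # with the bug: [None, None, None, None, None, None]
```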
This problem happens when the trainable variables were not used to compute the loss. In other words, to avoid this error, you need to make sure all of the variables are involved in the calculation of the loss.
By the design of the notebook, as your screenshot showed, the variables should go through the path (1 → 5) and end up as the loss.
Note that this path uses your functions. Your work would be to follow this path and check each of your functions to make sure the trainable variables are used correctly, and that they and their downstream processed outputs are always processed by TensorFlow functions. (TensorFlow functions start with `tf.xxxxx`; if you see any `np.xxxxx`, there is a problem. `tf` stands for TensorFlow, whereas `np` stands for NumPy. We need to use only TensorFlow functions here.)
If you used any `np` function in the middle of the path, even if you started out using the variables correctly, the `np` function will convert them (or their downstream processed outputs) to something else, which means they will get kicked out of the remaining steps of the path. We don't want them to get kicked out, so no `np` function should ever appear in the path.
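To make this concrete, here is a small, self-contained toy example (my own, not code from the lab) showing how a single NumPy call in the path turns the gradient into `None`:

```python
import numpy as np
import tensorflow as tf

W = tf.Variable(2.0)

# Broken path: a NumPy op pulls the value out of the graph,
# so the tape can no longer trace the loss back to W.
with tf.GradientTape() as tape:
    loss = tf.constant(np.square(W.numpy()))
print(tape.gradient(loss, W))  # None

# Correct path: TensorFlow ops only, so W stays on the tape.
with tf.GradientTape() as tape:
    loss = tf.square(W)
print(tape.gradient(loss, W))  # tf.Tensor(4.0, shape=(), dtype=float32)
```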
Good luck!
Cheers,
Raymond
Yes, Raymond's very complete explanation must be what is going on: in the functions you wrote that are being called during training, you must have used a NumPy operation, and those do not support the automatic generation of gradients the way the TF calls do. That breaks the compute graph, you get no gradients, and the training can't happen.
I've seen this error before, and here's one surprising and kind of "sneaky" way you can get it: notice that in the `compute_total_loss` function, you need to transpose the `labels` and `logits` before calling the TF loss function. If you do that using `np.transpose` or just `labels.T`, then it can have this effect. Try using `tf.transpose` instead, if that is what your code looks like.
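For illustration, here is a minimal sketch of that pattern (treat the exact reduction and argument order as assumptions about your notebook's version, not the official solution):

```python
import tensorflow as tf

def compute_total_loss(logits, labels):
    # Transpose with tf.transpose so the op stays on the gradient tape;
    # np.transpose or labels.T would break the path to the variables.
    total_loss = tf.reduce_sum(
        tf.keras.losses.categorical_crossentropy(
            tf.transpose(labels), tf.transpose(logits), from_logits=True
        )
    )
    return total_loss
```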
You guys can find this kind of stuff? You are wonderful and I am impressed.
Paul is our Super Mentor !
And Raymond is our Super Explainer!