Course 2 Week 3 Assignment: train the model

The model function near the end of the assignment, under section 3.3 "Train the model":

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001, num_epochs = 1500, minibatch_size = 32, print_cost = True):

throws this error when I run it:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-170-438f3af55d6e> in <module>
----> 1 parameters, costs, train_acc, test_acc = model(new_train, new_y_train, new_test, new_y_test, num_epochs=100)

<ipython-input-169-c0cc2367aa76> in model(X_train, Y_train, X_test, Y_test, learning_rate, num_epochs, minibatch_size, print_cost)
     72             trainable_variables = [W1, b1, W2, b2, W3, b3]
     73             grads = tape.gradient(minibatch_cost, trainable_variables)
---> 74             optimizer.apply_gradients(zip(grads, trainable_variables))
     75             epoch_cost += minibatch_cost
     76 

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in apply_gradients(self, grads_and_vars, name, experimental_aggregate_gradients)
    511       ValueError: If none of the variables have gradients.
    512     """
--> 513     grads_and_vars = _filter_grads(grads_and_vars)
    514     var_list = [v for (_, v) in grads_and_vars]
    515 

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in _filter_grads(grads_and_vars)
   1269   if not filtered:
   1270     raise ValueError("No gradients provided for any variable: %s." %
-> 1271                      ([v.name for _, v in grads_and_vars],))
   1272   if vars_with_empty_grads:
   1273     logging.warning(

ValueError: No gradients provided for any variable: ['Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0', 'Variable:0'].

You shouldn't need to change any of that model code, but it looks like you have accidentally broken something in the tf.GradientTape logic. Are you sure you didn't modify the guts of that function? You can get a clean copy and compare; there's a topic about that on the FAQ Thread.
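For reference, the tape logic in the provided code follows the standard TF2 custom-training pattern, roughly like this (a minimal self-contained sketch with made-up shapes and a squared-error cost, not the notebook's exact code):

```python
import tensorflow as tf

# Illustrative variables and data, standing in for the assignment's parameters.
W = tf.Variable(tf.random.normal((3, 2)))
b = tf.Variable(tf.zeros((3, 1)))
X = tf.random.normal((2, 5))
Y = tf.random.normal((3, 5))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)

with tf.GradientTape() as tape:
    # The forward pass and the cost must be built entirely from TF ops,
    # so the tape can record every step.
    Z = tf.matmul(W, X) + b
    cost = tf.reduce_mean(tf.square(Z - Y))

trainable_variables = [W, b]
grads = tape.gradient(cost, trainable_variables)   # one gradient per variable
optimizer.apply_gradients(zip(grads, trainable_variables))
```

If any step inside the tape falls out of the TF graph, tape.gradient returns None for every variable and apply_gradients raises exactly the "No gradients provided for any variable" error above.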

@paulinpaloalto I have exactly the same issue. I have already tried your suggestion: re-created the PA file by renaming it, pressing Help → Update Lab, and copying my code snippets over from the old file. It doesn't help. All other tests in the Jupyter notebook pass. I can still pass the assignment, since the grader gives me 80/100, so it isn't critical, but still.

The error in my notebook is exactly the same as the topic starter's. The grader output is:

[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
[ValidateApp | INFO] Executing notebook with kernel: python3
2021-10-17 05:17:40.797845: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-10-17 05:17:40.797885: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-10-17 05:17:41.953188: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-10-17 05:17:41.953222: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-10-17 05:17:41.953246: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-2-80-243.ec2.internal): /proc/driver/nvidia/version does not exist
2021-10-17 05:17:41.953475: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-17 05:17:41.980348: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2999995000 Hz
2021-10-17 05:17:41.982137: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ff517948d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-17 05:17:41.982169: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Tests failed on 1 cell(s)! These tests could be hidden. Please check your submission.

Paul helped me debug the issue. It was caused by my compute_cost function: I used .numpy().T to transpose the datasets where tf.transpose() should have been used instead. Now everything works.
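In other words, the transposition inside compute_cost has to stay in the TF graph. A minimal sketch of the difference (the shapes and the categorical cross-entropy setup here are illustrative, not the assignment's exact code):

```python
import tensorflow as tf

logits = tf.Variable(tf.random.normal((6, 4)))   # (classes, examples), illustrative
labels = tf.one_hot([0, 2, 1, 3], depth=6)       # (examples, classes)

with tf.GradientTape() as tape:
    # Wrong: logits.numpy().T leaves the TF graph, so the tape records
    # nothing downstream and the gradient comes back as None.
    # cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(
    #     labels, logits.numpy().T, from_logits=True))

    # Right: tf.transpose is a TF op, so gradients flow back to `logits`.
    cost = tf.reduce_mean(tf.keras.losses.categorical_crossentropy(
        labels, tf.transpose(logits), from_logits=True))

print(tape.gradient(cost, logits))  # a real gradient tensor, not None
```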


In my case it was np.exp; use tf.exp instead…

Glad to hear that you found the solution! For anyone else who sees this, it's worth making the point Nikita described a bit more explicit:

When you are using tf.GradientTape, it is critical that every element of the computation graph be a TF operation. You cannot include any numpy functions, because gradients are not automatically computed for them, and that breaks the computation graph: by the Chain Rule, you need the derivative at every step of the computation.
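Here is a minimal, self-contained illustration of that failure mode (the exp and the constants are purely for demonstration):

```python
import numpy as np
import tensorflow as tf

w = tf.Variable(2.0)

# All-TF computation: the tape records every op, so gradients flow.
with tf.GradientTape() as tape:
    loss = tf.exp(w) * 3.0
print(tape.gradient(loss, w))    # tf.Tensor(3 * e^2): a real gradient

# A numpy op in the middle detaches the result from the tape.
with tf.GradientTape() as tape:
    z = np.exp(w.numpy())        # plain numpy scalar: invisible to the tape
    loss = tf.constant(z) * 3.0  # a tf.Tensor, but disconnected from w
print(tape.gradient(loss, w))    # None -> "No gradients provided for any variable"
```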