TensorFlow programming assignment - grader reports an error

I am getting the following error on the last programming assignment (TensorFlow Introduction). When I run the notebook, it seems to pass all the cells, including the graded ones, so I am unsure why the grader fails - any suggestions on how I can debug this?

[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
[ValidateApp | INFO] Executing notebook with kernel: python3
2023-01-26 18:04:49.221041: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2023-01-26 18:04:49.221078: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-01-26 18:04:50.431344: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-01-26 18:04:50.431380: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2023-01-26 18:04:50.431404: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-2-34-193.ec2.internal): /proc/driver/nvidia/version does not exist
2023-01-26 18:04:50.431642: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-26 18:04:50.439876: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2999995000 Hz
2023-01-26 18:04:50.441799: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55efcbb79650 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-01-26 18:04:50.441829: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
[ValidateApp | ERROR] Timeout waiting for execute reply (30s).
[ValidateApp | ERROR] Interrupting kernel
Tests failed on 1 cell(s)! These tests could be hidden. Please check your submission.

Believe it or not, all those error messages are normal. The only one that really matters is the last one:

Tests failed on 1 cell(s)! These tests could be hidden. Please check your submission.

That must mean that one of the functions you wrote is not general enough: it happens to pass the tests in the notebook, but it fails the grader, which uses a different test case. The unfortunate thing is that for some reason the grader cannot tell us which function it is. The kind of error to look for is a hard-coded assumption about the size of an input object, or a reference to a global variable from inside the body of a function instead of to the formal parameter. If you directly reference the global variable that happens to get passed in, the code can still work in the notebook but then fail in the grader.
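
For example (just a sketch with made-up names, not code from the assignment), referencing a notebook-level global instead of the formal parameter looks like this:

```python
import tensorflow as tf

# Notebook-level global that the local test cell happens to define.
X_train = tf.constant([[1.0, 2.0], [3.0, 4.0]])

# Buggy: ignores the formal parameter X and reads the global X_train,
# so it passes in the notebook but breaks on the grader's hidden test.
def normalize_buggy(X):
    return X_train / tf.reduce_max(X_train)

# Fixed: references only the formal parameter, so it works for
# whatever tensor the grader passes in.
def normalize_fixed(X):
    return X / tf.reduce_max(X)
```

In the notebook both versions produce the same output, which is exactly why the local tests pass and only the grader catches it.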

Thank you for the insight. I think I know where the 'error' could be. I have changed that code, but now I am unable to get the submission to work (!).

I am now getting:

" * ### Platform Error: We’re experiencing some technical difficulties

Please try submitting again in 10 minutes. If the problem persists for 24 hours, please reach out to Coursera through our Help Center."

I have been getting this for the last hour. I will try again later tonight.

Me too! I still have this problem after obtaining the latest version.

Yes, it sounds like they are having problems on the server side. The best I can suggest is what you proposed: wait a few hours and try again. Let us know what happens. We have seen server problems occasionally before and sometimes it takes as long as 10 or 12 hours to fix. But of course it all depends on the situation …

Everything works now. The issue was as @paulinpaloalto identified: my function wasn't general enough - I had one value (of a shape) hard-coded instead of using the .shape output. Once I addressed this, the code works and I get 100% (vs. 80%).
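
In case it helps someone else, the pattern was roughly this (a hypothetical sketch, not the actual graded cell):

```python
import tensorflow as tf

# Hypothetical 64x64x3 input, like the example image in the notebook.
image = tf.random.uniform([64, 64, 3])

# Buggy: 12288 is only correct for 64x64x3 inputs, so a hidden test
# that passes a different input size fails.
flattened_buggy = tf.reshape(image, [12288, 1])

# Fixed: derive the length from the tensor's own shape.
n = image.shape[0] * image.shape[1] * image.shape[2]
flattened_fixed = tf.reshape(image, [n, 1])
```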

System now seems to be back up and accepting submissions.