When I try to submit the Tensorflow Introduction assignment in DLS Course 2, I get dead kernel errors that are causing my assignment to fail despite the answers being correct. Could someone please help? This is the only issue blocking me from completing the deep learning specialization.
Try submitting your notebook for grading without running it in the lab.
The grader doesn't have this "dead kernel" issue.
Coursera fixed an earlier instance of the "dead kernel" bug a few weeks ago, but it appears it may have returned. Yours is the second post about this issue today.
Hello @Tricia_Lobo!
Try running it at a different time. Another user tried at a different time and succeeded. Or perhaps try rebooting: click "Help" at the top right and, when the panel opens, click "Reboot".
Meanwhile, as Tom mentioned, you can submit your assignment, since the grader doesn't need the output of your code.
Best,
Saif.
Got the "dead kernel" bug on the DLS Course 2, Week 3 assignment while running the Exercise 6 cell. Waited 12 hours, rebooted, restarted the kernel with cleared output, then ran the notebook: still dead. Submitted the assignment and received 0/100 even though Exercises 1 through 5 passed all tests.
Did you add any print() statements to your notebook for debugging? If so, and they output a lot of data, that can cause problems for the kernel.
Also, did you add any new cells?
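If it helps to check this without eyeballing every cell, here is a minimal sketch that reads the notebook as JSON and reports, per code cell, how many print( calls it contains and how much saved output it carries. It assumes the standard nbformat v4 layout (a top-level "cells" list with "source" and "outputs"). The names summarize_notebook and load_notebook are made up for illustration, and counting the text "print(" is only a rough heuristic.

```python
import json


def _as_text(value):
    # nbformat stores text either as a single string or as a list of strings.
    return "".join(value) if isinstance(value, list) else (value or "")


def summarize_notebook(nb):
    """Return one dict per code cell: cell index, print( count, saved output size."""
    report = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = _as_text(cell.get("source", ""))
        output_chars = 0
        for out in cell.get("outputs", []):
            # Stream outputs (stdout/stderr) keep their text under "text".
            output_chars += len(_as_text(out.get("text", "")))
            # Rich outputs keep one payload per MIME type under "data".
            for payload in out.get("data", {}).values():
                if isinstance(payload, (str, list)):
                    output_chars += len(_as_text(payload))
        report.append(
            {"cell": index, "prints": source.count("print("), "output_chars": output_chars}
        )
    return report


def load_notebook(path):
    # Hypothetical helper: a .ipynb file is plain JSON on disk.
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

You would call summarize_notebook(load_notebook("Tensorflow_introduction.ipynb")) and look for cells with an unusually large output_chars value or more prints than the original notebook had.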
Hi TMosh. Just double-checked and verified no extra print statements or cells.
Grader Output:
[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
[ValidateApp | INFO] Executing notebook with kernel: python3
2024-08-03 16:10:19.061179: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2024-08-03 16:10:19.061216: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-08-03 16:10:20.722184: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2024-08-03 16:10:20.722220: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2024-08-03 16:10:20.722251: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-2-87-67.ec2.internal): /proc/driver/nvidia/version does not exist
2024-08-03 16:10:20.723195: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-08-03 16:10:20.756101: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2999995000 Hz
2024-08-03 16:10:20.758537: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x562dea269840 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-08-03 16:10:20.758565: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
[ValidateApp | ERROR] Kernel died while waiting for execute reply.
[ValidateApp | ERROR] Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 478, in _poll_for_reply
msg = self.kc.shell_channel.get_msg(timeout=timeout)
File "/opt/conda/lib/python3.7/site-packages/jupyter_client/blocking/channels.py", line 57, in get_msg
raise Empty
_queue.Empty
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/nbgrader/preprocessors/execute.py", line 41, in preprocess
output = super(Execute, self).preprocess(nb, resources)
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 438, in preprocess_cell
reply, outputs = self.run_cell(cell, cell_index, store_history)
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 578, in run_cell
exec_reply = self._poll_for_reply(parent_msg_id, cell, timeout)
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 483, in _poll_for_reply
self._check_alive()
File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 510, in _check_alive
raise DeadKernelError("Kernel died")
nbconvert.preprocessors.execute.DeadKernelError: Kernel died
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/nbgrader/apps/validateapp.py", line 72, in start
validator.validate_and_print(filename)
File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 340, in validate_and_print
results = self.validate(filename)
File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 311, in validate
nb = self._preprocess(nb)
File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 290, in _preprocess
nb, resources = pp.preprocess(nb, resources)
File "/opt/conda/lib/python3.7/site-packages/nbgrader/preprocessors/execute.py", line 44, in preprocess
raise UnresponsiveKernelError()
nbgrader.preprocessors.execute.UnresponsiveKernelError
[ValidateApp | ERROR] nbgrader encountered a fatal error while trying to validate 'submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
The first cell in the notebook has a version number.
What does yours say?
I don't have a completed notebook for this assignment that I can run for comparison, but I'll work on that later today. Weekend tasks need doing at the moment.
Hi @tybenz25
Try to get a completely fresh copy of the assignment, then write code only between the ###START AND END CODE HERE### markers, without removing any instructions or code between them.
Also make sure not to remove or edit any code outside these start and end markers, including function headers and return statements.
Avoid editing or deleting any part of the unit test cells.
Then resubmit. You will pass the test if your code is correct.
Let us know if the issue still persists.
Regards
DP
Hi Deepti - I'm very confident I followed all of your suggestions, but I will try to get a completely fresh copy of the assignment and re-write the code.
Looks like that did the trick. Thanks for the help!
Hi @tybenz25
To get a fresh copy:
Click File, then select Open. You will find all the files related to the assignment. If you have already saved a copy of your notebook under a different name, go ahead and delete the assignment notebook by selecting it.
Once it is deleted, you will see a 404 Not Found page in your browser. Close the browser.
Open the assignment page again; you will still see 404 Not Found.
At this point, click in the top right-hand corner, before Grades, where you will find Reboot. Click Reboot.
Then click in the same place in the top right-hand corner, click Get latest version, and then Update lab.
This time, make sure you write code only between the markers mentioned in my previous comment. I am not saying you did this, but sometimes a stray keystroke while running a cell makes an edit that the autograder does not allow. When such a small error cannot be found, starting from a fresh copy is the last resort.
Regards
DP
I can report that I implemented the notebook ("Tensorflow_introduction"), passed all the tests, and got a grade of 100%.
Other things to verify:
- That you did not rename your notebook file. The grader always uses the original notebook file name.
- That you made no changes except for where "YOUR CODE HERE" is marked.
I also got all those "cuda" warning messages and these messages:
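If you still have a fresh copy of the notebook, one rough way to check the second point is to compare it with your edited copy cell by cell. The sketch below is only an illustration. It assumes both notebooks are already loaded as JSON dicts in nbformat v4 layout, it uses the hypothetical name unexpected_changes, and it treats any cell whose original source contains the marker text as editable. The default marker string is an assumption, so pass whatever text your notebook actually uses.

```python
def _source(cell):
    # A cell's source is either a single string or a list of strings.
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def unexpected_changes(fresh, edited, marker="YOUR CODE"):
    """List differences that fall outside cells containing the marker text."""
    problems = []
    fresh_cells = fresh.get("cells", [])
    edited_cells = edited.get("cells", [])
    # Added or deleted cells change the cell count.
    if len(fresh_cells) != len(edited_cells):
        problems.append(
            f"cell count differs: {len(fresh_cells)} vs {len(edited_cells)}"
        )
    # Compare cells position by position; edits are expected only in marked cells.
    for i, (a, b) in enumerate(zip(fresh_cells, edited_cells)):
        if _source(a) != _source(b) and marker not in _source(a):
            problems.append(f"cell {i} changed outside a marked region")
    return problems
```

An empty list means no unexpected differences were found. The check works at the cell level only, so it cannot tell whether an edit inside a marked cell stayed strictly between the markers.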
[ValidateApp | ERROR] Timeout waiting for execute reply (30s).
[ValidateApp | ERROR] Interrupting kernel
Success! Your notebook passes all the tests.
But that doesn't impact your grade.
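For anyone comparing their own grader output with the logs in this thread, here is a small sketch that sorts a log into "kernel_died", "passed", or "unknown". The marker strings are copied from the messages quoted in this thread. Everything else, including the name classify_grader_log, is hypothetical, and the grader may produce other messages that this heuristic does not know about.

```python
# Strings that appear in the failing log quoted earlier in this thread.
FATAL_MARKERS = ("DeadKernelError", "UnresponsiveKernelError", "Kernel died")
# String that appears in the passing log.
SUCCESS_MARKER = "Your notebook passes all the tests"


def classify_grader_log(log_text):
    """Heuristically label a grader log; a fatal marker takes priority."""
    lines = log_text.splitlines()
    if any(marker in line for line in lines for marker in FATAL_MARKERS):
        return "kernel_died"
    if any(SUCCESS_MARKER in line for line in lines):
        return "passed"
    # CUDA warnings and the 30s timeout message alone decide nothing here.
    return "unknown"
```

The CUDA library warnings and the "Timeout waiting for execute reply" line are deliberately ignored, since they showed up in the passing run as well.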