Week 3 Graded Lab - Kernel issues

Thanks

Could you please share a screenshot of the error?

[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
[ValidateApp | INFO] Executing notebook with kernel: python3
2024-01-25 14:31:43.680990: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2024-01-25 14:31:43.681032: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-01-25 14:31:45.077592: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2024-01-25 14:31:45.077630: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2024-01-25 14:31:45.077662: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-2-45-110.ec2.internal): /proc/driver/nvidia/version does not exist
2024-01-25 14:31:45.077905: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-01-25 14:31:45.105722: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2999995000 Hz
2024-01-25 14:31:45.109486: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x561ffe8b9ed0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-01-25 14:31:45.109551: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2024-01-25 14:31:46.523378: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 30198988800 exceeds 10% of free system memory.
[ValidateApp | ERROR] Kernel died while waiting for execute reply.
[ValidateApp | ERROR] Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 478, in _poll_for_reply
    msg = self.kc.shell_channel.get_msg(timeout=timeout)
  File "/opt/conda/lib/python3.7/site-packages/jupyter_client/blocking/channels.py", line 57, in get_msg
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/preprocessors/execute.py", line 41, in preprocess
    output = super(Execute, self).preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 438, in preprocess_cell
    reply, outputs = self.run_cell(cell, cell_index, store_history)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 578, in run_cell
    exec_reply = self._poll_for_reply(parent_msg_id, cell, timeout)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 483, in _poll_for_reply
    self._check_alive()
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 510, in _check_alive
    raise DeadKernelError("Kernel died")
nbconvert.preprocessors.execute.DeadKernelError: Kernel died

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/apps/validateapp.py", line 72, in start
    validator.validate_and_print(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 340, in validate_and_print
    results = self.validate(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 311, in validate
    nb = self._preprocess(nb)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 290, in _preprocess
    nb, resources = pp.preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/preprocessors/execute.py", line 44, in preprocess
    raise UnresponsiveKernelError()
nbgrader.preprocessors.execute.UnresponsiveKernelError

[ValidateApp | ERROR] nbgrader encountered a fatal error while trying to validate 'submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'

[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'
[ValidateApp | ERROR] Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbformat/reader.py", line 14, in parse_json
    nb_dict = json.loads(s, **kwargs)
  File "/opt/conda/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/opt/conda/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/conda/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/apps/validateapp.py", line 72, in start
    validator.validate_and_print(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 340, in validate_and_print
    results = self.validate(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 298, in validate
    nb = read_nb(basename, as_version=current_nbformat)
  File "/opt/conda/lib/python3.7/site-packages/nbformat/__init__.py", line 141, in read
    return reads(f.read(), as_version, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/nbformat/__init__.py", line 73, in reads
    nb = reader.reads(s, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/nbformat/reader.py", line 58, in reads
    nb_dict = parse_json(s, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/nbformat/reader.py", line 17, in parse_json
    raise NotJSONError(("Notebook does not appear to be JSON: %r" % s)[:77] + "...")
nbformat.reader.NotJSONError: Notebook does not appear to be JSON: ''...

[ValidateApp | ERROR] nbgrader encountered a fatal error while trying to validate 'submitted/courseraLearner/W3A1/Tensorflow_introduction.ipynb'

My guess based on that is that your notebook is fundamentally corrupted somehow. Did you maybe download it, work on it on a different platform like your local computer or Colab, and then upload it back to Coursera? If so, that can cause weird errors like this.

This is the part of the error I’m talking about:

raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

My interpretation of that is that it’s not even a valid JSON text file somehow. You might want to try getting a clean copy and then carefully copy/pasting over just your completed code from the “YOUR CODE HERE” sections and see if that helps.
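One way to test the "not valid JSON" theory directly is to try parsing the notebook file yourself. The sketch below is only an illustration: `check_notebook_json` is a hypothetical helper name, and the check for a top-level `cells` key is a rough sanity test, not a full nbformat validation. Note that the final `NotJSONError` in the log prints the offending text as `''`, which suggests the file was read as empty, so the helper reports that case separately.

```python
import json
from pathlib import Path


def check_notebook_json(path):
    """Return (ok, message) saying whether the file at `path` looks like notebook JSON.

    Hypothetical helper for illustration; it does not replace nbformat's own validation.
    """
    text = Path(path).read_text(encoding="utf-8")

    # An empty file produces exactly the "Expecting value: line 1 column 1 (char 0)" error.
    if not text.strip():
        return False, "file is empty"

    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"not valid JSON: {exc}"

    # Rough sanity check: notebooks in nbformat 4 keep their cells in a top-level "cells" list.
    if not isinstance(data, dict) or "cells" not in data:
        return False, "valid JSON, but no top-level 'cells' key"

    return True, f"parses as JSON with {len(data['cells'])} cells"
```

Calling `check_notebook_json("Tensorflow_introduction.ipynb")` from a terminal or a scratch notebook in the same directory would then report whether the file is empty, malformed, or apparently fine.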

I guess the other theory would be that it’s some kind of weird IT problem, e.g. a proxy server messing with the connection to the grader, but this is DLS C2 W3 and there have been a lot of other assignments before this one. I assume you didn’t have any problems like this on the C1 and earlier C2 assignments. If not, then the only way the “proxy server” theory could work is if you have changed your network setup recently, e.g. you previously submitted successfully from your home network and are now at work or at school on a network that might have stricter security policies.

Thanks Paul.

I did not upload any code from my end to the course lab. I tried getting a clean copy and it didn’t work. But it helped me figure out that in the forward_propagation function I was using a different TensorFlow function, which was crashing the notebook. No error was shown in the notebook, hence I got confused. But it’s solved now. Thanks for your support.
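A silent crash like this is consistent with one line in the first grader log: "Allocation of 30198988800 exceeds 10% of free system memory", printed just before "Kernel died". The thread does not confirm that this allocation is what killed the kernel, so treat the following as a rough sanity check only. It converts the reported byte count into more readable units, assuming 4-byte float32 elements (TensorFlow's usual default, but an assumption here).

```python
# Byte count copied from the log line:
# "Allocation of 30198988800 exceeds 10% of free system memory."
requested_bytes = 30_198_988_800

# Convert bytes to GiB (1 GiB = 2**30 bytes).
gib = requested_bytes / 2**30

# Assumption: the tensor holds float32 values, 4 bytes each.
float32_elements = requested_bytes // 4

print(f"{gib} GiB")                               # 28.125 GiB
print(f"{float32_elements:,} float32 elements")   # 7,549,747,200 float32 elements
```

A single tensor of roughly 28 GiB is far larger than anything this assignment should need, so a request of that size usually points to an operation producing an unexpectedly large shape rather than to a problem with the grader itself.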

Thanks for replying and helping me out. The issue is resolved. I have posted the solution.

That’s great news that you were able to figure out what the problem was just based on my vague theories, which actually sound like they were pretty much unrelated to what the problem actually was. :grinning: