[Week 3] Image Segmentation Grading Issue

I followed the instructions: Kernel → Restart & Clear Output, then File → Save and Checkpoint, and submitted. I have also run the full notebook before submitting, with the same result.

When I run the notebook myself, every cell reports "All tests passed", but I got a grade of 0/100.

The grader displays the following message:

[ValidateApp | INFO] Validating '/home/jovyan/work/submitted/courseraLearner/W3A2/Image_segmentation_Unet_v1.ipynb'
[ValidateApp | INFO] Executing notebook with kernel: python3
2021-05-04 03:59:02.668856: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2021-05-04 03:59:02.668894: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-05-04 03:59:04.256342: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-05-04 03:59:04.256376: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2021-05-04 03:59:04.256396: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-3-0-59.ec2.internal): /proc/driver/nvidia/version does not exist
2021-05-04 03:59:04.256634: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-04 03:59:04.263890: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2999995000 Hz
2021-05-04 03:59:04.265260: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e4c97ca050 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-05-04 03:59:04.265287: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-05-04 03:59:43.250169: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:172] Filling up shuffle buffer (this may take a while): 232 of 500
2021-05-04 03:59:53.284230: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:172] Filling up shuffle buffer (this may take a while): 435 of 500
2021-05-04 03:59:56.580348: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:221] Shuffle buffer filled.
[ValidateApp | ERROR] Timeout waiting for execute reply (30s).
[ValidateApp | ERROR] Interrupting kernel
[ValidateApp | ERROR] Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 589, in run_cell
    msg = self.kc.iopub_channel.get_msg(timeout=timeout)
  File "/opt/conda/lib/python3.7/site-packages/jupyter_client/blocking/channels.py", line 57, in get_msg
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/apps/validateapp.py", line 72, in start
    validator.validate_and_print(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 340, in validate_and_print
    results = self.validate(filename)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 311, in validate
    nb = self._preprocess(nb)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/validator.py", line 290, in _preprocess
    nb, resources = pp.preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbgrader/preprocessors/execute.py", line 41, in preprocess
    output = super(Execute, self).preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 438, in preprocess_cell
    reply, outputs = self.run_cell(cell, cell_index, store_history)
  File "/opt/conda/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 597, in run_cell
    raise TimeoutError("Timeout waiting for IOPub output")
TimeoutError: Timeout waiting for IOPub output

[ValidateApp | ERROR] nbgrader encountered a fatal error while trying to validate 'submitted/courseraLearner/W3A2/Image_segmentation_Unet_v1.ipynb'


I’ve also gotten this exact grader log multiple times. Would love to know how to fix the issue. In addition to

  • Restart and Clear Output → Save and Checkpoint → Submit

I’ve done

  • Restart and Run All → Submit

with identical results. Also, grading takes around 10–15 minutes; I'm not sure if that's intended.

I am also facing the same issue and the same error in the grading. Did the issue get solved for you?

Hey @cmaroblesg, @ravenJB, @vrohithk56, there is a timeout set on these code blocks, and if a block takes longer than that to run, the grader throws an error. Right now the only workaround is: if you are confident your solution is correct and you are passing all the tests, keep resubmitting until you pass. I apologise for the inconvenience, I know this is not ideal, but we are working on it.
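
For context, the traceback above comes from nbgrader driving nbconvert's ExecutePreprocessor, and the "(30s)" in the log suggests a 30-second per-cell limit. Below is a minimal sketch of that behaviour; the filename and timeout value are taken from the log, and the exact grader configuration is an assumption on my part:

import nbformat
from nbconvert.preprocessors import ExecutePreprocessor

# Load the submitted notebook (filename as shown in the grader log).
nb = nbformat.read("Image_segmentation_Unet_v1.ipynb", as_version=4)

# Each cell must return its execute reply within `timeout` seconds.
# A cell that trains the U-Net for several minutes exceeds this limit,
# raising the TimeoutError seen in the log and failing the whole validation.
ep = ExecutePreprocessor(timeout=30, kernel_name="python3")
ep.preprocess(nb, {"metadata": {"path": "."}})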


It is not resolved, I’m still trying but it isn’t working yet.

Thanks Mubsi for responding here. A lot of us are having the same issue with the Week 2 programming submissions.


I have tried to submit more than 20 times and it still gives the same error.

Hi Mubsi, I have tried more than 20 times and still get the same error. My question is about the deadline: the assignment is due May 16, 11:59 PM PDT. Will this issue be resolved before then, or is there another solution?
Thanks in advance.

Hey everyone, unfortunately we are facing an issue in the backend infrastructure, in the part that handles assignment grading, so assignments will not grade at the moment. We and Coursera are working on this together and trying to resolve it as soon as possible, but it may take a while, as it happened after office hours. We apologise for the inconvenience. I shall keep you posted.

And as for the deadline, don't worry about it: if you are close to the deadline and still have work remaining, Coursera automatically offers to enroll you in a new session.


Hi @Mubsi, I am also facing the same issue. All the tests passed, but the grader gave it 0. I have also followed the steps

  1. Press: Kernel → Restart & Clear Output
  2. Press: File → Save and Checkpoint

and it is still graded zero. Please mention me as well once the issue is resolved. Thanks.

Hi guys, comment out the line
model_history = unet.fit(train_dataset, epochs=EPOCHS)
and then submit. I did it and got 100/100
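
To spell out the workaround, here is a sketch of how that cell looks before submission; unet, train_dataset, and EPOCHS are the names already defined in the assignment notebook:

# Comment out the training call before submitting, so this cell finishes
# well within the grader's 30-second per-cell timeout.
# model_history = unet.fit(train_dataset, epochs=EPOCHS)

# Note: with training disabled, any later cell that references model_history
# (e.g. model_history.history) will fail if you run the notebook locally,
# so only leave this commented out for the graded submission.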


Please check this post frequently for updates: Course 4 Assignment Submission Issues

Hi vrohith, if you comment out that line, the next cell's model_history.history will not print. How are you submitting? Do you clear the outputs, save the checkpoint, and then submit, or do you run up to that point and then submit?

@vrohithk56 Thanks, it works for me. @shivraj you can comment out the line mentioned by vrohith and then do

Press: Kernel → Restart & Clear Output
Press: File → Save and Checkpoint

and submit. This will work.

It takes a bit of time, roughly 15 minutes in my case.

Yeah, it's taking about the same time. Please also help me resolve the issue in the Week 4 assignment; only the last training step is pending.

This works perfectly.