C3W3 - Distributed Multi-worker TensorFlow Training on Kubernetes

I am unable to complete the "Distributed Multi-worker TensorFlow Training on Kubernetes" lab because of multiple issues.

  1. The tfjob.yaml step hangs indefinitely after I edit the bucket location.
  2. gcr.io/<YOUR_PROJECT_ID>/mnist-train shows “bash: YOUR_PROJECT_ID: No such file or directory” even though the project ID is entered correctly.
  3. After several attempts, Qwiklabs shows a message that the quota is used up and I cannot access the lab any further.

Could anyone please help me solve these issues? I have been stuck on this lab for almost two weeks and have been unable to complete it. I have now passed the due date, and I fear Coursera will charge me again.

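Hi @vikramnimmakuri, welcome to the course!

The error message you posted suggests that you need to replace <YOUR_PROJECT_ID> with the real project ID generated for your lab, something like qwiklabs-gcp-abcd1234etc. Bash treats the < character as input redirection, which is why it complains that YOUR_PROJECT_ID is not a file.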

To extend the quota, you can check this answer.

Hi @tranvinhcuong, thanks for the reply. I have solved the Qwiklabs quota issue; however, the “-bash: gcr.io/qwiklabs-gcp-01-fab0569fdaae/mnist-train: No such file or directory” error still occurs. I have also attached a screenshot of the error message for reference.

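Hi @vikramnimmakuri, the gcr.io/.... string is not supposed to be entered in the shell; it is the container image name and belongs in the yaml file. Typed at the prompt, bash tries to run it as a program path, which is where the "No such file or directory" message comes from.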
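
To make this concrete, here is a rough sketch of where the image name usually sits in a TFJob manifest and how the placeholder gets filled in. The manifest text is a hypothetical excerpt, not the lab's actual tfjob.yaml: the apiVersion, job name, replica count, and the helper name fill_project_id are assumptions for illustration, so compare with the file in your lab before relying on any of it.

```python
# Hypothetical excerpt of a TFJob manifest. The lab's real tfjob.yaml may use
# a different apiVersion, job name, replica count, and extra fields.
TFJOB_TEMPLATE = """\
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: multi-worker
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 3
      template:
        spec:
          containers:
            - name: tensorflow
              image: gcr.io/<YOUR_PROJECT_ID>/mnist-train
"""


def fill_project_id(manifest: str, project_id: str) -> str:
    """Return the manifest text with the placeholder replaced by project_id."""
    placeholder = "<YOUR_PROJECT_ID>"
    if placeholder not in manifest:
        raise ValueError("placeholder not found in manifest")
    # Replace the whole placeholder, angle brackets included.
    return manifest.replace(placeholder, project_id)


if __name__ == "__main__":
    # Example project ID in the Qwiklabs style; use the one shown in your lab.
    print(fill_project_id(TFJOB_TEMPLATE, "qwiklabs-gcp-abcd1234etc"))
```

In practice you would make the same edit by hand in the yaml file, removing the angle brackets along with the placeholder text, and then submit the file to the cluster (for example with kubectl apply -f tfjob.yaml) rather than typing the image path at the prompt.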

(I updated the title from C2W3 to C3W3)