MLEP C3W3 Qwiklabs problem

I got through most of “Distributed Multi-worker TensorFlow Training on Kubernetes” on Qwiklabs, but the last part took so long that the lab timed out before it finished running (I’m talking literally 45 minutes). None of the workers became “ready” in that entire time. Is this normal? Or is it because I’m using Firefox instead of Chrome?

Hi @dotthewise
I don’t know why your process is running so long.
Just to be sure, I ran the lab once more on my side and didn’t have any issues. Maybe my PC is more powerful or has more RAM, I don’t know. I used Firefox under Windows 10.
Anyway, you can ask Qwiklabs for an extension. Please take a look at this post. Many other learners have already run into the same problem with the Qwiklabs exercises.
Please let me know if it worked.

I deleted the current job and relaunched it, because in my case I hadn’t saved my changes to the file tfjob.yaml.

So, use kubectl delete tfjob $JOB_NAME to remove the job, and then kubectl apply -f tfjob.yaml to relaunch it.
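
For anyone who wants to repeat that sequence, here is a minimal sketch of it as a shell function. The function name relaunch_tfjob is made up for illustration and is not part of the lab. It assumes kubectl is already pointed at the lab cluster and that the edited tfjob.yaml has been saved in the current directory. The final kubectl get pods is only an extra check to see whether the workers come up.

```shell
# Hypothetical helper (not from the lab): delete a TFJob and recreate it
# from the saved manifest.
relaunch_tfjob() {
  job_name="$1"                # name of the existing TFJob, e.g. "$JOB_NAME"
  manifest="${2:-tfjob.yaml}"  # manifest path; defaults to tfjob.yaml

  # Remove the job that was created from the old, unsaved version of the file.
  kubectl delete tfjob "$job_name" || return 1

  # Recreate the job from the manifest that now contains the saved changes.
  kubectl apply -f "$manifest" || return 1

  # List the pods to check whether the workers are starting.
  kubectl get pods
}
```

Call it as relaunch_tfjob "$JOB_NAME" tfjob.yaml after saving the file.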
