C3W3 ImagePullBackOff error

Hello Team,

I am getting the error below when I check the pods.
Note: 1) I have changed the project ID in tfjob.yaml to the current GCP project ID.
2) I have also used the updated command for cluster creation:
gcloud container clusters create $CLUSTER_NAME \
  --project=$PROJECT_ID \
  --release-channel=stable \
  --cluster-version=1.21 \
  --machine-type=n1-standard-4 \
  --scopes compute-rw,gke-default,storage-rw \
  --num-nodes=3

Hi,

I think the problem is that you didn’t set the value of image in tfjob.yaml
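
One quick way to check this, as a rough sketch: print every image line in the manifest and compare it with what the pod events report. The file name tfjob.yaml is from this thread; the helper name show_images is hypothetical, and the exact image name depends on your own project.

```shell
# Hypothetical helper: print each "image:" line of a manifest with its line
# number, so a leftover placeholder or a wrong project ID is easy to spot.
show_images() {
  grep -n 'image:' "$1"
}

# Usage (assumes tfjob.yaml is in the current directory):
#   show_images tfjob.yaml
#
# To see why a pod could not pull its image, look at the Events section of:
#   kubectl describe pod <pod-name>
# It usually names the exact image reference that failed.
```

If the image reference printed here does not match an image that actually exists in your project's registry, the pods will keep failing to pull it.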

Hello Paul,

I have set the image field as well; please find it below for reference. When I ran it, all the pods showed as running, but after some time they went into an error state. I am stuck on the last assessment, and this has happened on nearly 10 attempts.


Before (you can see the timings):

After:

Regards,
Prateek

Hi Prateek. It seems you forgot to append the -bucket string to the model and checkpoint paths. Please see the example tfjob.yaml file shown in the lab instructions for reference. You will see where to place the -bucket there. It’s appended to the project ID. Hope this helps!
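
As a small sketch of the naming rule described above (an illustration only, not the lab's own script): the bucket name is the project ID with -bucket appended directly. The helper name bucket_uri, the gs:// prefix, and the sample project ID my-project are assumptions for the example.

```shell
# Hypothetical helper: build the bucket URI from a project ID by appending
# "-bucket" directly to it (no slash or space in between).
bucket_uri() {
  printf 'gs://%s-bucket\n' "$1"
}

# Example with a placeholder project ID:
#   bucket_uri my-project
# The model and checkpoint paths in tfjob.yaml would then start with this URI.
```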

Hi Chris

I have appended the -bucket string as well.
I am still facing ImagePullBackOff.

Please help.




Please help me solve this error. I have been trying for 2 days.