Error in upgrading TFjob Manifest

Please help me solve this

@chris.favila please help. I have tried it 12 times now.

Hi Apica! Can you post your tfjob.yaml here? Please check that you edited it correctly. You should modify these fields:

  • image
  • --saved_model_path
  • --checkpoint_path

All three fields include your project ID in the string. Please check the sample file in the lab instructions. You can edit the file with the Cloud Editor: click the Cloud Editor button at the upper right of the Cloud Shell you are typing in to open a file manager-like interface, then navigate to tfjob.yaml under the lab-files folder. From there, you can edit the file to have the correct image and paths. Save, then click the Cloud Shell button again to go back to your terminal and run the next commands.
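As a quick way to review the three fields without opening the editor, something like the sketch below could be run in Cloud Shell. The helper name show_tfjob_fields and the default path lab-files/tfjob.yaml are assumptions for illustration; adjust the path to wherever your copy of the file lives.

```shell
# Hypothetical helper (not part of the lab): print, with line numbers, the
# lines of the manifest that mention the three fields to edit.
show_tfjob_fields() {
  manifest="${1:-lab-files/tfjob.yaml}"   # default path is an assumption
  grep -nE -e 'image:|--saved_model_path|--checkpoint_path' "$manifest"
}
```

Running show_tfjob_fields with no argument prints the matching lines, so you can check at a glance that each one contains your project ID.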

Also, can you let us know what command you were running that led to the 4th screenshot above? There’s an “object must be a non-empty string” error, which I haven’t encountered yet in this lab. That might help us in debugging.

Lastly, please also provide a screenshot of the output of this command:

gcloud container images list

Just leave them here and I’ll take a look asap tomorrow. I have to log out now because it’s already past midnight on my side of the world. Hope you understand. Rest assured that you will be able to complete this lab. Thanks!

Tried again, and this time I am getting an error.

Hi Apica. The tfjob.yaml looks correct. I’ll try this out and update you asap.

By the way, if you will reply, please use the reply button at the bottom left of this box. Click that instead of the blue Reply button at the bottom of all threads. I think that only sends notifications to the owner of the topic (i.e. you). Thanks!

Hello again. I retried the lab and did not run into any issues. For your next attempt, please send a screenshot of these 3 commands when you encounter them in the lab:

  1. gsutil ls - The result should look something like this:

  2. gcloud container images list - It should look something like this:

  3. kubectl describe tfjob $JOB_NAME - You sent something similar above. It should look something like this. Notice that the image and bucket names are the same as the output of the first two commands above.
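If it helps, the three commands can be wrapped in one function so the output is captured in a single place for posting. This is only a sketch: the function name collect_lab_debug_info is made up here, and it assumes gsutil, gcloud, and kubectl are available in your Cloud Shell session and that JOB_NAME is set as in the lab.

```shell
# Hypothetical wrapper: run the three diagnostic commands in order and label
# each section so the output is easy to read in a forum post.
collect_lab_debug_info() {
  job_name="${1:-$JOB_NAME}"   # falls back to the lab's JOB_NAME variable

  echo "== gsutil ls =="
  gsutil ls

  echo "== gcloud container images list =="
  gcloud container images list

  echo "== kubectl describe tfjob $job_name =="
  kubectl describe tfjob "$job_name"
}
```

For example, collect_lab_debug_info "$JOB_NAME" > debug.txt would save everything to a file you can paste from.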

After that, the training should commence. Hope you’re able to complete the lab in the next one!

Oh, I think I see the issue now. In your tfjob.yaml, there was a string prepended to the paths. That should be removed because it points to the Google Container Registry, and that’s not where your buckets are stored. Please see the sample output in my post above. Thanks!

I did not see any notice about the number of attempts before. Now it says “your quota has been exceeded for the lab”.
What do I do?

Okay, I checked the other thread regarding this. I reached support and the issue is solved.

With and without the prefix in both the image path and the saved model path, I am GETTING THE SAME ERROR.

Hi Apica. Looks like you misspelled qwiklabs. It is shown as qwicklabs in the screenshot. Kindly refer to my instructions in this post to make sure you don’t misspell anything and to ensure the resources are properly created. Please provide those three screenshots next time for easier debugging.

Also, please do not remove the prefix in the image field. You only need to remove it in saved_model_path and checkpoint_path. As mentioned before, the strings you will put there will depend on the output of gsutil ls and gcloud container images list. Thanks.
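A rough way to check the two path fields is sketched below. It rests on two assumptions that are not confirmed in this thread: that each flag appears in tfjob.yaml as --flag=value on a single line, and that the value should be a Cloud Storage URL starting with gs:// (the form gsutil ls prints for buckets). The function name check_tfjob_paths is made up for illustration. The image field is deliberately not checked, since its prefix should stay.

```shell
# Hypothetical check: report whether --saved_model_path and --checkpoint_path
# begin with gs://. Returns non-zero if either one does not.
check_tfjob_paths() {
  manifest="${1:-lab-files/tfjob.yaml}"   # default path is an assumption
  rc=0
  for flag in --saved_model_path --checkpoint_path; do
    # Assumes the "<flag>=<value>" form; take the value from the first match.
    value=$(sed -n "s/.*${flag}=\([^\"' ]*\).*/\1/p" "$manifest" | head -n 1)
    case "$value" in
      gs://*) echo "OK   $flag=$value" ;;
      *)      echo "FIX  $flag=$value (expected a gs:// bucket path)"; rc=1 ;;
    esac
  done
  return "$rc"
}
```

If your manifest lists the flag and its value on separate lines, the sed pattern would need adjusting.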

@chris.favila thank you. It is done.

Awesome! Glad you were able to complete it. Next time, kindly create the topic in the correct course category so the course mentors can see it. This was initially posted in the News and Announcements category and they are not monitoring that. Thanks!

For whoever came to this thread because of the “object must be a non-empty string” error: please check if you forgot to add -bucket as part of the bucket name in the tfjob.yaml file.
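A small sketch for spotting that mistake is below. It assumes the bucket names appear in tfjob.yaml as gs:// URLs and simply flags any bucket whose name does not contain -bucket; the function name check_bucket_names is hypothetical.

```shell
# Hypothetical check: list every gs:// bucket referenced in the manifest and
# flag the ones whose name lacks "-bucket". Returns non-zero if any is flagged.
check_bucket_names() {
  manifest="${1:-lab-files/tfjob.yaml}"   # default path is an assumption
  rc=0
  # Extract "gs://<bucket>" up to the first slash, quote, or space.
  for bucket in $(grep -oE "gs://[^/\"' ]+" "$manifest" | sort -u); do
    case "$bucket" in
      *-bucket*) echo "OK   $bucket" ;;
      *)         echo "FIX  $bucket (no -bucket in the name)"; rc=1 ;;
    esac
  done
  return "$rc"
}
```

Compare any flagged name against the output of gsutil ls before editing the file.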

Yep, that was me :man_facepalming: