[C3W3 Task2] Installing the TFJob custom resource fails

I used the command " cd
SRC_REPO=https://github.com/kubeflow/manifests
kpt pkg get $SRC_REPO/tf-training@v1.1.0 tf-training" .
=> Success.

I used the command " kubectl create namespace kubeflow" .
=> Success.

I used the command " kubectl apply --kustomize tf-training/tf-job-crds/base" .
=> fail

error: resource mapping not found for name: "tfjobs.kubeflow.org" namespace: "" from "tf-training/tf-job-crds/base": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
ensure CRDs are installed first

Please help me solve that, thank you.
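
One way to confirm the cause, as a minimal sketch: the error says the cluster does not recognize CustomResourceDefinition in apiextensions.k8s.io/v1beta1, so listing the API versions the cluster serves should show whether that version is missing. The helper name serves_v1beta1_crds is hypothetical (not part of the lab); it reads the output of kubectl api-versions, which prints one group/version per line.

```shell
# Hypothetical helper: reads `kubectl api-versions` output on stdin and
# succeeds only if the cluster still serves apiextensions.k8s.io/v1beta1.
serves_v1beta1_crds() {
  # -x matches the whole line, so "apiextensions.k8s.io/v1" alone does not count.
  grep -qx 'apiextensions.k8s.io/v1beta1'
}

# Usage against a live cluster (assumes kubectl is configured for the lab cluster):
#   kubectl api-versions | serves_v1beta1_crds \
#     && echo "v1beta1 CRDs are served" \
#     || echo "v1beta1 CRDs are NOT served; the tf-training CRD will not apply"
```

If the check reports that v1beta1 is not served, the kubectl apply step will likely fail with the error quoted above.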

solved!

Hi, I got the same error as you. Could you explain the solution?

Thanks in advance, BR

The Task1 command.

gcloud container clusters create $CLUSTER_NAME \
  --project=$PROJECT_ID \
  --release-channel=stable \
  --machine-type=n1-standard-4 \
  --scopes compute-rw,gke-default,storage-rw \
  --num-nodes=3

Change to the following:

gcloud container clusters create $CLUSTER_NAME \
  --project=$PROJECT_ID \
  --release-channel=stable \
  --cluster-version=1.21 \
  --machine-type=n1-standard-4 \
  --scopes compute-rw,gke-default,storage-rw \
  --num-nodes=3

If you’ve already created the cluster earlier, you can delete it first with this before creating the new one:

gcloud container clusters delete $CLUSTER_NAME

Other commands are the same.

Thanks for your help!!! BR

I’m still getting an error on the following command:

> kubectl apply --kustomize tf-training/tf-job-crds/base
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/tfjobs.kubeflow.org unchanged

This is not an error, just a warning. The "unchanged" line means the CRD is already installed on the cluster.

You can ignore it.
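
To make the warning text concrete, here is a minimal sketch that classifies a Kubernetes version by the thresholds quoted in the warning (deprecated in v1.16+, unavailable in v1.22+). The function name crd_v1beta1_status is hypothetical, and the parsing assumes a 1.x version string such as "1.21" or "v1.27.3-gke.100".

```shell
# Hypothetical helper: classify a Kubernetes version string by how it treats
# apiextensions.k8s.io/v1beta1 CRDs, using the thresholds from the warning.
crd_v1beta1_status() {
  # Extract the minor version: the number after the first dot.
  minor=$(printf '%s\n' "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*$/\1/')
  if [ "$minor" -ge 22 ]; then
    echo "unavailable"   # kubectl apply fails with "no matches for kind"
  elif [ "$minor" -ge 16 ]; then
    echo "deprecated"    # kubectl apply works but prints the warning
  else
    echo "supported"
  fi
}

crd_v1beta1_status 1.21   # prints: deprecated
crd_v1beta1_status 1.22   # prints: unavailable
```

By the same thresholds, any version from 1.22 upward falls in the "unavailable" case, so the tf-training@v1.1.0 CRD manifest would not apply there as written.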

The lab’s check progress failed when I ran it a minute ago and now it’s passing.
Thank you for your help!

Hi! Thank you for spotting the problem and sharing the solution! I’ve edited your post so the new command is formatted like the command in the lab. I also added a command to delete the earlier cluster (which uses v1.22) in case the learner has already created it. We will report this to our partners so they can update the lab. Thanks again!

Thank you, this, along with @Himanshu_Goyal’s post, helped me complete the lab successfully.

Did you solve it?

Yes, as stated above.

Hello, I am still getting the same error after applying the fixes.

I ran the Task1 command, changing only the cluster version because 1.21 is not available:

gcloud container clusters create $CLUSTER_NAME \
  --project=$PROJECT_ID \
  --cluster-version=1.27 \
  --machine-type=n1-standard-4 \
  --scopes compute-rw,gke-default,storage-rw \
  --num-nodes=3

Any ideas how to solve this? Thanks!

Hello All,

I am also getting the same error. Which cluster version would you recommend?