C4W3: Implementing Canary Releases of TensorFlow Model Deployments with Kubernetes and Anthos Service Mesh

Derrick_Cline · July 5, 2023, 11:15pm

Despite following the instructions on the entry page to use num_nodes=4, I cannot get past task 2 and cannot complete the exercise. It just spins while attempting to deploy the cluster: “Cluster is being deployed…working…”. I have now used all my credits, and I am very frustrated that this exercise is not optional despite the laundry list of issues it has run into, judging by the community posts. Can someone please advise? Thank you!

Isaak_Kamau · July 6, 2023, 9:06am

Hello @Derrick_Cline

Here are a few steps you can try to troubleshoot the problem:

Refresh the page/open the lab in a new browser
Check your internet connection: Unstable or slow internet connectivity can hinder the deployment process. If possible, try switching to a different network or check if there are any network restrictions or firewalls in place.
If the lab deployment is still not progressing, consider reaching out to Qwiklabs support for assistance.

Regards,
Isaak

JonathanJordan21 · July 6, 2023, 7:02pm

I’m facing the same problem. Any solution?

Edit: I’ve found the problem. The us-west1-c zone was too crowded. I changed to us-east1-c and got 90 points. The Anthos installation assessment failed, but the other assessments succeeded.

Anthos Service Mesh was installed in the us-east1-c zone, and the assessment requires us-west1-c as the zone. But it didn’t affect the other assessments.

Jdafx · July 14, 2023, 6:58am

I had the same problem: cluster creation in us-west1-c did not complete after two attempts. I changed to us-west1-a and was able to create the cluster.
You will not get credit for the Anthos Service Mesh installation step, as it says “Please create a Kubernetes cluster in us-west1-c zone.”
However, I went on to complete all the other steps in the lab with a total score of 80%, which seemed to be enough to pass.

Mario_Schiappacasse · July 14, 2023, 11:50pm

Thanks, I have the exact same problem. I’m contacting Qwiklabs support.

Also, when I run kubectl get events, I get back: couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: connect: connection refused

Isaak_Kamau · July 15, 2023, 5:30pm

Hi @Mario_Schiappacasse
Check whether a Kubernetes cluster is already available on your local machine with: kubectl cluster-info
I suggest you restart the Kubernetes components and the API server.
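
For what it's worth, a "connection refused" on localhost:8080 usually means kubectl has no cluster credentials configured and is falling back to its default address, which would be expected if the cluster never finished creating. Here is a minimal sketch, assuming that is the cause, that recognises the error text and builds the gcloud command that fetches credentials for a GKE cluster. The helper names are hypothetical; cluster-1 and the zone are taken from this thread.

```python
from typing import List


def looks_like_missing_kubeconfig(error_text: str) -> bool:
    """Heuristic: kubectl tried its default address and nothing answered."""
    return "localhost:8080" in error_text and "connection refused" in error_text


def get_credentials_command(cluster: str, zone: str) -> List[str]:
    """Build the gcloud call that writes cluster credentials into kubeconfig."""
    return [
        "gcloud", "container", "clusters", "get-credentials",
        cluster, f"--zone={zone}",
    ]


if __name__ == "__main__":
    error = (
        'couldn\'t get current server API group list: Get '
        '"http://localhost:8080/api?timeout=32s": dial tcp 127.0.0.1:8080: '
        "connect: connection refused"
    )
    if looks_like_missing_kubeconfig(error):
        # Assumption: the cluster is named cluster-1 and lives in us-west1-c.
        print(" ".join(get_credentials_command("cluster-1", "us-west1-c")))
```

This only helps once the cluster actually exists; if creation is still stuck, there are no credentials to fetch yet.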

jamesblund · July 19, 2023, 7:17pm

Cluster deploy stuck at 64%. I tried the above-suggested zones plus us-central1-a - still nothing.

Error logs are showing: ZONE_RESOURCE_POOL_EXHAUSTED

UPDATE: Tried europe-west1-c and it seems to be working

Isaak_Kamau · July 19, 2023, 7:34pm

Hi @jamesblund
Cool!
A ZONE_RESOURCE_POOL_EXHAUSTED error means the zone currently lacks capacity for the requested resources, so it is usually resolved by trying a different zone.
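
If you want to try several zones in a row, something like the following might save some clicking. It is only a sketch: the function names are made up, the cluster name cluster-1 and num_nodes=4 come from this thread, and the lab's real create command may pass additional flags that are not shown here.

```python
import subprocess
from typing import Callable, Iterable, List, Optional


def gcloud_create_command(zone: str, name: str = "cluster-1",
                          num_nodes: int = 4) -> List[str]:
    """Build a basic GKE cluster-creation command for one zone."""
    return [
        "gcloud", "container", "clusters", "create", name,
        f"--zone={zone}", f"--num-nodes={num_nodes}",
    ]


def run_gcloud_create(zone: str) -> bool:
    """Run the create command; True if gcloud exits successfully."""
    return subprocess.run(gcloud_create_command(zone)).returncode == 0


def first_working_zone(
    zones: Iterable[str],
    create: Callable[[str], bool] = run_gcloud_create,
) -> Optional[str]:
    """Try each zone in order and return the first one where creation succeeds."""
    for zone in zones:
        if create(zone):
            return zone
        # Note: a failed attempt may leave a partial cluster behind that
        # has to be deleted before retrying under the same name.
    return None
```

For example, first_working_zone(["us-west1-c", "us-west1-a", "europe-west1-c"]) tries the zones mentioned in this thread in that order. Keep in mind that the Anthos installation check expects us-west1-c, so any other zone may cost you those points.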

grdvnl · July 20, 2023, 4:39am

I am facing similar issues completing this lab. The cluster creation process is stuck at

Creating cluster cluster-1 in us-west1-c... Cluster is being deployed...working.

It seems like I have also exhausted the quota of how many times I can start the lab. This is the only exercise left for me to finish the specialization.

Can someone advise me on the best course of action?

Thanks!

chris.favila · July 20, 2023, 11:35am

Hi everyone! Thank you for bringing this to our attention. We’ve reported it to our partners so it can be fixed asap. Please watch this thread for updates. In the meantime, you can try switching the CLUSTER_ZONE to europe-west1-c or us-west1-a as mentioned by other learners. Hopefully, that will have more resources to create the cluster. You may not get the perfect score but it will be enough to pass the lab. Thank you and sorry for the inconvenience!

Sudharsan_Sundararaj · July 21, 2023, 6:42am

Thanks, you are a lifesaver. I was on my last free attempt, since all the previous labs had failed due to resource issues. Changing the region helped me complete the specialisation.

Shabbir_Marfatiya · July 27, 2023, 4:24am

I am facing this issue after creating the cluster in europe-west1-c.

Shabbir_Marfatiya · July 27, 2023, 5:10am

I am getting it due to a version problem. I checked kubectl api-versions and found that "autoscaling/v2beta1" is not available. The available versions are "autoscaling/v1" and "autoscaling/v2". I tried changing the API version to v1 and to v2, but I get errors there as well. Please help me out if I am missing something.

Claudia_Fernandes · July 29, 2023, 12:41pm

I don’t know if this would work for you, but I completed my entire project with a Europe cluster. Completed all the assessments.

Then I went back, deleted the cluster, and created one again with us_west1.

If you can see, I ran the
first and then the

Because if you run them both together, you will get a resource error, as shown in my screenshot.

After that, I just ran the cell that you mentioned and that was it: my assessment was cleared.

Hope it helps

Mario_Mitter · August 2, 2023, 8:13am

I have the same issue.

Mario_Mitter · August 2, 2023, 8:14am

I face the same problem. It could be a versioning problem (see the Stack Overflow question: kubernetes - No matches for kind "HorizontalPodAutoscaler" in version "autoscaling/v2" when installing knative serving). I have no clue how to fix it in this assignment.

Mario_Mitter · August 2, 2023, 8:28am

You can fix this by using the newer config located in samples/gateways/istio-ingressgateway/autoscalingv2: just replace samples/gateways/istio-ingressgateway/autoscaling-v2beta1.yaml with samples/gateways/istio-ingressgateway/autoscalingv2/autoscaling-v2.yaml
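
As a rough sketch of that choice, the snippet below picks the manifest path from the text that kubectl api-versions prints. The function name is hypothetical, and it assumes both files exist at the paths quoted above in your copy of the samples.

```python
V2_MANIFEST = "samples/gateways/istio-ingressgateway/autoscalingv2/autoscaling-v2.yaml"
V2BETA1_MANIFEST = "samples/gateways/istio-ingressgateway/autoscaling-v2beta1.yaml"


def pick_autoscaling_manifest(api_versions_output: str) -> str:
    """Choose the HorizontalPodAutoscaler manifest the cluster can accept.

    api_versions_output is the raw text of `kubectl api-versions`,
    one API group/version per line.
    """
    versions = set(api_versions_output.split())
    if "autoscaling/v2" in versions:
        return V2_MANIFEST
    if "autoscaling/v2beta1" in versions:
        return V2BETA1_MANIFEST
    raise ValueError("cluster serves neither autoscaling/v2 nor autoscaling/v2beta1")


if __name__ == "__main__":
    # Versions reported earlier in this thread for the europe-west1-c cluster.
    print(pick_autoscaling_manifest("autoscaling/v1\nautoscaling/v2\n"))
```

You would then point the lab's apply step at whichever path it returns instead of the v2beta1 file.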

Agustin_Ferreira_Pos · August 5, 2023, 2:35pm

I have the same issue, which seems to come from following the instructions in the task to switch to europe…

chris.favila · August 8, 2023, 2:57pm

Hi everyone! The zone resource issue should now be resolved. I just tried the lab and was able to create the cluster.

Hope it also works for you!

kanchanrp · August 10, 2023, 12:09pm

Hi. I tried this lab yesterday and had the same problem. I tried multiple zones but kept getting the same resource-not-available error. I now have only one credit left, and I do not want to use it, fail again, and then have to pay. Is there a way to confirm resource availability before trying?
