MLEP C4W3: Bug | Does not have minimum availability | Impossible to complete lab

The lab: Implementing Canary Releases of TensorFlow Model Deployments with Kubernetes and Anthos Service Mesh

When you get to Task 6 there is the error:

Does not have minimum availability

I’ve tried setting the --num_nodes value to 4 and also using the default of 3 (the value currently set in the lab).

This error happens every time no matter how carefully I do it. At this point I am not able to finish the course due to this bug in the lab.

Hello @Matt_Weber
Try to review the logs and error messages from your deployment to see if they provide any additional insight into the problem. Try the kubectl logs command to view the logs and you can check the resource utilization using the kubectl top command.

Thank you for the reply. I retried, and eventually, it worked with 4 nodes. Maybe there were some temporary resource issues?

Hi. I am getting similar issues at Task 6. When inspecting the pod with kubectl describe pods I get the warning message:

Tolerations:        node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                    node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason             Age    From                Message
  ----     ------             ----   ----                -------
  Warning  FailedScheduling   3m39s  default-scheduler   0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
  Normal   NotTriggerScaleUp  3m37s  cluster-autoscaler  pod didn't trigger scale-up:

I keep getting signed out of the account as well. I have tried completing the lab twice now, and both times this occurred at step 6.2.
I can create the deployment successfully, but it never becomes ready because of insufficient CPU.

Should I just wait and try again later or am I doing something wrong here?

Hello @pq53ui
Try reducing the CPU request of the pod: update the pod spec so that it requests less CPU. Alternatively, use nodes with more CPU or add more nodes to the cluster. Please refer to this Kubernetes documentation page for more info: Assign CPU Resources to Containers and Pods | Kubernetes
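
To illustrate why the scheduler reports "Insufficient cpu", here is a minimal Python sketch of the fit check: a pod is schedulable on a node only if the node's allocatable CPU minus the CPU already requested by other pods covers the new pod's request. The node names and all CPU figures below are hypothetical, not taken from the lab, and the helpers parse_cpu and nodes_that_fit are made up for this example. The real scheduler also checks memory, taints and other constraints.

```python
def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def nodes_that_fit(pod_request, nodes):
    """Return names of nodes whose free CPU covers the pod's CPU request.

    nodes maps a node name to (allocatable CPU, CPU already requested).
    """
    need = parse_cpu(pod_request)
    fitting = []
    for name, (allocatable, requested) in nodes.items():
        free = parse_cpu(allocatable) - parse_cpu(requested)
        if free >= need:
            fitting.append(name)
    return fitting


# Hypothetical cluster: three small nodes that are already mostly reserved.
nodes = {
    "node-1": ("940m", "700m"),  # 240m free
    "node-2": ("940m", "650m"),  # 290m free
    "node-3": ("940m", "800m"),  # 140m free
}

print(nodes_that_fit("500m", nodes))  # [] -> no node fits, like "0/3 nodes are available"
print(nodes_that_fit("250m", nodes))  # ['node-2'] -> a smaller request fits
```

In this sketch, lowering the request from 500m to 250m lets the pod fit on one node. Adding a node or using nodes with more allocatable CPU would have the same effect from the other side.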