The lab: Implementing Canary Releases of TensorFlow Model Deployments with Kubernetes and Anthos Service Mesh
When you get to Task 6 there is the error:
Does not have minimum availability
I’ve tried setting the --num_nodes value to 4 and also used the default of 3 (currently set in the lab).
This error happens every time no matter how carefully I do it. At this point I am not able to finish the course due to this bug in the lab.
Review the logs and error messages from your deployment to see if they provide any additional insight into the problem. Use the kubectl logs command to view the pod logs, and check resource utilization with the kubectl top command.
Thank you for the reply. I retried, and eventually, it worked with 4 nodes. Maybe there were some temporary resource issues?
Hi. I am getting similar issues at Task 6. When inspecting the pod with kubectl describe pods I get the warning message:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 3m39s default-scheduler 0/3 nodes are available: 3 Insufficient cpu. preemption: 0/3 nodes are available: 3 No preemption victims found for incoming pod.
Normal NotTriggerScaleUp 3m37s cluster-autoscaler pod didn't trigger scale-up:
I keep getting signed out of the account as well. I have tried completing the lab twice now, and both times this occurred at step 6.2.
I can create the deployment successfully, but it never becomes ready because of insufficient CPU.
Should I just wait and try again later or am I doing something wrong here?
Try reducing the CPU request of the pod: you can update the pod spec to request fewer CPU resources. Alternatively, use nodes with more CPU or add more nodes to the cluster. Please refer to this page for more info: Assign CPU Resources to Containers and Pods | Kubernetes
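To illustrate why the scheduler reports "Insufficient cpu", here is a rough Python sketch of the arithmetic involved: a pod fits on a node only if its CPU request is no larger than the node's allocatable CPU minus the requests of the pods already placed there. The node names, allocatable values, and requests below are made up for illustration and are not taken from the lab; the helper names (parse_cpu, Node, nodes_that_fit) are hypothetical as well. The real scheduler also considers memory, taints, and other constraints, so treat this only as a simplified model.

```python
from dataclasses import dataclass
from typing import List


def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


@dataclass
class Node:
    name: str
    allocatable_cpu: str      # CPU the node offers to pods (hypothetical value)
    requested_cpu: List[str]  # CPU requests of pods already scheduled on it

    def free_millicores(self) -> int:
        used = sum(parse_cpu(q) for q in self.requested_cpu)
        return parse_cpu(self.allocatable_cpu) - used


def nodes_that_fit(pod_request: str, nodes: List[Node]) -> List[str]:
    """Return the names of nodes with enough unrequested CPU for the pod."""
    need = parse_cpu(pod_request)
    return [n.name for n in nodes if n.free_millicores() >= need]


# Hypothetical 3-node cluster; the numbers are assumptions, not lab values.
nodes = [
    Node("node-a", "940m", ["300m", "400m"]),  # 240m free
    Node("node-b", "940m", ["250m", "500m"]),  # 190m free
    Node("node-c", "940m", ["600m", "100m"]),  # 240m free
]

# A pod requesting one full CPU fits nowhere: the "0/3 nodes are available" case.
print(nodes_that_fit("1", nodes))

# A smaller request of 200m fits on the nodes with at least 200m free.
print(nodes_that_fit("200m", nodes))
```

In this toy model, lowering the request or adding a node with enough free CPU changes the result, which matches the two remedies suggested above. The actual figures for your cluster appear under "Allocatable" and "Allocated resources" in the output of kubectl describe nodes.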