I found that whenever I made a request, I received the following error:
upstream connect error or disconnect/reset before headers. reset reason: connection termination
I could still pass the lab, but could not see the canary deployment of ResNet101. Does anyone know why this occurs, and how to get around it?
I also got the same error and for me the issue was due to a wrong model path in the ConfigMap.
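If you want to rule that out first, you can print the ConfigMap and check that the model path value points at the directory that actually contains the saved model. The ConfigMap name below is a placeholder; use whatever name your manifests define.
# print the config map and inspect the model path value
kubectl get configmap <CONFIGMAP_NAME> -o yaml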
Otherwise I would make sure that the model server is actually up and listening on port 8501, e.g.
# open a shell in the default container of your resnet50/resnet101 pod
kubectl exec -it <POD_NAME> -- /bin/bash
# check model server process
ps aux | grep tensorflow_model_server
# check port 8501 is open
apt-get update
apt-get install -y lsof
lsof -i :8501
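Another quick check from inside the same container, assuming the REST API is on port 8501 and the model is named resnet50 (use resnet101 for the other deployment), is TF Serving's model status endpoint, which should report the model version as AVAILABLE:
# install curl if the image does not already have it
apt-get install -y curl
# query the model status endpoint
curl http://localhost:8501/v1/models/resnet50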
I’m also getting the same error.
Same error, cannot get it to predict even on resnet50.
Looking at the pod status with kubectl get pods, the pods get restarted after the request, so it looks like something is crashing the pod upon arrival of the request.
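The logs of the crashed container instance and the pod events might show why, e.g. (substituting a pod name from kubectl get pods):
# logs from the previous (crashed) container instance
kubectl logs <POD_NAME> --previous
# events, last state and exit code of the pod
kubectl describe pod <POD_NAME>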
I tried following @tatoooo's suggestions but the service looks healthy and the ports are open.
@d1ggs I faced the same issue and the solution below seems to solve it.
Concretely, you can try modifying the tf-serving image in the deployment manifests tf-serving/deployment-resnet50.yaml and tf-serving/deployment-resnet101.yaml by adding a specific version tag (such as 2.8.0). After that you can reapply the deployment, e.g.
kubectl apply -f tf-serving/deployment-resnet<version>.yaml
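To confirm the pinned tag was actually picked up, you can read the image back from the deployment (assuming a single container in the pod spec; use the resnet101 name for the other deployment):
# show which image the deployment is now running
kubectl get deploy image-classifier-resnet50 -o jsonpath='{.spec.template.spec.containers[0].image}'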
To make sure the deployment is updated, you can also delete the deployment before reapplying it.
kubectl delete deploy image-classifier-resnet<version>
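Once it is reapplied, you can also wait for the new pods to become Ready before sending another prediction request, e.g. for resnet50:
# wait for the rollout to finish, then check the pods
kubectl rollout status deployment/image-classifier-resnet50
kubectl get pods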
Hope this helps.
N.B. Another debugging method you could try