SageMaker real-time endpoint concurrency issue

Hi all, I deployed a real-time endpoint in SageMaker on an ml.t2.xlarge instance, with initial variant weight 1 and initial instance count 1. When I test the endpoint on its own it works fine, but after integrating it into an application, it fails when 4 users send POST requests to it. The issue seems to occur when the endpoint is called multiple times simultaneously. Is this something that can be fixed through configuration? If not, how can I solve it?

Welcome to the community @Danny_Xie_Li :grinning:
I don’t know why it doesn’t work, but maybe you could try using two smaller instances instead of a single ml.t2.xlarge?
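
It might also help to reproduce the failure outside the application, so you can see the actual error each simultaneous call returns. Here is a rough sketch of how that could look in Python, not a definitive diagnosis tool. The helper names `call_concurrently` and `make_sagemaker_invoker`, and the endpoint name `my-endpoint`, are made up for illustration; the only AWS call used is boto3's `invoke_endpoint` on the `sagemaker-runtime` client, and I'm assuming your endpoint accepts JSON.

```python
from concurrent.futures import ThreadPoolExecutor


def call_concurrently(invoke, payloads, max_workers=4):
    """Call invoke(payload) for every payload in parallel threads.

    Returns one (payload, result, error) tuple per payload, in input order.
    Exceptions are captured instead of raised so every failure is visible.
    """
    def safe_call(payload):
        try:
            return (payload, invoke(payload), None)
        except Exception as exc:  # keep the error so it can be inspected
            return (payload, None, exc)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_call, payloads))


def make_sagemaker_invoker(endpoint_name, content_type="application/json"):
    """Build an invoke(payload) function backed by a SageMaker endpoint.

    Hypothetical helper: boto3 is imported here so the rest of the
    sketch can run without AWS credentials.
    """
    import boto3

    client = boto3.client("sagemaker-runtime")

    def invoke(payload):
        response = client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType=content_type,
            Body=payload,
        )
        return response["Body"].read()

    return invoke


# Example usage (assumes an endpoint named "my-endpoint" exists):
# invoke = make_sagemaker_invoker("my-endpoint")
# for payload, result, error in call_concurrently(invoke, ['{"x": 1}'] * 4):
#     print("OK" if error is None else f"FAILED: {error!r}")
```

If the four calls fail here too, the captured exception messages should say more about what is going wrong than the application does.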