TensorFlow data load from CPU to GPU for inference taking time

Hi, I am trying to scale up my NLP ensemble model (RETVec and GloVe embeddings). The prediction itself takes about 30 ms, but the overall response time goes up to 70 ms because of loading data from the host CPU to the GPU. Is there any solution for this? Also, is there any way to decrease the prediction time further? I tried TensorRT and ONNX, but they did not help.
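One common way to hide host-to-device copy latency is to stage batches with `tf.data` prefetching and keep the inference call compiled as a `tf.function`, so the next batch is transferred while the current one runs on the GPU. Below is a minimal sketch of that pattern; the model, batch size, and feature dimension are placeholders standing in for your ensemble, not your actual setup.

```python
import tensorflow as tf

# Placeholder sizes for illustration only.
BATCH, DIM = 32, 128

# Build a tf.data pipeline; prefetch(AUTOTUNE) overlaps the host->device
# copy of the next batch with GPU compute on the current batch.
ds = tf.data.Dataset.from_tensor_slices(tf.random.uniform((256, DIM)))
ds = ds.batch(BATCH).prefetch(tf.data.AUTOTUNE)

# Stand-in model; replace with the real ensemble.
model = tf.keras.Sequential([tf.keras.layers.Dense(16)])

# tf.function traces the call once, avoiding per-request Python overhead.
@tf.function
def predict(x):
    return model(x, training=False)

for batch in ds:
    out = predict(batch)
print(out.shape)
```

If requests arrive one at a time rather than as a dataset, micro-batching several requests before the copy can also amortize the transfer cost, since many small host-to-device copies are slower than one larger one.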