Usage of Transfer Learning

I have implemented pneumonia detection using transfer learning with Keras and TensorFlow, and I'm getting the following results:

                     Training accuracy:      98.90%
                     Validation accuracy:    97.69%
                     Test accuracy:          90.38%

There is a gap of roughly 8 percentage points between the training and test accuracy, which suggests high variance (overfitting), if I'm not wrong.

Is my current model's performance good enough, or is there a way to improve performance on the test set? The end goal is to deploy the model.

For more details on the model building process:

                     Pretrained model used:  ResNet50

                     Method 1:               Feature extraction with data augmentation

                     Method 2:               Fine-tuning the pre-trained model (best performance)

How many examples are you using in training, validation, and test?

How many features are in the data set?

Also, you need to consider whether an 8% difference is statistically significant.
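
One rough way to check this is to put a confidence interval around each accuracy figure and see whether the intervals overlap. The sketch below uses the Wilson score interval with the accuracies and set sizes quoted in this thread. It assumes each accuracy can be treated as a simple proportion of independently classified images, which is only an approximation (especially for the training accuracy), and the function name `wilson_interval` is just an illustrative helper, not part of Keras or TensorFlow.

```python
import math

def wilson_interval(accuracy, n, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion.

    accuracy: observed fraction correct, between 0 and 1
    n:        number of examples the accuracy was measured on
    z:        normal quantile (1.96 gives an approximate 95% interval)
    """
    denom = 1 + z**2 / n
    center = (accuracy + z**2 / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Figures quoted in the thread (assumed to be plain proportions).
train_low, train_high = wilson_interval(0.9890, 5216)
test_low, test_high = wilson_interval(0.9038, 624)

print(f"train: {train_low:.4f} to {train_high:.4f}")
print(f"test:  {test_low:.4f} to {test_high:.4f}")
print("intervals overlap:", test_high >= train_low)
```

With these figures the test interval comes out at roughly 88% to 92.5%, which sits well below the training interval, so under the stated assumptions the gap is unlikely to be sampling noise alone.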

I use the Chest X-Ray Images (Pneumonia) dataset from the Kaggle website (5,863 images, 2 categories):

                     Training set:     5216
                     Validation set:   8
                     Test set:         624

This is the number of features: