Hello, I am currently working on a new project: developing a model that predicts the price of insurance claims from information about clients, contracts, and claims. The training set has over 5,400 examples. I used a neural network and it overfits: the MSE is low on the training set but high on the CV and test sets. I've tried several methods, such as feature engineering, tuning the regularization term, and adjusting the number of epochs, but I still get the same result: low MSE on the training set and a much higher MSE on the CV and test sets.
Please, I need help, because this is an important project for me!
Also, I think your dataset is fairly small and probably has a lot of variability (meaning the examples are not closely related to each other). You might need a bigger dataset, which should reduce the effect of that variability.
With small datasets, I think k-fold cross-validation can help improve performance too!
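To make the cross-validation suggestion concrete, here is a minimal sketch in plain Python. The helper names (`k_fold_indices`, `fit_line`, `cross_val_mse`) and the toy data are made up for illustration, and a one-feature least-squares line stands in for the neural network. The point is only the splitting logic: every example is used for validation exactly once, so the averaged MSE is a steadier estimate than a single train/CV split.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that partition range(n_samples) into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (hypothetical stand-in for the real model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def mse(model, xs, ys):
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_mse(xs, ys, k=5, seed=0):
    """Return the list of validation MSEs, one per fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k, seed):
        model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(mse(model, [xs[i] for i in val_idx], [ys[i] for i in val_idx]))
    return scores

# Toy data, invented for this example: y = 3x + 2 plus Gaussian noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3 * x + 2 + rng.gauss(0, 1) for x in xs]

scores = cross_val_mse(xs, ys, k=5)
print("per-fold MSE:", [round(s, 3) for s in scores])
print("mean MSE:", round(sum(scores) / len(scores), 3))
```

If the per-fold MSEs differ a lot from each other, that spread is itself a hint that the data is noisy or too small for the model being trained.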
I was training a neural network.
If I use a simple network with a small number of layers, I get high error on both the training and CV sets.
If I go further and add layers to make the model fit well, I get overfitting. I tried feature engineering, L2 regularization, and adjusting the number of epochs and batch size, but I still have the overfitting issue.
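For reference, the combination of L2 regularization and a validation-driven epoch count described above can be sketched as follows. This is a rough illustration only, not the actual project code: a linear model trained by plain gradient descent stands in for the neural network, and the function names, hyperparameters (`l2`, `lr`, `patience`), and toy data are all assumptions. Instead of fixing the number of epochs by hand, training stops once the validation MSE has not improved for `patience` epochs, and the best weights seen so far are kept.

```python
import random

def predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def mse(w, b, X, y):
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def train(X_tr, y_tr, X_val, y_val, l2=0.01, lr=0.1, max_epochs=500, patience=10):
    """Gradient descent on MSE + l2 * ||w||^2, with early stopping on validation MSE.

    Returns (best validation MSE, best weights, best bias, epoch of the best model).
    """
    n, d = len(X_tr), len(X_tr[0])
    w, b = [0.0] * d, 0.0
    best = (mse(w, b, X_val, y_val), list(w), b, 0)
    for epoch in range(1, max_epochs + 1):
        errs = [predict(w, b, x) - t for x, t in zip(X_tr, y_tr)]
        # Gradient of the data term plus the L2 penalty (the bias is not penalized).
        grad_w = [2 * sum(e * x[j] for e, x in zip(errs, X_tr)) / n + 2 * l2 * w[j]
                  for j in range(d)]
        grad_b = 2 * sum(errs) / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
        val = mse(w, b, X_val, y_val)
        if val < best[0]:
            best = (val, list(w), b, epoch)  # remember the best model so far
        elif epoch - best[3] >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best

# Toy data, invented for this example: 5 features, only the first two matter.
rng = random.Random(0)

def make(n):
    X = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(n)]
    y = [4 * x[0] - 2 * x[1] + rng.gauss(0, 0.5) for x in X]
    return X, y

X_tr, y_tr = make(60)
X_val, y_val = make(40)
best_val, w, b, best_epoch = train(X_tr, y_tr, X_val, y_val)
print("best validation MSE:", round(best_val, 3), "at epoch", best_epoch)
```

With this setup, `l2` and `patience` become the knobs to tune against the CV set, rather than the raw number of epochs.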