I have a problem with severe overfitting that is preventing me from getting good predictions. So far I have tried (with no success) multiple values for L1, L2, and L1_L2 regularization, Dropout, the number of layers and neurons, the data size, the learning rate, and the patience, and I couldn’t make it better. I would really appreciate any suggestions to help me improve this training.
Maybe I’m applying it wrong, so if somebody could please give me some feedback I would really appreciate it.
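For reference, here is a minimal sketch of how L2 regularization and Dropout are typically applied in a Keras binary classifier, in case it helps compare against what I did. The layer sizes, the 0.2 dropout rate, the 1e-4 penalty, and the input dimension (20) are placeholders, not the actual values from my project.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),  # placeholder feature count
    # kernel_regularizer penalizes large weights in this layer only
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    # Dropout is a separate layer placed after the layer it regularizes
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```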
This is the link for the project in my github:
I have commented some details in the code for guidance so you can understand what is going on.
I’ll be watching for any updates, and if you need more information please let me know.
I’ll just toss in a couple of comments, keeping in mind that the purpose of the Coursera Projects is for students to demonstrate their own skills.
You’re using both L1 and Dropout. Both of them are regularization methods. Why use both? I think that just complicates your efforts to understand what’s happening.
Your model is pretty complicated. Did you try simpler models first? How much performance did you gain with the more complicated models?
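One way to act on this advice is to establish a bare baseline first: a single hidden layer with no regularization at all, trained once to record reference metrics, then one change at a time on top of it. A minimal sketch (the hidden size of 16 and the function name are placeholders, not from the original project):

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_baseline(n_features):
    """Simplest reasonable binary classifier: one hidden layer, no regularization."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Train this first and record val_loss / val_accuracy, then add exactly
# one change (an extra layer, or a single regularizer) and compare.
```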
Thanks for your feedback @TMosh! I really appreciate your comments.
I agree with the purpose of the Coursera Projects. In fact, the highest score I got on this project was 68%, after many attempts trying different setups with the model.
As a result, I wasn’t really satisfied with the decision that led me to that grade: the model’s loss and val_loss weren’t very good, but I adjusted the prediction threshold for class 1 and got a higher score. Earlier I had gotten better loss and val_loss results, but the grade was worse.
This is why I decided to ask for help here, for comments like yours that could help me understand this part better, because I feel a bit stuck.
I’m going to try making it simpler again. It’s true that in trying to make it better, I made it more complex by adding more methods. Thanks for the answer!
Hey @TMosh, just wanted to make an update on what I did:
Started with a simpler model, using 2 hidden layers, then 3 to see how it affected the results; the best results came with 3 hidden layers.
Used Dropout and L1 independently, and the best results came with Dropout, using 0.2 in the first hidden layer. When I started combining them, the variance in the loss and accuracy increased.
I tried changing the batch size, but 256 was still the best.
I also decreased and increased the learning rate for the optimizer, but it had no real effect on the variance.
On the other hand, I tried using fewer input parameters, but it didn’t change the results.
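Putting the pieces above together, the setup I ended up with looks roughly like this sketch: three hidden layers, Dropout(0.2) after the first, Adam, batch size 256, and early stopping. The hidden sizes (64/32/16), the learning rate, the patience of 10, and the feature count are placeholders for illustration, not the exact values I used.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),          # placeholder feature count
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),                  # the only regularizer kept
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Stop when val_loss stops improving and roll back to the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)

# history = model.fit(X_train, y_train,
#                     validation_data=(X_val, y_val),
#                     batch_size=256, epochs=200,
#                     callbacks=[early_stop])
```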
The best I have gotten so far is a loss of 0.58 and an accuracy of 0.69. I’m aiming to decrease the loss as much as possible and increase the accuracy, so I would really appreciate any other tips or help.