I have a problem with severe overfitting that is preventing me from getting good predictions. So far I have tried (with no success) multiple values for L1, L2, and L1_L2 regularization, Dropout, the number of layers and neurons, the data size, the learning rate, and the patience, and I couldn’t make it better. I would really appreciate any suggestions to help me improve this training.
Maybe I’m applying it wrong, so if somebody could please give me some feedback I would really appreciate it.
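For reference, here is a minimal sketch of how L2 regularization and Dropout are typically applied in a Keras binary classifier, in case it helps compare against what I did. The layer sizes, the 0.2 dropout rate, the 1e-4 penalty, and the input dimension (20) are placeholders, not the actual values from my project.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),  # placeholder feature count
    # kernel_regularizer penalizes large weights in this layer only
    layers.Dense(32, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    # Dropout is a separate layer placed after the layer it regularizes
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```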
This is the link for the project in my github:
I have commented some details in the code for guidance so you can understand what is going on.
I’ll be watching for any updates, and if you need more information please let me know.
I’ll just toss in a couple of comments, keeping in mind that the purpose of the Coursera Projects is for students to demonstrate their own skills.
You’re using both L1 and Dropout. Both of them are regularization methods. Why use both? I think that just complicates your efforts to understand what’s happening.
Your model is pretty complicated. Did you try simpler models first? How much performance did you gain with the more complicated models?
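One way to act on this advice is to establish a bare baseline first: a single hidden layer with no regularization at all, trained once to record reference metrics, then one change at a time on top of it. A minimal sketch (the hidden size of 16 and the function name are placeholders, not from the original project):

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_baseline(n_features):
    """Simplest reasonable binary classifier: one hidden layer, no regularization."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Train this first and record val_loss / val_accuracy, then add exactly
# one change (an extra layer, or a single regularizer) and compare.
```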
Thanks for your feedback @TMosh! I really appreciate your comments.
I agree with the purpose of the Coursera Projects. In fact, the highest score I got on this project was 68%, after many attempts trying different setups with the model.
As a result, I wasn’t really satisfied with the decision that led me to that grade: the model’s loss and val_loss weren’t very good, but I adjusted the prediction threshold for class 1 and got a higher score. Earlier I had gotten better loss and val_loss results, but the grade was worse.
This is why I decided to ask for help here, for comments like yours that could help me understand this part better, because I feel a bit stuck.
I’m going to try making it simpler again. It’s true that in trying to make it better, I made it more complex by adding more methods. Thanks for the answer!
Hey @TMosh, just wanted to make an update on what I did:
Started with a simpler model, using 2 hidden layers, then 3 to see how it affected the results; the best results came with 3 hidden layers.
Used Dropout and L1 independently, and the best results came with Dropout, using 0.2 in the first hidden layer. When I started combining them, the variance in the loss and accuracy increased.
I tried changing the batch size, but 256 was still the best.
I also decreased and increased the learning rate for the optimizer, but it had no real effect on the variance.
On the other hand, I tried using fewer input parameters, but it didn’t change the results.
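Putting the pieces above together, the setup I ended up with looks roughly like this sketch: three hidden layers, Dropout(0.2) after the first, Adam, batch size 256, and early stopping. The hidden sizes (64/32/16), the learning rate, the patience of 10, and the feature count are placeholders for illustration, not the exact values I used.

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),          # placeholder feature count
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),                  # the only regularizer kept
    layers.Dense(32, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Stop when val_loss stops improving and roll back to the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=10, restore_best_weights=True)

# history = model.fit(X_train, y_train,
#                     validation_data=(X_val, y_val),
#                     batch_size=256, epochs=200,
#                     callbacks=[early_stop])
```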
The best I have gotten so far is a loss of 0.58 and an accuracy of 0.69. I’m aiming to decrease the loss as much as possible and increase the accuracy, so I would really appreciate any other tips or help.