oob_score and checking for overfitting in a random forest model

Here is the code. The score is decent enough, and oob_score already gives something like a cross-validation score.
Do I need to check the training score to understand whether there is overfitting?

For instance, if the training score were 99%, which is possible with tree ensembles, I would say there is an overfitting problem.
Or is this analysis redundant here given the oob_score?
What I understand is that by setting oob_score to True, the model uses the samples left unused while building each tree of the ensemble to calculate a test score.
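To make that concrete, here is a minimal self-contained sketch of the comparison I have in mind. It uses a synthetic dataset (make_classification is a stand-in, since the census data from the book isn't shown here) and prints the training score next to the OOB score:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the census data used in the book
X, y = make_classification(n_samples=2000, n_features=20, random_state=2)

rf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=2, n_jobs=-1)
rf.fit(X, y)

# Training score: accuracy on the same samples the trees were fit on
train_score = rf.score(X, y)

# OOB score: each sample is predicted only by the trees whose bootstrap
# sample did NOT include it, so it acts like a built-in validation estimate
print(f"train: {train_score:.3f}  oob: {rf.oob_score_:.3f}")
```

A large gap between the two (near-perfect training accuracy versus a much lower OOB score) is the usual sign of memorization; the OOB score on its own is already an honest estimate of generalization.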

The book that I read suggests that random forests are overfitting-free by design:

It turns out that decision tree hyperparameters are not as significant within random forests, since random forests cut down on variance by design.

What do you guys think about this?

from sklearn.ensemble import RandomForestClassifier

# oob_score=True scores each sample using only the trees that
# did not see it in their bootstrap sample
rf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=2, n_jobs=-1)
rf.fit(X_census, y_census)
The score is as follows:

Hi @mehmet_baki_deniz,

Can you mention the week # and the course name this is from?


This code is not from any course. It is from a book that I read on decision trees. That is why I posted it to the general discussions.