oob_score and checking for overfitting in a random forest model

Here is the code. The score is decent enough, and oob_score already gives something like a cross-validation score.
Do I need to check the training score to understand whether there is overfitting?

For instance, if the training score were 99%, which is possible with tree ensembles, I would say there is an overfitting problem.
Or is this analysis redundant here given the oob_score?
What I understand is that by setting oob_score to True, the model uses the samples left unused while building each tree of the ensemble to calculate a test score.
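To make that concrete, here is a minimal self-contained sketch of the comparison I have in mind. It uses a synthetic dataset (make_classification is a stand-in, since the census data from the book isn't shown here) and prints the training score next to the OOB score:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the census data used in the book
X, y = make_classification(n_samples=2000, n_features=20, random_state=2)

rf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=2, n_jobs=-1)
rf.fit(X, y)

# Training score: accuracy on the same samples the trees were fit on
train_score = rf.score(X, y)

# OOB score: each sample is predicted only by the trees whose bootstrap
# sample did NOT include it, so it acts like a built-in validation estimate
print(f"train: {train_score:.3f}  oob: {rf.oob_score_:.3f}")
```

A large gap between the two (near-perfect training accuracy versus a much lower OOB score) is the usual sign of memorization; the OOB score on its own is already an honest estimate of generalization.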

The book that I read suggests that random forests are overfitting-free by design:

It turns out that decision tree hyperparameters are not as significant within random forests, since random forests cut down on variance by design.

What do you guys think about this?

from sklearn.ensemble import RandomForestClassifier

# oob_score=True scores each sample using only the trees that
# did not see it in their bootstrap sample
rf = RandomForestClassifier(n_estimators=50, oob_score=True, random_state=2, n_jobs=-1)
rf.fit(X_census, y_census)
The score is as follows:

Hi @mehmet_baki_deniz,

Can you mention the week # and the course name this is from?


This code is not from any course. It is from a book that I read on decision trees. That is why I posted it to the general discussions.