CV error for a model with all the features vs a few features

hi all,

I have trained a logistic regression model after choosing the 5 most important features out of 12.

Then I tested the model with various polynomial degrees and recorded both CV and training accuracy.

I observe that the training-set accuracy with all 12 features is higher than for the 5-feature model at polynomial degree 2.

That is normal, right? After all, we select the most relevant features to avoid overfitting, but a more complex model with all the features will generally achieve higher training accuracy at the expense of increased overfitting.

In my case, I found that the 12-feature model doesn't have any overfitting issue either.
This makes me think it is always good practice to start with all the features at hand, and only consider feature selection if there is an overfitting issue even for the degree-1 model, because a more complex model will bring higher accuracy provided there is no overfitting.
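The comparison described above can be sketched with scikit-learn. This is a minimal illustration on synthetic data, not the actual dataset or code from the course; the sample sizes, hyperparameters, and the use of `make_classification` are all assumptions:

```python
# Hedged sketch: compare training vs. cross-validation accuracy for
# logistic regression over several polynomial degrees.
# Synthetic data stands in for the poster's real 12-feature dataset.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# 12 features total, of which 5 are actually informative.
X, y = make_classification(n_samples=400, n_features=12,
                           n_informative=5, random_state=0)

results = {}
for degree in (1, 2):
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        StandardScaler(),
        LogisticRegression(max_iter=5000),
    )
    model.fit(X, y)
    train_acc = model.score(X, y)               # accuracy on training data
    cv_acc = cross_val_score(model, X, y, cv=5).mean()  # 5-fold CV accuracy
    results[degree] = (train_acc, cv_acc)
    print(f"degree={degree}  train={train_acc:.3f}  cv={cv_acc:.3f}")
```

A large gap between the train and CV columns is the overfitting signal you describe: training accuracy alone can always be pushed up by a more complex model, so the decision should rest on the CV number.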

I wonder what your reflections are on my reasoning.

warmly, mehmet

Hi @mehmet_baki_deniz

Feature selection isn't really the way to go here, especially when you have only 12 features, since that is not a large number. Also, a feature can genuinely affect the output even when the relationship doesn't show up in the mathematical measures (feature-selection scores or correlations), so you might wrongly choose to drop it. (Personally, I once saw a dataset where the output was a disease and one feature was blood pressure: the correlation looked weak, but in fact blood pressure does affect that disease.) A couple of thoughts:

  • I think you can use the correlation between features and look at the most highly correlated pairs: if, for example, two columns have a correlation of 0.99, they are essentially duplicates, so drop one of them.
  • Personally, if I had to drop many columns, I would prefer PCA to reduce the number of features while still preserving information from all of them.
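Both ideas above can be sketched as follows; the toy columns, the 0.95 threshold, and the choice of 2 PCA components are illustrative assumptions, not recommendations from the course:

```python
# Hedged sketch: (1) drop one column of a near-duplicate pair found via
# correlation, (2) alternatively compress all features with PCA.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=200)})
df["b"] = df["a"] * 1.01 + rng.normal(scale=0.01, size=200)  # near-duplicate of "a"
df["c"] = rng.normal(size=200)                               # independent feature

# 1) For every pair with |correlation| above the threshold, drop one column.
corr = df.corr().abs()
# Keep only the upper triangle so each pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
reduced = df.drop(columns=to_drop)
print("dropped:", to_drop)

# 2) Alternative: PCA keeps a compressed mix of ALL features instead
#    of discarding any single one outright.
components = PCA(n_components=2).fit_transform(df.values)
print(components.shape)  # (200, 2)
```

The design difference matters: correlation-based dropping removes a raw column entirely, while PCA rotates the feature space so every original feature still contributes something to the retained components.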

