Do we need to normalize features when we use boosting algorithms (like XGBoost, CatBoost, LightGBM, …) for regression and classification problems?
And when we use these algorithms, do we need to normalize the target in regression problems?
I think you are asking about continuous features. It depends on how the algorithm finds splits, but the "greedy" algorithm is the basis for most boosting-tree packages.
The greedy algorithm considers one feature at a time: it first sorts the samples by their values within that feature, and then decides on a splitting point. For example, any sample with a feature value less than the splitting point goes to the left leaf, and the rest go to the right leaf.
Because what matters is how the samples are sorted, as long as the ordering is not changed by our feature normalization method (which is true for the methods we have learned, since they are monotonic), normalization has no effect on the outcome of the split, and therefore we don't need to normalize our features.
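To make this concrete, here is a minimal sketch (my own toy implementation, not any package's actual code) of the greedy split search for squared error. The toy data and the min-max scaling are made up for illustration; the point is that a monotonic rescaling of the feature leaves the chosen split index unchanged:

```python
def best_split(x, y):
    """Greedy split search minimizing total squared error.

    Sorts samples by feature value, tries every threshold between
    consecutive sorted values, and returns the index (in sorted order)
    of the best split. Note: only the ORDERING of x is ever used.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    ys = [y[i] for i in order]
    best_i, best_err = None, float("inf")
    for i in range(1, len(ys)):
        left, right = ys[:i], ys[i:]
        err = sum((v - sum(left) / len(left)) ** 2 for v in left) \
            + sum((v - sum(right) / len(right)) ** 2 for v in right)
        if err < best_err:
            best_i, best_err = i, err
    return best_i

# Toy data (made-up values):
x = [3.0, 10.0, 7.0, 1.0, 25.0]
y = [1.0, 5.0, 4.0, 1.2, 9.0]

# Min-max normalization is monotonic: it rescales values but keeps order.
lo, hi = min(x), max(x)
x_norm = [(v - lo) / (hi - lo) for v in x]

assert best_split(x, y) == best_split(x_norm, y)  # same split either way
```

Any monotonic transform (standardization, min-max, log on positive features) passes the same check, because the sorted order fed into the search is identical.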
We also don't need to normalize the continuous target, but if we do normalize it, then depending on the loss function we choose, we might need to adjust some hyperparameters accordingly, such as the regularization parameters. Normalizing the target will not give us a better model by itself, but searching for the best set of hyperparameters will.
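One way to see the hyperparameter interaction: XGBoost's split gain for squared-error loss is ½·[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, where G are gradient sums and, for squared error, the hessian sum H equals the sample count. Scaling the target scales the gradients, so the gain scales roughly quadratically while the fixed threshold γ does not. The sketch below uses that published formula with made-up gradient sums; a split that was worth making on the raw target can fall below the unchanged γ after rescaling:

```python
def split_gain(g_left, g_right, n_left, n_right, lam, gamma):
    """XGBoost split gain for squared-error loss (hessian sum = count):
    gain = 0.5*[GL^2/(nL+lam) + GR^2/(nR+lam) - (GL+GR)^2/(nL+nR+lam)] - gamma
    The split is accepted only when the gain is positive."""
    gl2 = g_left ** 2 / (n_left + lam)
    gr2 = g_right ** 2 / (n_right + lam)
    gp2 = (g_left + g_right) ** 2 / (n_left + n_right + lam)
    return 0.5 * (gl2 + gr2 - gp2) - gamma

# Gradient sums on each side of a candidate split (made-up numbers):
GL, GR, nL, nR = -6.0, 6.0, 4, 4
gain_raw = split_gain(GL, GR, nL, nR, lam=1.0, gamma=1.0)

# Scale the target by 0.1: gradients scale too, but gamma stays fixed,
# so the gain shrinks by ~100x and the same split is no longer made.
c = 0.1
gain_scaled = split_gain(c * GL, c * GR, nL, nR, lam=1.0, gamma=1.0)

assert gain_raw > 0     # split accepted on the raw target
assert gain_scaled < 0  # same split rejected after scaling the target
```

So the model after target scaling is not equivalent unless thresholds like γ (and similarly other scale-sensitive hyperparameters) are re-tuned, which is exactly why normalizing the target buys nothing on its own.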