I proposed this approach to explore whether a non-linear relationship might exist. Maybe x doesn't correlate with the target value, but x^2 does. But then I wondered: how could my method pick up an interaction term such as x1*x2? I don't know.
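To make the x vs. x^2 idea concrete, here is a rough sketch on synthetic data (everything in it, including the names `x1`, `x2` and the made-up target, is an assumption for illustration, not my actual data). It just builds the squared and product columns by hand and checks each one's Pearson correlation with the target:

```python
import numpy as np

# Synthetic data (assumed for illustration): the target depends on
# x1 squared and on the product x1*x2, not on x1 or x2 linearly.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1**2 + x1 * x2 + 0.1 * rng.normal(size=n)

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# Candidate features: the raw columns plus hand-built non-linear terms.
candidates = {
    "x1": x1,
    "x2": x2,
    "x1^2": x1**2,
    "x1*x2": x1 * x2,
}

scores = {name: corr(values, y) for name, values in candidates.items()}
for name, score in scores.items():
    print(f"{name:6s} correlation with y: {score:+.3f}")
```

On data like this the raw columns should come out near zero while the squared and product columns correlate clearly with the target. The catch is that the number of candidate products grows quadratically with the number of features, so this only scales so far.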
For datasets with a very large number of features, I suspect visualization wouldn't tell us much about which features to select. So I speculate that a more general mathematical criterion for feature selection is needed.
I also didn't get what you mean by transformation, but I asked about it in the thread you referred to, since you already have an explanatory post there. I mean, I get the reason: feature engineering saves you from the non-linearity. But what is the method for transforming?
P.S. Maybe another, safer method would be lasso regularization, to butcher all the irrelevant features out of the model?
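A minimal sketch of what I mean by the lasso idea, assuming scikit-learn and synthetic data where only the first two of ten features matter (the data, the `alpha=0.1` value, and the variable names are all assumptions for illustration; in practice alpha would need tuning, e.g. by cross-validation):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed): 10 features, only columns 0 and 1 affect y.
rng = np.random.default_rng(0)
n_samples, n_features = 500, 10
X = rng.normal(size=(n_samples, n_features))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.normal(size=n_samples)

# Scale features first so the L1 penalty treats them comparably.
X_scaled = StandardScaler().fit_transform(X)

# The L1 penalty drives coefficients of unhelpful features to exactly zero.
model = Lasso(alpha=0.1).fit(X_scaled, y)

# Indices of features whose coefficient survived the penalty.
selected = np.flatnonzero(model.coef_)
print("coefficients:", np.round(model.coef_, 3))
print("kept feature indices:", selected)
```

Note that lasso is still a linear model, so it would only keep something like x^2 or x1*x2 if those columns were added as features beforehand.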