Wrapper Methods: STAY AWAY from Stepwise selection

I’m taking the MLOps course, and so far so good. However, Bob Crowe talks about “wrapper methods” for feature selection, which is essentially stepwise selection in ML jargon. There is so much literature out there about the problems with stepwise selection that the fact it’s even in there is almost embarrassing. “Modern” / best-practice methods would include lasso/ridge/elastic net or, even better, talking to domain experts. The course does go on to discuss embedded methods, which is the category that lasso-style methods (L1/L2 regularization) fall under.
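For anyone who wants to try the embedded route, here is a minimal sketch using scikit-learn's `LassoCV`. The synthetic data, the feature count, and the coefficients are my own assumptions for illustration, not anything from the course notebooks.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 10 features, only the first two matter.
n_samples, n_features = 200, 10
X = rng.normal(size=(n_samples, n_features))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n_samples)

# Standardize so the L1 penalty treats every feature on the same scale.
X_scaled = StandardScaler().fit_transform(X)

# LassoCV chooses the penalty strength (alpha) by cross-validation.
model = LassoCV(cv=5, random_state=0).fit(X_scaled, y)

# Features whose coefficients were not shrunk to exactly zero.
selected = np.flatnonzero(model.coef_)
print("chosen alpha:", model.alpha_)
print("features with nonzero coefficients:", selected)
```

Selection and fitting happen in one penalized fit, with the penalty tuned on held-out folds, instead of a greedy search that refits the model for each candidate subset.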

The problem with stepwise / “wrapper” methods is demonstrated in the results table that Bob puts up: the wrapper results are identical to the correlation-based ones.
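A rough way to see why the two can coincide is to run both on the same data. The sketch below is not the course's code. It assumes forward selection scored by in-sample R² of a linear fit, and the helper names `r_squared` and `forward_select` are hypothetical. For a one-feature linear fit, R² equals the squared correlation, so the first forward step picks the most correlated feature by construction.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R^2 of an ordinary least squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, k):
    """Greedy forward selection: add the feature that raises R^2 the most."""
    chosen, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best = max(remaining, key=lambda j: r_squared(X[:, chosen + [j]], y))
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): 8 independent features, two informative.
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=200)

# Filter approach: rank features by absolute correlation with the target.
corr = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
by_correlation = np.argsort(-np.abs(corr))[:2].tolist()

# Wrapper approach: forward stepwise selection of the same number of features.
by_forward = forward_select(X, y, 2)

print("top 2 by |correlation|:", by_correlation)
print("forward selection:     ", by_forward)
```

With roughly independent features like these, the later steps tend to follow the correlation ranking too, so the wrapper does many more model fits to reach the same answer. The two can diverge once features are correlated with each other.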