I’m working through the Optional lab where we use PolynomialFeatures to “alter the model” for a better fit. It seems we are just simulating a model alteration by altering the training data (adding the x^2 feature). When I started this I thought there would be some way to directly alter the model equation, but I have not run across that. Is the lab approach the way it’s done in practice?
Thanks @TMosh. It seems strange to me, but it sounds like you know your stuff.
If you have a moment - if this is the industry approach, then the “model representation” is not actually encoded anywhere in the regression part of the solution. Does that imply that at prediction time, the transformation to add these feature values occurs upstream? And if it changes, we need to think about deployment of that little chunk of transformation code from an MLOps perspective?
Yes. This can be tricky, because your prediction code has to know how to create the new features. So that’s best encapsulated in a support function, which is used in both training and predictions.
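To make that concrete, here’s a minimal sketch of the pattern described above. The function name `add_engineered_features` and the toy data are my own invention, not from the lab; the point is that one shared function defines the engineered features, and both the training code and the prediction code call it:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Support function: the single place where the engineered features
# are defined, used by BOTH training and prediction code.
def add_engineered_features(X):
    """Append an x^2 column to the raw feature matrix."""
    return np.c_[X, X**2]

# --- training ---
X_train = np.arange(1, 11, dtype=float).reshape(-1, 1)
y_train = 3.0 * X_train[:, 0]**2 + 2.0   # toy quadratic target
model = LinearRegression()
model.fit(add_engineered_features(X_train), y_train)

# --- prediction: the SAME transformation runs "upstream" of the model ---
X_new = np.array([[12.0]])
y_pred = model.predict(add_engineered_features(X_new))
print(y_pred)  # close to 3 * 144 + 2 = 434
```

scikit-learn’s `Pipeline` (e.g. `make_pipeline(PolynomialFeatures(2), LinearRegression())`) is another way to bundle the transformation with the model, so you deploy one serialized object instead of a separate chunk of feature code.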
Note that this is only a useful technique in limited circumstances. It’s part of the standard “intro to machine learning” curriculum.
For example, a plain linear model can’t learn the equation for the distance traveled by a falling ball if your data is just elapsed time and distance, because distance depends on the square of the time (d = ½gt²). So you’d need to add a t^2 feature to get a good model.
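A quick sketch of the falling-ball case (the variable names and the noise-free data are assumptions for illustration). A linear fit on t alone underfits; adding the t^2 column lets linear regression recover g/2 exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

g = 9.81
t = np.linspace(0.1, 5.0, 50).reshape(-1, 1)
d = 0.5 * g * t[:, 0]**2   # distance fallen at each elapsed time

# Linear model on t alone: cannot capture the quadratic relationship.
linear = LinearRegression().fit(t, d)

# Add a t^2 feature: the relationship becomes linear in the features.
t_feats = np.c_[t, t**2]
quadratic = LinearRegression().fit(t_feats, d)

print(linear.score(t, d))            # noticeably below 1.0
print(quadratic.score(t_feats, d))   # essentially 1.0
print(quadratic.coef_[1])            # recovers g/2 = 4.905
```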
But if you’re using a model with built-in non-linear functions (like a Neural Network), then you don’t need to create new features by hand. The hidden layers learn the equivalent non-linear combinations automatically, because a NN applies a non-linear activation function in its hidden layers.
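For completeness, a hedged sketch of that last point using the same falling-ball data: a small `MLPRegressor` with a tanh hidden layer fits the quadratic curve without any hand-made t^2 feature. The network size, solver, and scaling choices here are my assumptions, not anything prescribed by the lab:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

t = np.linspace(0.1, 5.0, 200).reshape(-1, 1)
d = 0.5 * 9.81 * t[:, 0]**2   # quadratic target, no t^2 feature given

# Scale inputs and targets so the small network trains reliably.
x = StandardScaler().fit_transform(t)
y = (d - d.mean()) / d.std()

# No engineered features: the tanh hidden layer supplies the
# non-linearity needed to approximate the quadratic curve.
nn = MLPRegressor(hidden_layer_sizes=(32,), activation="tanh",
                  solver="lbfgs", max_iter=2000, random_state=0)
nn.fit(x, y)
print(nn.score(x, y))  # typically very close to 1.0
```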