Hello everyone,
I’m currently taking the Machine Learning course and came across the feature engineering example in the lesson about frontage, depth, and house price prediction. I understand the concept of creating new features to improve model performance, and I found it insightful how frontage and depth were combined into a new feature, “area.”
However, I have a concern regarding multicollinearity. Since area is derived by multiplying frontage and depth, it will be strongly correlated with both of them, so keeping all three features in the model seems likely to introduce collinearity. I'm wondering how this correlation among the features affects the model's stability, particularly in linear models, where multicollinearity can inflate the variance of the coefficient estimates.
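To make the concern concrete, one way to check it is to compute variance inflation factors (VIF) with and without the "area" feature. The following is only a minimal sketch on synthetic data: the value ranges, the sample size, and the `vif` helper are my own assumptions for illustration and do not come from the lesson.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X, shape (n_samples, n_features).

    Each column is regressed on the remaining columns (plus an intercept);
    VIF_j = 1 / (1 - R_j^2).
    """
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])   # add intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Hypothetical lot dimensions, drawn independently (assumed ranges).
rng = np.random.default_rng(0)
frontage = rng.uniform(20, 80, size=500)
depth = rng.uniform(50, 150, size=500)
area = frontage * depth

vif_without = vif(np.column_stack([frontage, depth]))
vif_with = vif(np.column_stack([frontage, depth, area]))

print("VIF without area:", vif_without)
print("VIF with area:   ", vif_with)
```

Because frontage and depth are drawn independently here, their VIFs without area should come out close to 1, while adding area should push the values well above that. Real housing data may behave differently, so this only illustrates the mechanism rather than the size of the effect.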
Could anyone clarify how this concern is typically addressed in practice, or explain why adding the “area” feature doesn’t significantly destabilize the model? I’d really appreciate any insights.
Thanks in advance for your help!