How to check which factors affect the model the most?

Hey guys, I am using polynomial linear regression for resale car price prediction. How can I tell how much each feature affects my model? If I use model.coef_, I get a very long list because the data also has categorical columns. Is there a more robust way to see which factors matter most on a per-column basis, or any other technique?

You can try implementing PCA (Principal Component Analysis).

Running a regression is equal parts art and science. Finding the right mix of features is the main challenge.

For this reason, a common strategy for regression is called ‘forward selection’ (or forward stepwise selection). You add features one by one, observing the loss and variance, and iterate until you reach the desired outcome.
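As a rough illustration of adding features one at a time, here is a minimal sketch using scikit-learn's SequentialFeatureSelector. The data is synthetic and made up for this example (only columns 0 and 3 influence the target), and the choice of 2 features, R² scoring, and 5-fold cross-validation are assumptions you would tune for your own dataset.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (hypothetical): 6 candidate features,
# of which only columns 0 and 3 actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 4.0 * X[:, 0] - 2.5 * X[:, 3] + rng.normal(scale=0.5, size=300)

# Forward selection: start with no features and, at each step, add the
# one that most improves the cross-validated score.
selector = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=2,  # assumed stopping point for this sketch
    direction="forward",
    scoring="r2",
    cv=5,
)
selector.fit(X, y)

# Indices of the columns that were kept.
selected = np.flatnonzero(selector.get_support())
print("Selected feature indices:", selected)
```

With real data you would run this on the preprocessed feature matrix and compare the cross-validated score for different values of n_features_to_select.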

Here is an article where the author uses statsmodels, a statistics-focused Python package commonly used for regression.

Make sure your data meets the assumptions of linear regression:

  • The true relationship between the features and the target is linear
  • Errors are normally distributed
  • Homoscedasticity of errors (that is, equal variance around the line)
  • Independence of the observations
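A few of these assumptions can be checked roughly from the residuals of a fitted model. The sketch below uses synthetic data that is made up for illustration, and the specific checks (Shapiro-Wilk, a rank correlation between fitted values and residual size, lag-1 autocorrelation) are my own choice of quick diagnostics, not the only or the definitive ones.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (hypothetical) that satisfies the assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=400)

model = LinearRegression().fit(X, y)
fitted = model.predict(X)
residuals = y - fitted

# Normality of errors: Shapiro-Wilk test on the residuals.
# A small p-value suggests the residuals are not normally distributed.
shapiro_stat, shapiro_p = stats.shapiro(residuals)

# Homoscedasticity (rough check): the size of the residuals should not
# trend with the fitted values. A rank correlation far from 0 is a warning.
spread_corr, spread_p = stats.spearmanr(fitted, np.abs(residuals))

# Independence (rough check, only meaningful if rows have a natural order,
# such as time): lag-1 autocorrelation of residuals should be near 0.
lag1_corr = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]

print(f"Shapiro-Wilk p-value:          {shapiro_p:.3f}")
print(f"|residual| vs fitted (rank r): {spread_corr:.3f}")
print(f"Lag-1 residual correlation:    {lag1_corr:.3f}")
```

For linearity, plotting residuals against fitted values and looking for curvature is usually more informative than any single number.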

Hope this helps.

Cheers

If you have normalized features, then you can get a first-order feeling for the importance of each feature by just looking at the magnitude of the weight for that feature.

  • Features that aren’t very important will have weights near zero.
  • Features that are highly important will have large magnitude weights, either positive or negative.

Hello @Rajguru_Bhosale

Based on your description, did you use sklearn.linear_model.LinearRegression in your analysis?

Kindly go through the documentation, which will help you understand how to convert the categorical columns, if required, among your resale price factors. We cannot give better suggestions without knowing more about your dataset, its factors and columns, and what you are planning to build.

Regards
DP