Difference between .transform and .fit_transform when feature scaling

In the Model Evaluation and Selection lab, the notebook author uses two different methods when feature scaling with scikit-learn’s StandardScaler(): .transform(x_train) and .fit_transform(x_train). What is the difference between them?

Additionally, when the notebook author uses the PolynomialFeatures() object, they also call .fit_transform(x_train). Does that method do the same thing in PolynomialFeatures() and StandardScaler(), or are they different?

Below, you can find clarifications for the .fit_transform() and .transform() methods; I am not sure about the other functions because I haven’t gone through this course.

  1. .fit_transform(x_train):
  • This method is a combination of two steps: .fit() and .transform().
  • .fit(x_train): This step computes the necessary statistics or parameters from the data (e.g., mean and standard deviation for scaling, or the unique categories for encoding). It essentially “learns” from the data.
  • .transform(x_train): After fitting, this step applies the transformation to the data using the learned parameters.
  • .fit_transform(x_train): By combining these two steps, it performs both fitting and transforming in one go, which is often more convenient and efficient when you want to transform the training data.
  2. .transform(x_train):
  • This method is used to apply a transformation to the data using the parameters that have already been learned with .fit().
  • It does not compute or learn anything new; it simply uses the existing parameters to transform the data.
  • You would typically use .transform() on new data (e.g., validation or test sets) after you have already fitted the transformer on the training data, as shown in the sketch below.
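
Here is a minimal sketch of that pattern with StandardScaler. The data and the variable names x_train / x_test are made up for illustration and are not taken from the lab notebook:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training and test data (not from the lab notebook)
x_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
x_test = np.array([[1.5, 250.0]])

scaler = StandardScaler()

# fit_transform: learns the per-feature mean and standard deviation
# from x_train, then scales x_train using those learned values.
x_train_scaled = scaler.fit_transform(x_train)

# transform: reuses the mean/std learned from x_train to scale x_test,
# so the test set is never used to "learn" the scaling parameters.
x_test_scaled = scaler.transform(x_test)

print(x_train_scaled)
print(x_test_scaled)
```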
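
As for the second question: PolynomialFeatures() exposes the same fit / transform / fit_transform interface as StandardScaler(), but what it “learns” during fit is different — the number of input features and which polynomial term combinations to generate, rather than means and standard deviations. A hedged sketch, again with assumed data and variable names:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data (not from the lab notebook)
x_train = np.array([[1.0], [2.0], [3.0]])
x_test = np.array([[4.0]])

poly = PolynomialFeatures(degree=2, include_bias=False)

# fit_transform: fit() records the input feature count and the
# polynomial combinations to generate, then transform() expands
# x_train into [x, x**2].
x_train_poly = poly.fit_transform(x_train)

# transform: applies the same expansion to new data.
x_test_poly = poly.transform(x_test)

print(x_train_poly)  # [[1. 1.], [2. 4.], [3. 9.]]
print(x_test_poly)   # [[ 4. 16.]]
```

In both cases the rule of thumb is the same: call .fit_transform() once on the training data, then call .transform() on any validation or test data so it is processed with the parameters learned from training.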