Purpose: The model uses the internal weights it learned during the fit() step to generate predictions for a given set of input features.
Action: It calculates a predicted value for each row of X_norm using
y_pred = X_norm @ coef + intercept.
Output: Returns an array of predicted values (y_predict)
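As a sketch, the prediction step above can be written directly in NumPy; the parameter values here are hypothetical stand-ins for what fit() would actually learn:

```python
import numpy as np

# Hypothetical learned parameters (in practice these come from fit())
coef = np.array([2.0, -1.0])
intercept = 0.5

def predict(X_norm, coef, intercept):
    """Apply the learned weights: one prediction per row of X_norm."""
    return X_norm @ coef + intercept

X_norm = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
y_predict = predict(X_norm, coef, intercept)
print(y_predict)  # → [ 2.5 -0.5  1.5]
```

Each output element is the dot product of that row with coef, plus the intercept, so y_predict always has one entry per input row.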
We use the same X_norm in both steps deliberately: predicting on the training inputs lets us evaluate how well the model learned the training set. This is often called the “training error” or “training accuracy”.
It tells us whether the model has enough capacity to learn the data, or whether it is underfitting.
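A minimal way to quantify that training error, using hypothetical y_train and y_predict values, is the mean squared error over the training rows:

```python
import numpy as np

# Hypothetical ground-truth targets and model predictions on the same training rows
y_train = np.array([3.0, 1.0, 2.0])
y_predict = np.array([2.5, 1.5, 2.0])

# Training error: average squared gap between truth and prediction
train_mse = np.mean((y_train - y_predict) ** 2)
print(train_mse)
```

A low training error suggests the model has enough capacity; a high one suggests underfitting.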
Do not expect the two plots to match exactly: y_train holds the actual, true target values (the ground truth), while y_predict holds the values the model generated, its best guess.
If the model were perfect, y_predict would equal y_train, which is highly unlikely with real-world data.
Plotting y_train vs X_norm shows the true data points; plotting y_predict vs X_norm shows the line (or hyperplane) the model fitted to those points.
fit() is for learning; predict() is for applying what was learned. Predicting on the same data only shows how well the model learned that data. The predicted output (y_predict) is almost never identical to the true output (y_train) unless the model is perfect, which is highly unlikely in practice. And even a model with near-perfect training accuracy may perform noticeably worse on unseen real-world data than it did during training.
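To make the whole fit-then-predict loop concrete, here is a self-contained sketch that uses ordinary least squares (np.linalg.lstsq) as a stand-in for fit(); the data is synthetic, and the added noise guarantees y_predict never matches y_train exactly:

```python
import numpy as np

rng = np.random.default_rng(42)
X_norm = rng.normal(size=(50, 1))
noise = rng.normal(scale=0.2, size=50)
y_train = 2.0 * X_norm.ravel() + 1.0 + noise  # true relation + noise

# "fit": solve least squares with an added intercept column
A = np.hstack([X_norm, np.ones((50, 1))])
params, *_ = np.linalg.lstsq(A, y_train, rcond=None)
coef, intercept = params

# "predict" on the SAME training inputs
y_predict = X_norm.ravel() * coef + intercept

# Training error is positive: the noise keeps y_predict from equaling y_train
train_mse = np.mean((y_train - y_predict) ** 2)
print(coef, intercept, train_mse)
```

The fitted coef and intercept land close to the true 2.0 and 1.0, yet train_mse stays above zero, which is exactly the point: even a well-learned model does not reproduce noisy ground truth perfectly.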