Training error vs test error

In the lecture video, Andrew Ng explained that when we plot J_train and J_cv, it looks like this -

If we increase the degree of the polynomial, J_train doesn’t increase.
But when I tried to plot this with my own model, I got this -

J_train is increasing as the polynomial degree gets higher. I can’t understand why this is happening.


Hello @Harshit_Kumar2, I guess this is your own work and not part of any assignment? If so, would you mind sharing the code that produces the plot, so that I can try running it and perhaps give you some explanations or directions?


Hello Sir @rmwkwok, this is the code that produced the plot -

Polynomial Ln.ipynb (120.4 KB)

Hello @Harshit_Kumar2, very nice work. I have 2 suggestions for you:

  1. The standard practice is to train only one model on the training data, and then use the cv/test data to evaluate that same model. We do not fit a second model to the cv set, so this line is not needed: lr_cv = LinearRegression()

  2. Theoretically speaking, feature scaling is not needed with sklearn’s LinearRegression, because it is not gradient-descent based. However, since you are testing polynomial features up to a fairly high degree, the generated feature values can become very large, and large numbers risk overflowing the variables that store them in sklearn’s underlying numerical code. Once overflow occurs, the result is no longer reliable. After applying scaling, I get a more reasonable graph, and the training (and cv) errors no longer go up with the degree of the polynomial features.
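To make the two suggestions concrete, here is a minimal sketch (not your notebook's code; the synthetic data, degree range, and variable names are made up for illustration) of fitting one LinearRegression per degree on scaled polynomial features, and evaluating the training and cv errors with that same model:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic quadratic data (placeholder for the notebook's dataset)
rng = np.random.default_rng(0)
x = rng.uniform(0, 5, 200).reshape(-1, 1)
y = x.ravel() ** 2 + rng.normal(0, 1, 200)

x_train, x_cv, y_train, y_cv = train_test_split(x, y, test_size=0.4, random_state=0)

train_errs, cv_errs = [], []
for degree in range(1, 11):
    poly = PolynomialFeatures(degree, include_bias=False)
    scaler = StandardScaler()
    # Fit the transforms on the training set only, then reuse them for cv
    X_train = scaler.fit_transform(poly.fit_transform(x_train))
    X_cv = scaler.transform(poly.transform(x_cv))
    # One model: trained on the training set, evaluated on both sets
    model = LinearRegression().fit(X_train, y_train)
    train_errs.append(mean_squared_error(y_train, model.predict(X_train)))
    cv_errs.append(mean_squared_error(y_cv, model.predict(X_cv)))
```

With scaling in place, the training error should stay flat or decrease as the degree grows, while the cv error eventually rises once the model starts to overfit.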

Keep trying!