C2W3 Lab Question - Model Evaluation and Selection

Hi

  1. In the first part of the linear regression section, y was not scaled. If we scale it as well, the training MSE drops from
    training MSE (using sklearn function): 406.19374192533155
    training MSE (for-loop implementation): 406.19374192533155
    to
    training MSE (using sklearn function): 0.020931674858814992
    training MSE (for-loop implementation): 0.020931674858814985

Is scaling advisable only for features?

Feature normalization is only applied to the features, not to the labels.

The purpose of normalizing the features is so that gradient descent works better. It’s not to lower the absolute cost.

We really don’t care about the absolute cost - only about finding the minimum of the cost curve. The scale of the cost doesn’t matter.
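As a quick numerical check (a sketch, not code from the lab): z-scoring the target divides the MSE by the variance of y, which explains why the training MSE above drops from ~406 to ~0.02. The model is neither better nor worse; the cost curve is simply rescaled, so its minimum stays in the same place.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(loc=100.0, scale=20.0, size=500)   # unscaled targets
y_pred = y_true + rng.normal(scale=5.0, size=500)      # imperfect predictions

mse = np.mean((y_pred - y_true) ** 2)

# z-score the targets, and apply the same transform to the predictions
mu, sigma = y_true.mean(), y_true.std()
y_true_s = (y_true - mu) / sigma
y_pred_s = (y_pred - mu) / sigma
mse_scaled = np.mean((y_pred_s - y_true_s) ** 2)

# The MSE shrinks by exactly 1/sigma^2 -- same model, rescaled cost.
assert np.isclose(mse_scaled, mse / sigma**2)
```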


I see. So does that mean that had we scaled the label, we would still have arrived at the same conclusion and picked the 4th-order polynomial?

I hope I can ask my remaining questions on the notebook here itself.

poly = PolynomialFeatures(degree=2, include_bias=False)
Why is bias turned off?

In the second section, on neural networks, here are the results I get when I compare polynomial degrees. Wouldn’t you say increasing the degree helps the neural network?

degree = 1
RESULTS:
Model 1: Training MSE: 406.19, CV MSE: 551.78
Model 2: Training MSE: 73.40, CV MSE: 112.09
Model 3: Training MSE: 73.40, CV MSE: 111.34

degree = 4
Model 1: Training MSE: 50.77, CV MSE: 78.37
Model 2: Training MSE: 49.60, CV MSE: 74.63
Model 3: Training MSE: 9702.85, CV MSE: 10231.34

On the neural network part: when I keep rerunning from the Neural Network section with degree = 1, I now see all models with roughly the same training MSE. Scratching my head. On my first try, Model 1 had an MSE on the order of 406.

RESULTS:
Model 1: Training MSE: 73.62, CV MSE: 107.61
Model 2: Training MSE: 73.36, CV MSE: 111.91
Model 3: Training MSE: 73.40, CV MSE: 112.29

Hope someone will respond. Or should I post these in separate threads?
Thanks

I just refreshed my access to that course, I’ll take a look at the expected results.

Did you change anything in the lab? Because here are the results I get:
[screenshot of results]

In the neural network section, there was a suggestion to increase the polynomial degree from 1 to 4, and the expected conclusion was that there is no benefit in cost. However, when I tried it, I noticed the difference posted above between degree = 1 and degree = 4.
I just tried it and here are the results:

Degree = 1
Model 1: Training MSE: 75.39, CV MSE: 98.93
Model 2: Training MSE: 73.40, CV MSE: 112.30
Model 3: Training MSE: 76.42, CV MSE: 120.22

Degree = 4
RESULTS:
Model 1: Training MSE: 50.71, CV MSE: 78.40
Model 2: Training MSE: 46.63, CV MSE: 72.14
Model 3: Training MSE: 44.37, CV MSE: 82.38

BTW, in the same post there is another question about why include_bias is set to False.

It would really help if you posted more specific information about what part of the lab your question applies to.

Regarding the NN part of this lab:

Here are the details about the three NN models that are used.

Since model 2 and 3 give just about as good results as Model 1, it shows that adding these additional Dense layers isn’t helpful for this set of data.

The other lesson taught in this assignment is specifically that NN’s don’t benefit from you creating additional polynomial terms. This is because the non-linear activations in the hidden layers are already making the model more complex - you don’t need to add polynomial terms at all.
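As a minimal illustration of that point (my own sketch, not the lab’s code): even two hand-built ReLU hidden units already compute a non-linear function of the input - here, |x| - which no linear model on degree-1 features could represent. That is why the hidden layers make added polynomial terms unnecessary.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# Tiny hand-built network: a hidden layer with weights [1, -1] and a
# linear output layer that sums the hidden activations.
# relu(x) + relu(-x) == |x|, a non-linear function of x.
def tiny_net(x):
    hidden = relu(np.array([x, -x]))  # two ReLU units
    return hidden.sum()               # linear output layer

print(tiny_net(3.0))   # 3.0
print(tiny_net(-2.0))  # 2.0
```

A trained network does the same thing with learned weights: the ReLU activations carve the input space into pieces, building non-linear features internally.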

It’s discussed in this text in the notebook.

Regarding why the lab uses include_bias = False.

Since the dataset is normalized, it has a mean value of zero. That means the bias will be zero, so we don’t need a bias term.
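A quick sketch of what include_bias actually controls (my own example, not from the lab): with include_bias=True, PolynomialFeatures just prepends an all-ones column, and scikit-learn’s LinearRegression already fits its own intercept by default, so that column is redundant.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(6, dtype=float).reshape(-1, 1)

with_bias = PolynomialFeatures(degree=2, include_bias=True).fit_transform(x)
no_bias = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)

print(with_bias.shape)  # (6, 3): columns [1, x, x^2]
print(no_bias.shape)    # (6, 2): columns [x, x^2]

# The only difference is the leading all-ones column, which is redundant
# because LinearRegression(fit_intercept=True) learns its own intercept.
assert np.allclose(with_bias[:, 0], 1.0)
assert np.allclose(with_bias[:, 1:], no_bias)
```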

I am referring to this part, in the "Prepare the Data" section of the neural network, where it sets degree = 1.
You also explained that changing the degree is moot since the NN learns the non-linearity automatically. However, my test with degree = 1 and degree = 4 shows a lower cost with degree = 4 for the NN.

Prepare the Data

You will use the same training, cross validation, and test sets you generated in the previous section. From earlier lectures in this course, you may have known that neural networks can learn non-linear relationships so you can opt to skip adding polynomial features. The code is still included below in case you want to try later and see what effect it will have on your results. The default degree is set to 1 to indicate that it will just use x_train, x_cv, and x_test as is (i.e. without any additional polynomial features).
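To confirm what the quoted text says about the default (a small sketch with a toy array standing in for the lab’s x_train): with degree=1 and include_bias=False, PolynomialFeatures returns the inputs unchanged, so the training, CV, and test sets are used as-is.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x_train = np.array([[1.0], [2.5], [4.0]])  # toy stand-in for the lab's x_train
poly = PolynomialFeatures(degree=1, include_bias=False)
x_mapped = poly.fit_transform(x_train)

# degree=1 adds no new columns: the "mapped" features equal the originals.
assert np.array_equal(x_mapped, x_train)
```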

The differences in the numbers you posted aren’t really significant. There may be some small influence from the degree.
