Hello,
after seeing this video, I tried it with my own little project.
The dataset has around 3000 examples with 29 features each, which I split 80/20 into train and test sets.
For this project I built a relatively deep MLP with 11 layers (10 hidden layers of 512 ReLU units each, plus a single linear output unit). I am just experimenting here and deliberately wanted a big NN.
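Roughly, the model looks like this (sketched in tf.keras just for illustration, my actual code may differ in the details):

```python
from tensorflow import keras
from tensorflow.keras import layers

# 29 input features, 10 hidden layers of 512 ReLU units, 1 linear output unit
model = keras.Sequential(
    [keras.Input(shape=(29,))]
    + [layers.Dense(512, activation="relu") for _ in range(10)]
    + [layers.Dense(1)]  # linear activation for the regression output
)
model.compile(optimizer="adam", loss="mse",
              metrics=[keras.metrics.RootMeanSquaredError()])
```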
What I found is that without normalizing, the RMSE on the train and test sets converged very quickly, and no matter how big I made the network I could not get the model to overfit. So my first question: why can my model not overfit the training data on purpose, even when I increase the size of the network?
When I scaled my input features, the train RMSE suddenly improved a lot, while the test RMSE stayed roughly the same.
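(Just for context, the scaling I mean is something like this, fit on the train split only; variable names are placeholders:)

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from the train split only
X_test_scaled = scaler.transform(X_test)        # apply the same mean/std to the test split
```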
In the image you can see all the mentioned scores.
My second question: what is going on here? Why did scaling suddenly improve my train RMSE (with no other changes)? According to the video, normalizing should increase training speed, but as we can see the model was already converging, so even if I let the unscaled model train for much longer, it would never reach the "new" RMSE level that normalizing gives.
Apart from that (third question): is it safe to say that after normalizing, my model is clearly overfitting (which I should then try to tackle with L2 regularization or dropout)?
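If so, I would add them roughly like this (again tf.keras just as an illustration; the L2 strength and dropout rate are placeholder values):

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential(
    [keras.Input(shape=(29,))]
    + [layer
       for _ in range(10)
       for layer in (
           layers.Dense(512, activation="relu",
                        kernel_regularizer=regularizers.l2(1e-4)),  # L2 penalty on the weights
           layers.Dropout(0.3),                                     # randomly drop 30% of units
       )]
    + [layers.Dense(1)]
)
```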
Thanks a lot