C1_W2_Lab1 Exercise 7. Why does normalization change the results so much?

jpgeek · June 9, 2022, 2:22am

I understand how normalization would affect accuracy when dealing with multiple features that have different scales, i.e. “millimeters between planets” vs “number of times a person gets married”.

I can also see how using different types of normalization could de-emphasize outliers. I don’t understand what is happening in this example though. We are simply rescaling 0-255 to 0-1. The relative scale is still exactly the same. What is the exact mechanism that is causing the huge difference in accuracy between un-normalized and normalized? Is it something in the particular optimizer or loss function (“adam”, “sparse_categorical_crossentropy”) that works better with values between 0 and 1 than arbitrary ranges?

balaji.ambresh · June 9, 2022, 7:03am

Please try the following :

Define a function get_sigmoid(pixel_value, weight) where you return the sigmoid for the pixel_value * weight value.
Create a bunch of weights using np.linspace between 0 and 1e-5.
See how sigmoid varies when pixel value is 255 and 1.

This should tell you which scale of values will help speed up learning keeping backpropagation in mind.

Have you seen this?

jpgeek · June 10, 2022, 1:42am

Excellent reply @balaji.ambresh! Thank you.

Expanding on your reply.
Even though the sigmoid function technically returns different values for 0 < x < 255, in practice any x over 36 overflows the significant digits of float and returns 1. Values for 0 < x < 1 however don’t have this problem.

import tensorflow as tf
import numpy as np
import math

def get_sigmoid(x):
    return 1 / (1 + math.exp(-x))

def show_floats():
    print_sigmoids([f / 100.0 for f in range(100)])
    
def show_ints():
    print_sigmoids([float(i) for i in range(255)])

def print_sigmoids(vals):
    for i in vals:
        print(f"val: {i}, sigmoid: {get_sigmoid(i)}")
        
show_ints()
print('-------------------')
show_floats()

For 0 < x < 255:

val: 0.0, sigmoid: 0.5
val: 1.0, sigmoid: 0.7310585786300049
val: 2.0, sigmoid: 0.8807970779778823
val: 3.0, sigmoid: 0.9525741268224334
...
val: 36.0, sigmoid: 0.9999999999999998
val: 37.0, sigmoid: 1.0
val: 38.0, sigmoid: 1.0
val: 39.0, sigmoid: 1.0
...

For 0 < x < 1:

val: 0.0, sigmoid: 0.5
val: 0.01, sigmoid: 0.5024999791668749
val: 0.02, sigmoid: 0.5049998333399998
val: 0.03, sigmoid: 0.5074994375506203
...
val: 0.97, sigmoid: 0.7251194977898231
val: 0.98, sigmoid: 0.7271082163411295
val: 0.99, sigmoid: 0.7290879223493065

Even if we did have more significant digits in floats, the range of softmax return values for numbers over 36 gets incredibly small (and smaller still as we increase x). Values between 0 and 1 provide much more differentiation:
sigmoid_graph

Note: in this exercise, we are not actually using the sigmoid function, but the softmax. However it is basically the same idea since softmax is:

softmax

Softmax

Teddy_Wong · August 3, 2022, 11:13pm

For some reason Exercise 7 was grabbing the wrong dataset. Instead of:
tf.keras.datasets.fashion_mnist
it was pointing to:
tf.keras.datasets.mnist

Just something to be aware of if the accuracy went up when you removed the normalization.

balaji.ambresh · August 4, 2022, 5:18am

Thanks for pointing this out.

mnist was used since exercise 1. But, I agree with you that there is inconsistency in the notebook. There’s no point in exploring fashion mnist and using mnist for rest of the exercises. I’ve asked the staff to look into this.

Ritwik_Sarkar · September 5, 2022, 12:07pm

Failed test case: model was not originally set to train for 10 epochs.
Expected:
10,
but got:
8.
Pls help

balaji.ambresh · September 5, 2022, 5:29pm

@Ritwik_Sarkar

This is the comment in the starter code:

# Fit the model for 10 epochs adding the callbacks
# and save the training history

Why would you train the model for 8 epochs?

Ritwik_Sarkar · September 6, 2022, 7:35am

Uploading: Screenshot_2022-09-05-17-53-20.png…

[code removed - moderator]

balaji.ambresh · September 6, 2022, 6:24pm

@Ritwik_Sarkar
Please click my name and message your notebook as an attachment.

Ritwik_Sarkar · September 7, 2022, 4:55am

Uploading: C1W2_Assignment_12.ipynb…

Ritwik_Sarkar · September 7, 2022, 5:14am

[code removed - moderator]

balaji.ambresh · September 7, 2022, 6:27am

@Ritwik_Sarkar

Stop posting your code in public. It’s ok to send code via direct message to a mentor.

balaji.ambresh · September 7, 2022, 6:35am

@Ritwik_Sarkar

Are you still seeing the same error?

Failed test case: model was not originally set to train for 10 epochs.
Expected:
10,
but got:
8.

Ritwik_Sarkar · September 7, 2022, 6:40am

Yes same to same show

balaji.ambresh · September 7, 2022, 7:09am

I just ran your notebook. This is the grader feedback:

Failed test case: model trained for more than 8 epochs. The callback should have fired by now..
Expected:
a maximum of 8 epochs,
but got:
10.

There’s a difference between the above feedback and what you’ve provided.
The one above means that the callback should’ve fired before the 9th epoch. This asks you to tune the model architecture.

The feedback you shared means that the model was trained for 8 epochs instead of 10.

Please do 2 things:

Update your model architecture to trigger the callback before the 9th epoch. Leave the number of training epochs to 10.
If the feedback you got it different from what I shared now, reply with the lab ID. I’ll ask the staff to look into the grader.

Ritwik_Sarkar · September 7, 2022, 8:01am

Failed test case: model was not originally set to train for 10 epochs.
Expected:
10,
but got:
9.

Failed test case: model trained for more than 8 epochs. The callback should have fired by now…
Expected:
a maximum of 8 epochs,
but got:
9.
Lab I’d= vtfbrvmn

balaji.ambresh · September 7, 2022, 12:00pm

Please click my name and message your notebook as an attachment along with screenshot of expanded grader feedback. I’ll forward them to the staff to look at.

balaji.ambresh · September 7, 2022, 12:15pm

@Ritwik_Sarkar

Don’t forget to fix the typo: stop_traning

balaji.ambresh · September 7, 2022, 3:00pm

@Ritwik_Sarkar
You are training the model for 9 epochs. That’s incorrect.
See this:

Ritwik_Sarkar · September 7, 2022, 3:17pm

I don’t understand
I am already try 10 epochs

Topic		Replies	Views
Week 3 normalization 2 questions Improving Deep Neural Networks: Hyperparameter tun	3	556	January 9, 2022
C4 W4 A1 who_is_it 0.0000003 difference in distance Convolutional Neural Networks	15	578	March 15, 2022
Course 2, Week 3 - sigmoid function Improving Deep Neural Networks: Hyperparameter tun	5	680	May 27, 2021
Course 2, Week 3, compute_total_loss, failed with really close result Improving Deep Neural Networks: Hyperparameter tun week-3	6	835	February 1, 2024
Week3, Normalizing components of image tensor Improving Deep Neural Networks: Hyperparameter tun	2	500	July 21, 2021

C1_W2_Lab1 Exercise 7. Why does normalization change the results so much?

Related topics