Hi there, I had a question on the week 2 assignment. train_x and test_x are derived by dividing by the maximum value, i.e. 255. The comment in the Jupyter notebook says this is standardization. Isn't this normalization? The minimum value of a pixel in a 64x64x3 image is 0, so it is basically (x - min)/(max - min), which is normalization as far as I know.
Standardization is with respect to zero mean, correct?
The term “standardise your dataset” is not used here to mean performing standardisation on the data, but more in the sense of “have your dataset in some sort of standard form”. This can be seen in the markdown, where the former part of the sentence says: “One common preprocessing step in machine learning is to center and standardize your dataset, meaning that you subtract the mean of the whole numpy array from each example, and then divide each example by the standard deviation of the whole numpy array.”
But the latter half is the important one to note for this particular case, since it is what is being done in the assignment: “But for picture datasets, it is simpler and more convenient and works almost as well to just divide every row of the dataset by 255 (the maximum value of a pixel channel).”
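To make the two halves of that sentence concrete, here is a minimal sketch comparing them. It does not use the assignment's data: `train_x_flat` is a hypothetical random array with the same flattened shape (64*64*3 features per example), and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical stand-in for the flattened image data (not the assignment's
# actual dataset): 64*64*3 features by 5 examples, values in 0..255.
rng = np.random.default_rng(0)
train_x_flat = rng.integers(0, 256, size=(64 * 64 * 3, 5)).astype(np.float64)

# Former half: center and standardize with the mean and standard deviation
# of the whole numpy array.
standardized = (train_x_flat - train_x_flat.mean()) / train_x_flat.std()

# Latter half: divide every value by 255, the maximum of a pixel channel.
scaled = train_x_flat / 255.0

print(standardized.mean(), standardized.std())  # approximately 0 and 1
print(scaled.min(), scaled.max())               # both within [0, 1]
```

The first option gives zero mean and unit standard deviation. The second only rescales the values into [0, 1] without centering them.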
Or to state Mubsi’s point in a slightly different way:
When people say “normalization”, that term is usually reserved for “mean normalization”. But for image inputs, the pixel color values are unsigned 8-bit integers in the range 0 to 255. For image data, it is simpler and works almost as well to divide all the pixel color values by 255, which gives floating point values in the range 0. to 1. Image processing and rendering algorithms also handle “scaled” images in that form as one of the standard image representations. It's cheap to compute and gives good behavior, so it is one of the common methods of preparing image data for use with DL algorithms. In the notebook, the term “standardization” is used for this method to distinguish it from “normalization”.
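As a rough illustration of the point about unsigned 8-bit pixel values, the sketch below uses a hypothetical random `image` array (not data from the course). It shows that dividing by 255 is the same as min-max scaling when the bounds are fixed at 0 and 255 by the data type, rather than taken from the values observed in a particular image.

```python
import numpy as np

# Hypothetical 64x64 RGB image stored as unsigned 8-bit integers.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Dividing a uint8 array by a Python float yields a float64 array.
scaled = image / 255.0

# Min-max scaling with the fixed bounds of the uint8 range (0 and 255),
# not the minimum and maximum found in this specific image.
lo, hi = 0.0, 255.0
min_max = (image - lo) / (hi - lo)

print(scaled.dtype)
print(np.array_equal(scaled, min_max))
```

Using fixed bounds means every image is scaled the same way, regardless of how dark or bright it happens to be.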