MNIST mean and std

In C1 M2 Lab 1 (Building Your First Image Classifier), why are the MNIST mean and standard deviation hardcoded to 0.1307 and 0.3081 in the normalization transform? How can I calculate the mean and standard deviation of the MNIST dataset to see how these values are derived?

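The values are the mean and standard deviation of the training split of MNIST, computed over all pixels after ToTensor() has scaled them to the range [0, 1]. You can reproduce them like this: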

import torch
from torchvision import datasets, transforms

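# ToTensor() converts each image to a float tensor with pixel values scaled to [0, 1]
transform = transforms.Compose([transforms.ToTensor()])
trainset = datasets.MNIST('mnist_train', train=True, download=True, transform=transform)
# Each item of trainset is an (image, label) pair; data[0] is the image tensor
images = torch.stack(list(data[0] for data in trainset), dim=0)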

print(images.shape) # torch.Size([60000, 1, 28, 28])
print(images.mean()) # tensor(0.1307)
print(images.std()) # tensor(0.3081)

@balaji.ambresh I actually had this question too. I knew .mean() and .std() were tensor methods, but I wasn't quite sure how to 'get at' the data.

Can you explain a little more about what is going on in this line?

images = torch.stack(list(data[0] for data in trainset), dim=0)

Particularly the use of .stack?

torch.stack joins a sequence of tensors along a new dimension, so every tensor in the sequence must have the same shape. With dim=0 the new dimension comes first. A Python list is a sequence. Each item of trainset is an (image, label) pair, with the image tensor at index 0. The training set has 60000 items and each image has shape (1, 28, 28), so stacking them gives a tensor of shape (60000, 1, 28, 28). The sequence passed to torch.stack was built by calling the list constructor on a generator expression; a list comprehension would work just as well.
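
To make the behaviour of torch.stack concrete, here is a small sketch on tiny stand-in tensors (shape (1, 2, 2) rather than (1, 28, 28); the variable names are just for illustration). It also shows torch.cat for comparison, which joins along an existing dimension instead of adding a new one.

```python
import torch

# Three tiny stand-ins for images, each of shape (1, 2, 2)
images = [torch.full((1, 2, 2), float(i)) for i in range(3)]

# stack adds a new leading dimension: one entry per input tensor
stacked = torch.stack(images, dim=0)

# cat joins along the existing dimension 0, so no new dimension appears
concatenated = torch.cat(images, dim=0)

print(stacked.shape)       # torch.Size([3, 1, 2, 2])
print(concatenated.shape)  # torch.Size([3, 2, 2])
```

With 60000 real images of shape (1, 28, 28), the same stack call produces the (60000, 1, 28, 28) tensor shown earlier.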
