C1W2 weights_init(): why are we initializing BatchNorm2d?

Great class. Really well done!

I am confused.

In C1W1 we used nn.BatchNorm1d. We did not need to initialize it or any of the nn.Linear() layers.

My guess is that nn.Linear() must randomly initialize itself. Initialization is not mentioned in the documentation (Linear — PyTorch master documentation).
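For what it's worth, a quick check (just my own sketch, not from the assignment) seems to confirm that freshly constructed layers already come with initialized parameters:

from torch import nn

# A brand-new Linear layer already has small random weights, so PyTorch
# must have initialized them during construction.
lin = nn.Linear(4, 2)
print(lin.weight)
print(lin.bias)

# BatchNorm1d also comes with its scale/shift parameters pre-initialized.
bn = nn.BatchNorm1d(2)
print(bn.weight, bn.bias)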

Why do we use initialization in DCGAN?

# You initialize the weights to the normal distribution
# with mean 0 and standard deviation 0.02
def weights_init(m):
    if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02)
    if isinstance(m, nn.BatchNorm2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02)
        torch.nn.init.constant_(m.bias, 0)
gen = gen.apply(weights_init)
disc = disc.apply(weights_init)

Kind regards

Andy

I believe this is only because they want to use a different initialization from the default. Notice that when they do it explicitly, they are using the Normal distribution (the DCGAN paper calls for weights drawn from a zero-centered Normal with standard deviation 0.02), whereas I think the PyTorch default for these layers is based on the Uniform distribution. There are lots of other possibilities for initialization algorithms, and sometimes you need to tweak it.
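As a rough illustration (my own sketch, not from the notebook), you can compare the out-of-the-box parameters with an explicit Normal(0, 0.02) re-initialization like the one above:

import torch
from torch import nn

# Fresh layers using PyTorch's defaults: a bounded uniform-style draw for the
# conv weights and (in recent versions) constant ones/zeros for BatchNorm.
conv = nn.Conv2d(3, 8, kernel_size=3)
bn = nn.BatchNorm2d(8)
print(conv.weight.min().item(), conv.weight.max().item())  # small, bounded values
print(bn.weight[:4], bn.bias[:4])                          # typically ones and zeros

# Re-initialize the same way weights_init does above.
torch.nn.init.normal_(conv.weight, 0.0, 0.02)
torch.nn.init.normal_(bn.weight, 0.0, 0.02)
torch.nn.init.constant_(bn.bias, 0)
print(conv.weight.std().item())  # now roughly 0.02
print(bn.weight[:4])             # now small values centered at 0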

Here’s a thread that was the top hit for “pytorch default weight initialization”.

And the initialization actually was mentioned on the PyTorch Master Documentation page for Linear if you paged down a bit:
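As I read it, the page says the weight and bias of nn.Linear are both drawn from U(-sqrt(k), sqrt(k)) with k = 1/in_features. Here is a small check (my own sketch) of that documented bound:

import math
from torch import nn

# Per the nn.Linear docs: weight and bias are sampled from
# U(-sqrt(k), sqrt(k)) where k = 1 / in_features.
in_features = 64
lin = nn.Linear(in_features, 10)
bound = 1.0 / math.sqrt(in_features)
print(lin.weight.abs().max().item() <= bound)  # expected: True
print(lin.bias.abs().max().item() <= bound)    # expected: True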
