Default weight initialization process in a custom PyTorch Module

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        # nn.Module must be initialized before submodules are assigned
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)

    def forward(self, x):
        x = F.tanh(self.conv1(x))

        return x

Will F.tanh() automatically initialize the conv1 weights using the Xavier or Yoshua Bengio formula?

You filed this question under DLS C2, but PyTorch is not used anywhere in DLS. The only courses I am aware of here at DeepLearning that use PyTorch are those in the GANs specialization.

I would not expect activation functions to do any initialization, since they do not have weights or other parameters, right? They just take inputs and produce outputs according to the definition of the relevant function. But any layer that has weights and bias values associated with it (e.g. FC or Conv layers) has initialization applied when the layer is instantiated. For those layers, PyTorch’s default draws values from a Uniform Distribution whose range depends on the size of the layer (the bound works out to 1/sqrt(fan_in)), a formula that looks pretty similar to Xavier initialization. You can explicitly ask for a different initialization function. The torch.nn.init section of the PyTorch documentation lists the available init functions.
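To make that concrete, here is a minimal sketch you can run yourself. It assumes a reasonably recent PyTorch release, where the Conv2d default bound is 1/sqrt(fan_in); the variable names are just for illustration and the exact default could differ in other versions.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)  # only so the sketch is repeatable

conv = nn.Conv2d(3, 16, 3, padding=1)

# fan_in = in_channels * kernel_height * kernel_width
fan_in = conv.in_channels * conv.kernel_size[0] * conv.kernel_size[1]
bound = 1.0 / math.sqrt(fan_in)

# The weights already hold values right after construction,
# before any forward pass or activation has been called.
max_abs_weight = conv.weight.abs().max().item()
max_abs_bias = conv.bias.abs().max().item()
print(f"bound={bound:.4f} max|w|={max_abs_weight:.4f} max|b|={max_abs_bias:.4f}")

# An activation module holds no parameters, so it has nothing to initialize.
n_tanh_params = len(list(nn.Tanh().parameters()))

# Applying the activation leaves the layer's weights untouched.
before = conv.weight.detach().clone()
_ = torch.tanh(conv(torch.randn(1, 3, 8, 8)))
weights_unchanged = torch.equal(before, conv.weight.detach())
print(n_tanh_params, weights_unchanged)
```

The printed maxima should sit just under the bound, which is what you would expect from uniform sampling on that range.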

Here’s a chunk of template code from one of the assignments in GANs C1 that shows one way to apply specific initialization functions:

# Generator, Discriminator, z_dim, device, lr, beta_1 and beta_2
# are defined elsewhere in the assignment
gen = Generator(z_dim).to(device)
gen_opt = torch.optim.Adam(gen.parameters(), lr=lr, betas=(beta_1, beta_2))
disc = Discriminator().to(device)
disc_opt = torch.optim.Adam(disc.parameters(), lr=lr, betas=(beta_1, beta_2))

# You initialize the weights to the normal distribution
# with mean 0 and standard deviation 0.02
def weights_init(m):
    if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02)
    if isinstance(m, nn.BatchNorm2d):
        torch.nn.init.normal_(m.weight, 0.0, 0.02)
        torch.nn.init.constant_(m.bias, 0)
gen = gen.apply(weights_init)
disc = disc.apply(weights_init)

Note that they use Gaussian for the weights instead of Uniform (the BatchNorm bias is simply set to a constant 0). If you’d like to explore further, the PyTorch documentation and the PyTorch Forums are both indexed by the standard search engines.
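If you do want Xavier initialization for the model in the original question, the same apply() pattern works. This is only a sketch under my own assumptions: the helper name xavier_init is made up, and using the tanh gain and zeroing the bias are choices I picked for illustration, not something the course prescribes.

```python
import torch
from torch import nn


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)

    def forward(self, x):
        return torch.tanh(self.conv1(x))


def xavier_init(m):  # hypothetical helper name
    # apply() calls this once for every submodule, so filter by layer type
    if isinstance(m, nn.Conv2d):
        gain = nn.init.calculate_gain("tanh")  # recommended gain for tanh
        nn.init.xavier_uniform_(m.weight, gain=gain)
        if m.bias is not None:
            nn.init.zeros_(m.bias)


model = Model()
model.apply(xavier_init)  # overwrites the default initialization in place

out = model(torch.randn(2, 3, 8, 8))
print(out.shape)
```

The key point is that the initialization is something you request on the layer; calling tanh in forward() plays no part in it.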

Thanks for the answer and sorry for the unrelated question. I find your answer so professional and detailed that I can’t resist relating my questions to the course topics.

Is there a place in this forum for topics that are not related to any course?

There’s always “General Discussion” for topics that are not related to any of the courses.