I wasn’t quite sure where to put this question, but in the Deep Learning Specialisation a good deal of emphasis was put on He/Xavier initialisations, yet in the PyTorch class this isn’t even mentioned.
Does it ‘do it for you’, and if so, how?
Does this help?
True @Nevermnd, perhaps this should go into feedback to include in future courses related to pytorch.
Actually, I find very few research papers on the web covering complex architectures in PyTorch.
But He and Xavier initialisation is a very interesting topic. Here is what I found, which you have probably seen too.
Model

```python
import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(128, 64)
        self.tanh1 = nn.Tanh()  # Example of using Tanh for demonstration
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.tanh1(x)
        x = self.fc3(x)
        return x
```
He Initialisation
```python
def initialize_weights(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            # He initialization (kaiming) for layers before ReLU
            nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='relu')
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        # You can add conditions for other layer types (e.g., Conv2d, BatchNorm)
        # and other activation functions.
```
Xavier Initialisation - If one specifically wants to use Xavier initialization for all layers (e.g. when using tanh or sigmoid throughout), one would use `nn.init.xavier_normal_` or `nn.init.xavier_uniform_`.
```python
def initialize_weights_xavier(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            # Xavier uniform initialization (suitable for tanh/sigmoid)
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

model = MyNetwork()
initialize_weights(model)
Probably a "PyTorch for Advanced Techniques" Specialisation should be created, with complex models, various implementations, and attention to getting good model accuracy with respect to the dataset used. These labs would probably run more successfully in a Colab-assisted environment, since GPU/TPU usage is available with Colab, just as the TensorFlow Advanced Techniques specialisation had larger datasets and complex model architectures: challenging but fun.
@Deepti_Prasad thanks for this feedback, and I am just surprised it wasn’t mentioned / didn’t come up. I wonder what a standard layer initialises to without making these calls. Just zeros?
But we know that wouldn’t work, because of the need for Symmetry Breaking, right?
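A quick way to see the symmetry-breaking problem in code: if every weight in a layer starts at the same value, every hidden neuron receives an identical gradient, so the neurons can never become different from one another. A minimal sketch (the tiny layer sizes and the constant 0.5 are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 2))

# Give every parameter the same constant value instead of a random one.
for p in net.parameters():
    nn.init.constant_(p, 0.5)

x = torch.randn(8, 4)
loss = net(x).sum()
loss.backward()

grad = net[0].weight.grad  # shape (3, 4): one row per hidden neuron
# All hidden neurons get identical gradients, so they stay identical forever.
print(torch.allclose(grad[0], grad[1]), torch.allclose(grad[1], grad[2]))
```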
Here’s what google has to say:
In TensorFlow, the default initialization is Glorot Uniform (Xavier Uniform):
Of course in both platforms, APIs are provided if you want or need to use different initializations, as Balaji and Deepti showed us for torch above.
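For the record, PyTorch’s `nn.Linear` does not initialise to zeros either: per the `torch.nn.Linear` documentation, the default weights are drawn from U(-sqrt(k), sqrt(k)) with k = 1/in_features, so symmetry is broken out of the box. One can verify this empirically on a freshly constructed layer:

```python
import math
import torch
import torch.nn as nn

layer = nn.Linear(784, 128)  # freshly constructed, default initialisation

# Weights are not zero: symmetry is already broken.
assert layer.weight.abs().sum().item() > 0

# Default weights stay within the documented uniform bounds +-1/sqrt(fan_in).
bound = 1.0 / math.sqrt(layer.weight.shape[1])
print(layer.weight.min().item(), layer.weight.max().item(), bound)
```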
In my experience, @Nevermnd, PyTorch is more focused on how various functions, layers, and methods are composed into model architectures.
Of course the course didn’t dig into techniques like object detection or generative-AI models, but you surely can’t disagree that it explained the basic implementation techniques used when creating a model and putting it through training and testing.
As it was the first specialisation in PyTorch, the developers must have wanted to take the standard basic route rather than explaining complex initialisation, given the range of learners being encouraged toward PyTorch.
I am not staff, but we can surely hope that more complex, advanced techniques with PyTorch are something we learners can look forward to in future.
You can give your feedback, as I said earlier, by selecting the feedback tag and the course category, and let the community coordinator know what you would like to learn about PyTorch going forward.