I wasn’t quite sure where to put this question, but in the Deep Learning Specialisation a good deal of emphasis was put on He/Xavier initialisations, yet in the PyTorch class this isn’t even mentioned.
Does it ‘do it for you’, and if so, how?
Does this help?
True @Nevermnd, perhaps this should go into feedback to include in future courses related to pytorch.
Actually, I find very few research papers on the web covering complex architectures in PyTorch.
But He and Xavier initialisation is a very interesting topic. Here is what I found, which you have probably seen too.
Model

```python
import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(128, 64)
        self.tanh1 = nn.Tanh()  # Example of using Tanh for demonstration
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.tanh1(x)
        x = self.fc3(x)
        return x
```
He Initialisation
```python
def initialize_weights(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            # He initialization (kaiming) for layers before ReLU
            nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='relu')
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        # You can add conditions for other layer types (e.g., Conv2d, BatchNorm)
        # and other activation functions.
```
Xavier Initialisation - If one specifically wants to use Xavier initialization for all layers (e.g. when using tanh or sigmoid throughout), one would use `nn.init.xavier_normal_` or `nn.init.xavier_uniform_`.
```python
def initialize_weights_xavier(model):
    for m in model.modules():
        if isinstance(m, nn.Linear):
            # Xavier uniform initialization (suitable for tanh/sigmoid)
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

model = MyNetwork()
initialize_weights(model)
Probably a "PyTorch for Advanced Techniques" Specialisation should be created, with complex models, various implementations, and attention to getting good model accuracy with respect to the dataset used. These labs would probably run more successfully in a Colab-assisted environment, since GPU/TPU usage is available with Colab, just as the TensorFlow Advanced Techniques specialisation had larger datasets and complex model architectures: challenging but fun.
@Deepti_Prasad thanks for this feedback, and I am just surprised it wasn’t mentioned / didn’t come up. I wonder what a standard layer initialises to without making these calls. Just zeros?
But we know that wouldn’t work, because of the need for Symmetry Breaking, right?
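A quick way to see the symmetry-breaking problem in code: if every weight in a layer starts at the same value, every hidden neuron receives an identical gradient, so the neurons can never become different from one another. A minimal sketch (the tiny layer sizes and the constant 0.5 are arbitrary choices for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 2))

# Give every parameter the same constant value instead of a random one.
for p in net.parameters():
    nn.init.constant_(p, 0.5)

x = torch.randn(8, 4)
loss = net(x).sum()
loss.backward()

grad = net[0].weight.grad  # shape (3, 4): one row per hidden neuron
# All hidden neurons get identical gradients, so they stay identical forever.
print(torch.allclose(grad[0], grad[1]), torch.allclose(grad[1], grad[2]))
```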
Here’s what google has to say:
In TensorFlow, the default initialization is Glorot Uniform (Xavier Uniform):
Of course in both platforms, APIs are provided if you want or need to use different initializations, as Balaji and Deepti showed us for torch above.
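For the record, PyTorch’s `nn.Linear` does not initialise to zeros either: per the `torch.nn.Linear` documentation, the default weights are drawn from U(-sqrt(k), sqrt(k)) with k = 1/in_features, so symmetry is broken out of the box. One can verify this empirically on a freshly constructed layer:

```python
import math
import torch
import torch.nn as nn

layer = nn.Linear(784, 128)  # freshly constructed, default initialisation

# Weights are not zero: symmetry is already broken.
assert layer.weight.abs().sum().item() > 0

# Default weights stay within the documented uniform bounds +-1/sqrt(fan_in).
bound = 1.0 / math.sqrt(layer.weight.shape[1])
print(layer.weight.min().item(), layer.weight.max().item(), bound)
```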
In my experience, @Nevermnd, PyTorch is more focused on how various functions, layers, and methods are composed into model architectures.
Of course the course didn’t dig into techniques like object detection or generative-AI models, but you surely can’t disagree that it explained the basic implementation techniques used when creating a model and putting it through training and testing.
As it was the first specialisation in PyTorch, the developers must have wanted to take the standard basic route rather than explaining complex initialisation, given the range of learners being encouraged toward PyTorch.
I am not staff, but we can surely hope that more complex, advanced techniques with PyTorch are something we learners can look forward to in future.
You can give your feedback, as I said earlier, by selecting the feedback tag and the course category, and let the community coordinator know what you would like to learn about PyTorch going forward.