Pruning untrained networks

In the lab (C3_M4_lab3) we perform pruning on an untrained network. Is this just for demonstration purposes? I mean, wouldn't you want to prune a trained network, and isn't that the whole point? Otherwise, instead of pruning, you could just restructure your network from the get-go.

Also, I’ll ask it here since there is an example of it in the same lab. For network structure we have:

self.conv1 = nn.Conv2d(3, 16, kernel_size=3)
self.relu1 = nn.ReLU()
self.fc1 = nn.Linear(16 * 4 * 4, 10)  # Assuming input is 6x6

Can someone remind me of the formula we get the 16 * 4 * 4 from?

There are 16 channels in the output of the conv layer, the inputs are 6 x 6 and we are doing a convolution with stride = 1, padding = 0 and filter size = 3. Here’s the formula for figuring out the h and w dimensions of the output:

n_{out} = \displaystyle \lfloor \frac {n_{in} + 2p - f}{s} \rfloor + 1

So we have:

n_{out} = \displaystyle \lfloor \frac {6 + 2 \cdot 0 - 3}{1} \rfloor + 1 = 4
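
That formula is easy to check with a tiny helper function (a sketch; `conv_out` is a name made up here, not something from the lab):

```python
def conv_out(n_in, f, p=0, s=1):
    """Spatial output size of a conv layer: floor((n_in + 2p - f) / s) + 1."""
    return (n_in + 2 * p - f) // s + 1

# The lab's case: 6x6 input, 3x3 kernel, padding 0, stride 1
print(conv_out(6, 3))                 # 4
print(16 * conv_out(6, 3) ** 2)       # 256, the in_features of fc1
```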

Sorry, on the pruning question I haven’t gotten to PyTorch C3 yet, so I dunno what Professor Laurence says on that. But just on general principles, your analysis sounds right to me: what’s the point of pruning before training? You have no information until you try the training and see what results you get.

I didn’t check the lab yet, but as far as I remember, @Nevermnd,
they are using trained model weights and selectively pruning the low-value weights, less significant parameters, or removing less significant layers. They probably call this selectively pruned model an “untrained network”, but it is using trained model weights, so your point that pruning should be done on a trained model is right.
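
For reference, the kind of magnitude-based pruning described above can be sketched with `torch.nn.utils.prune` (a minimal illustration on a freshly initialized layer, not the lab's exact code):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(16, 8)

# Zero out the 50% of weights with the smallest absolute value (L1 magnitude)
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Pruning attaches a binary mask: the effective weight is weight_orig * weight_mask
sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.2f}")

# Make the pruning permanent (drops the mask and the reparametrization)
prune.remove(layer, "weight")
```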

But when I did some digging, I learned that in some practices model pruning is done on untrained models too, honestly probably based on techniques that choose, before training, which parameters would be most significant for a better training outcome.

@paulinpaloalto you say “assume it is 6 x 6”… But how would you know it was 6 x 6 if the 16 * 4 * 4 comment wasn’t given to you? I don’t get where you’d get that from the Conv2d call.

I’m not familiar with that exercise, so I don’t know what they said in the assignment about the input. But there is a comment in the code you pasted saying “Assuming input is 6 x 6”. That is what I based that on. And it all makes sense from the formula that was shown.

Of course, you can also do the algebra using that formula to “backsolve” for n_{in} if n_{out} is 4. We see from reading the docpage for torch.nn.Conv2d that the defaults are stride = 1 and padding = 0. From there, it’s simple:

4 = n_{in} + 2 * 0 - 3 + 1 = n_{in} - 2
n_{in} = 6

Just checked the lab: they are using a trained ResNet-18 model as the baseline model, and using the simple model to demonstrate structured and unstructured pruning.
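
On the structured vs. unstructured distinction: structured pruning removes whole channels rather than scattered individual weights. A hedged sketch with `torch.nn.utils.prune.ln_structured`, using the same kind of conv layer as the lab (not the lab's code):

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(3, 16, kernel_size=3)

# Zero out half of the output channels (dim=0) with the smallest L2 norm
prune.ln_structured(conv, name="weight", amount=0.5, n=2, dim=0)

# Entire filters are now all-zero, not individual weights here and there
zeroed_filters = sum(
    int(conv.weight[i].abs().sum() == 0) for i in range(conv.weight.shape[0])
)
print(zeroed_filters)  # 8 of the 16 filters
```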

Also, regarding the input: the statement “assuming input is 6x6” is probably the only direct indication, since the explanation doesn’t mention the input shape.

They should probably add this description to the instructions, so learners aren’t confused about the input shape.

Yes, my goal here is not so much the labs, but knowing how to do the math for the linear layer transformation when I make my own NNs.

I will probably create a repo ticket on this, as it is a valid concern; at the least, adding the input shape information to the instructions instead of the assumption statement is a better way to clear up the confusion.

Thank you for reporting this.
Regards
DP

@Mubsi

I have raised a repo ticket based on the learner’s query to add instructions in the exercise section about the input in the model definition. Kindly look into the issue.

Regards
DP

Thanks all.

@Nevermnd, I have updated the markdown and code to avoid any confusion:


Model Definition

To demonstrate various pruning techniques, you’ll define a SimpleModel for this purpose.

  • Demonstration Only: This model will not be used for training on real data. It is a dummy architecture designed solely to illustrate pruning mechanics on a small, manageable scale.
  • Key Layers: It contains the two layers you will target for pruning: a torch.nn.Conv2d layer (conv1) and a torch.nn.Linear layer (fc1).
  • Strict Input Requirement: The model is hard-coded to accept an input size of 6 × 6 pixels.
  • Dimension Logic: The fully connected layer is initialized as nn.Linear(16 * 4 * 4, 10) based on standard convolution arithmetic:
    • A 6 × 6 input processed by a 3 × 3 kernel (with no padding) results in a 4 × 4 spatial output.
    • Flattening this output (16 channels × 4 × 4) requires exactly 256 inputs for the linear layer.
  • Output Assumption: The final layer produces 10 output features, assuming a classification task with 10 distinct classes.
# Create a simple model for demonstration
class SimpleModel(nn.Module):
    """
    A simple convolutional neural network defined for demonstration purposes.
    """
    def __init__(self):
        """
        Initializes the SimpleModel with a convolutional layer and a fully connected layer.
        """
        # Call the parent class initializer
        super(SimpleModel, self).__init__()
        # Define the first convolutional layer with 3 input channels and 16 output channels
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3)
        # Define the ReLU activation function
        self.relu1 = nn.ReLU()
        # Define the fully connected layer
        # The input size is calculated based on a 6x6 input image: (6-3+1) = 4x4 spatial output
        self.fc1 = nn.Linear(16 * 4 * 4, 10)  # Assuming input is 6x6

    def forward(self, x):
        """
        Defines the forward pass of the model.

        Args:
            x: The input tensor containing the image data.

        Returns:
            The output tensor after passing through the network.
        """
        # Apply the convolutional layer followed by the ReLU activation
        x = self.relu1(self.conv1(x))
        # Flatten the tensor dimensions for the fully connected layer
        x = x.view(x.size(0), -1)
        # Pass the flattened tensor through the fully connected layer
        x = self.fc1(x)
        return x
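
A quick sanity check of the shapes (a sketch that re-declares the two layers so it runs standalone, rather than importing the lab's `SimpleModel`):

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(3, 16, kernel_size=3)   # same layers as the model above
fc1 = nn.Linear(16 * 4 * 4, 10)

x = torch.randn(2, 3, 6, 6)   # batch of 2 three-channel 6x6 images
x = torch.relu(conv1(x))      # -> (2, 16, 4, 4): spatial size 6 - 3 + 1 = 4
x = x.view(x.size(0), -1)     # -> (2, 256): flatten 16 * 4 * 4
out = fc1(x)                  # -> (2, 10): one logit per class
print(out.shape)
```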

Paul already provided you with the math on how to do the calculations.

Best,
Mubsi

The optional section covers this part. Everything before it is the what, the how, and the types of weight pruning.

I’m curious, are you an oral surgeon? Because I’m a maxillofacial surgeon in the Dominican Republic.

@Cangrejamenor13

I am a dental surgeon with a specialisation in Endodontics. Sorry, my Spanish is not good.

Good to know there is a dentist presence here in dlai; I always felt like an outlier here :joy::rofl:

By the way, please always create a new topic instead of posting on another learner’s topic, so as to avoid drifting away from the original learner’s question.

Thank you very much.
Deepti