Why is linear quantization not suitable for convolutional layers?

ziyingsk · May 8, 2024, 2:57pm

During the course, the teachers seem to avoid quantizing convolutional layers, but why? As I understand convolutional layers, it should also be fine to quantize them.

gent.spah · May 9, 2024, 6:04am

I have not done this course, but I think it is because you don't want to compromise the structure of the neural network. You want to quantize where doing so is not critically damaging!

ziyingsk · May 9, 2024, 6:26am

Hi! Thanks for your reply. I am still confused: why would quantizing the convolutional kernels cause greater damage than quantizing a linear layer?

gent.spah · May 9, 2024, 6:27am

I would guess that a linear layer is straightforward and can be simplified more easily than a more complex structure like a convolutional layer!

ziyingsk · May 9, 2024, 6:39am

I asked ChatGPT, and it says it should be fine to quantize convolutional layers.
I also found a document that discusses quantizing convolutional layers:
(gemmlowp/doc/quantization.md at master · google/gemmlowp · GitHub)

So I assume this is maybe just because it is too complicated to show in the course?

gent.spah · May 9, 2024, 6:43am

Hmm, as long as the most important information is not lost in the process, maybe it's right!

Deepti_Prasad · May 9, 2024, 7:07am

The instructor probably meant it from the perspective of not wanting to reduce the bit-width of the weights and activations in those layers, which is what would reduce the model size and computation cost, and so described it as not suitable for convolution layers.

Quantization of a convolution layer can be linear or non-linear. The instructor probably chose not to do it here because they didn't want to reduce the model size in that way. And yes, linear quantization can be done for convolution layers.
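To make the last point concrete, here is a minimal sketch of linear quantization applied to a convolution kernel. It is not taken from the course: the symmetric per-output-channel int8 scheme, the kernel shape, and the helper names `quantize_symmetric_per_channel` and `dequantize` are all assumptions chosen for illustration.

```python
import numpy as np

def quantize_symmetric_per_channel(w, num_bits=8):
    """Linearly quantize a conv kernel of shape (out_ch, in_ch, kh, kw).

    Hypothetical helper: one scale per output channel, zero-point fixed at 0.
    """
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for int8
    max_abs = np.abs(w).reshape(w.shape[0], -1).max(axis=1)
    # Guard against an all-zero channel so we never divide by zero.
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    q = np.round(w / scale.reshape(-1, 1, 1, 1))
    q = np.clip(q, -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the int8 values back to floats using the per-channel scales."""
    return q.astype(np.float32) * scale.reshape(-1, 1, 1, 1)

# A random 3x3 kernel with 3 input and 16 output channels (illustrative only).
rng = np.random.default_rng(0)
kernel = rng.standard_normal((16, 3, 3, 3)).astype(np.float32)

q_kernel, scale = quantize_symmetric_per_channel(kernel)
recovered = dequantize(q_kernel, scale)
max_error = float(np.abs(kernel - recovered).max())
print("stored dtype:", q_kernel.dtype, "max abs error:", max_error)
```

The kernel is just a 4-D weight tensor, so the same scale-and-round mapping used for a linear layer's 2-D weight matrix applies here; the rounding error per weight is bounded by about half of that channel's scale.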

Regards
DP

ziyingsk · May 9, 2024, 8:22am

Cool! Thanks a lot.

Rishan_Tan · May 28, 2024, 6:20pm

It may also be that FC layers tend to have more parameters than conv layers, so quantizing them saves more memory.
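A rough back-of-the-envelope check of that idea, with layer sizes that are assumptions picked only for illustration (a 3x3 conv with 64 input and 64 output channels versus a 4096-to-4096 fully connected layer):

```python
def conv2d_params(in_ch, out_ch, kh, kw, bias=True):
    """Parameter count of a standard 2-D convolution layer."""
    return out_ch * in_ch * kh * kw + (out_ch if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Parameter count of a fully connected (linear) layer."""
    return in_features * out_features + (out_features if bias else 0)

# Hypothetical layer sizes, not taken from any model in the course.
conv_count = conv2d_params(64, 64, 3, 3)
fc_count = linear_params(4096, 4096)

# Going from float32 (4 bytes) to int8 (1 byte) saves 3 bytes per parameter.
conv_saved_bytes = conv_count * 3
fc_saved_bytes = fc_count * 3

print("conv params:", conv_count, "bytes saved:", conv_saved_bytes)
print("fc params:  ", fc_count, "bytes saved:", fc_saved_bytes)
```

With these sizes the FC layer has a few hundred times more weights than the conv layer, so most of the storage saving comes from it. This depends on the architecture, though; networks made up almost entirely of conv layers would not follow this pattern.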
