Quantization in the quanto course

Hi guys, I’m locally implementing quantization of the EleutherAI/pythia-410m model from lesson 4. When I run the quantization and call the method

quantize(model, weights=torch.int8, activations=None)

I get the following error:

KeyError: torch.int8

I’m on newer versions of the packages than the ones used in the course. If someone could explain what is happening, I would appreciate it. Thanks in advance.

The versions you are using appear not to be compatible with some of your other tools.

Hi TMosh, yes, that’s right, I have the latest versions of the packages. The issue is now resolved; I’m sharing the fix here so others can avoid the same error.

First, import the quanto package:

import quanto

Then call the method as follows:

quanto.quantize(model, weights=quanto.qint8, activations=None)
The output when printing the model matches what the lesson shows:
(q): QLinear(in_features=512, out_features=384, bias=False)
(k): QLinear(in_features=512, out_features=384, bias=False)
(v): QLinear(in_features=512, out_features=384, bias=False)
(o): QLinear(in_features=384, out_features=512, bias=False)
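For anyone curious what the `weights=quanto.qint8` option is doing conceptually, here is a minimal, dependency-free sketch of symmetric per-tensor int8 quantization. This is purely illustrative and not quanto’s actual implementation (quanto’s QLinear uses per-axis scales and optimized kernels internally); it only shows the basic scale/round/clamp idea behind int8 weight quantization:

```python
# Illustrative sketch of symmetric int8 quantization (NOT quanto's real code).
# Float weights are mapped to integers in [-127, 127] via a single scale factor.

def quantize_int8(weights):
    """Map float weights to int8 values plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # one scale for the whole tensor
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored value is within half a quantization step of the original.
```

The rounding step is why a quantized model is smaller but slightly lossy: each weight is recoverable only up to `scale / 2`.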

The same issue occurred with the T5-FLAN implementation.

I hope this helps someone else who runs into the same problem.


The courses seldom use the newest versions of any packages. It’s very difficult to keep the courses updated with the rapid rate of change in the industry.


Thank you @freewilly - your solution worked with the latest torch and quanto packages.