Quantization in the quanto course

Hi guys, I’m locally implementing quantization of the EleutherAI/pythia-410m model from lesson 4. When I run the quantization and call the method

quantize(model, weights=torch.int8, activations=None)

I get the following error:

KeyError: torch.int8

I’m on newer versions of the packages than the ones used in the course. If someone could explain what is happening, I would appreciate it. Thanks in advance.

The versions you are using appear not to be compatible with some of your other tools.

Hi TMosh, yes, that’s right, I have the latest versions of the packages. The issue is now resolved; I’m sharing the fix here so others can avoid the same error.

First, import the quanto package:

import quanto

Then call the method as follows:

quanto.quantize(model, weights=quanto.qint8, activations=None)
The output when printing the model matches what the lesson shows:
(q): QLinear(in_features=512, out_features=384, bias=False)
(k): QLinear(in_features=512, out_features=384, bias=False)
(v): QLinear(in_features=512, out_features=384, bias=False)
(o): QLinear(in_features=384, out_features=512, bias=False)
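For anyone curious what the `weights=quanto.qint8` option is doing conceptually, here is a minimal, dependency-free sketch of symmetric per-tensor int8 quantization. This is purely illustrative and not quanto’s actual implementation (quanto’s QLinear uses per-axis scales and optimized kernels internally); it only shows the basic scale/round/clamp idea behind int8 weight quantization:

```python
# Illustrative sketch of symmetric int8 quantization (NOT quanto's real code).
# Float weights are mapped to integers in [-127, 127] via a single scale factor.

def quantize_int8(weights):
    """Map float weights to int8 values plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0  # one scale for the whole tensor
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored value is within half a quantization step of the original.
```

The rounding step is why a quantized model is smaller but slightly lossy: each weight is recoverable only up to `scale / 2`.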

The same issue occurred with the T5-FLAN implementation.

I hope this helps someone else who runs into the same problem.


The courses seldom use the newest versions of any packages. It’s very difficult to keep the courses updated with the rapid rate of change in the industry.


Thank you @freewilly - your solution worked with the latest torch and quanto packages.