Hi, after viewing and reviewing the pruning videos, I'm not sure I got the whole picture. Model pruning identifies weights in the model that can be zeroed, so in some cases a new model architecture may be better for speed/memory. Yet after pruning and removing the hooks, the model still needs to be inspected to decide which blocks can be removed. What if the zeroed weights are scattered so sparsely that no module can be removed? In short, pruning may be nice in theory, but I'm not sure it has any practical significance. What am I missing?
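To make the question concrete, here is a minimal PyTorch sketch (assuming `torch.nn.utils.prune`, which is the hook-based pruning workflow the course videos appear to use) showing exactly this: unstructured pruning zeroes weights and, even after the hooks are removed with `prune.remove`, the weight tensor keeps its original shape, so the stored model is no smaller.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
layer = nn.Linear(16, 8)  # hypothetical small layer, 128 weights

# L1 (magnitude) unstructured pruning: zero the 50% smallest weights.
# This attaches a forward pre-hook and a weight_mask buffer to the module.
prune.l1_unstructured(layer, name="weight", amount=0.5)
sparsity = float((layer.weight == 0).float().mean())

# prune.remove() folds the mask into the tensor and deletes the hook.
# The zeros stay, but the tensor shape - and therefore the stored model
# size - is exactly what it was before pruning.
prune.remove(layer, "weight")
print(f"sparsity: {sparsity:.2f}, weight shape: {tuple(layer.weight.shape)}")
# sparsity: 0.50, weight shape: (8, 16)
```

The practical payoff of this kind of sparsity only shows up with sparse storage formats or hardware/kernels that skip zeros; otherwise it mainly helps as a step toward structured pruning.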
General question about model pruning
Hello @Marcelo_Bacher,
I think you need to consider the device you are working on. Model pruning is a technique to compress the data set so you can work with devices that don't meet certain specifications for high computational capacity, without losing much accuracy or other properties that could affect the desired output of your application. Let me give an example from my experience working with embedded devices (mostly microcontroller development boards). In my case the devices don't have enough storage to hold large data sets, and most of them come without expansion card slots that could add a bit more storage. In that scenario the technique is a real benefit for running the application locally, and of course it depends on what the data set is and what output you want.
I hope this gives you some new ways to think about it; you'll see some examples as you advance through the course.
Best Regards.
Hello @Victor_Hugo_Alves_Ca,
Could you please share a model pruning scenario involving real-world data? I wasn't happy with last week's videos either.
I ended up more confused than enlightened. Although I understand that pruning is done mostly from a memory perspective, why does one then need the selective layer pruning the instructor mentions?
I also got confused when quantization was explained. I understood static and dynamic quantization, but I'd like to see some real-world examples. If you know of any, or of a GitHub repo that follows these methods, can you share it so I can understand the significance more concretely?
Thank you in advance
DP
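On the dynamic-quantization part of the question, a minimal sketch of the typical real-world usage (assuming PyTorch's `torch.quantization.quantize_dynamic`; the `nn.Sequential` model here is a toy stand-in for the LSTM/Transformer-style models this is usually applied to):

```python
import torch
import torch.nn as nn

# Toy stand-in for the kind of model (e.g. a text-classifier head) that
# dynamic quantization is typically applied to in practice.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))

# Dynamic quantization: Linear weights are converted to int8 up front,
# while activations are quantized on the fly at inference time.
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
out = qmodel(x)
print(type(qmodel[0]).__name__, tuple(out.shape))
```

The quantized model produces outputs of the same shape as the original, but the Linear weights are stored in int8, roughly quartering their memory footprint; static quantization differs in that it also fixes activation scales ahead of time using a calibration pass.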
Thanks for your reply. I understood from the videos, and also from some web searching/Gemini, that pruning relates to the model weights and not to the data; maybe that is what I missed. The model itself stays the same size because the selected weights are merely zeroed by the pruning. Only in the case of structured pruning can a new architecture actually reduce the model's size (e.g., by removing channels, shrinking kernel sizes, etc.).
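That structured/unstructured distinction can be sketched in a few lines. Here is a hedged PyTorch example (assuming `torch.nn.utils.prune.ln_structured`; the manual rebuild at the end is an illustration, not a library feature) showing that structured pruning zeroes whole rows, which is what makes an actual architecture reduction possible:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)
layer = nn.Linear(16, 8)

# Structured pruning: zero the whole rows (output units) with the
# smallest L2 norm, instead of scattering zeros across the matrix.
prune.ln_structured(layer, name="weight", amount=0.25, n=2, dim=0)
prune.remove(layer, "weight")

zero_rows = layer.weight.abs().sum(dim=1) == 0
print(int(zero_rows.sum()), "of", layer.out_features, "rows zeroed")  # 2 of 8

# Only now can the architecture actually shrink: rebuild a smaller layer
# from the surviving rows. In a real network, the matching input columns
# of the *next* layer would have to be dropped as well.
keep = ~zero_rows
smaller = nn.Linear(layer.in_features, int(keep.sum()))
with torch.no_grad():
    smaller.weight.copy_(layer.weight[keep])
    smaller.bias.copy_(layer.bias[keep])
print(tuple(smaller.weight.shape))  # (6, 16)
```

With unstructured pruning the zeros land anywhere in the matrix, so no row or channel can be dropped and the dense tensor stays full size, which is exactly the practical limitation raised in the opening post.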