Unzipping Pruned Models before running will increase the space taken by the model on edge devices

If zipping is essential after pruning to reduce the size of the model, do we need to unzip it on the edge device before passing it to the interpreter? If yes, doesn't that take away the advantage of pruning, since the unzipped model's size will be the same as the original model's? I am of course assuming that the model is non-quantized.
Even if we consider quantization after pruning, the unzipped model size on the edge device will be similar to that of a model that was only quantized.
Is there a way to run zipped models without unzipping them?

You can’t interpret the model weights without an unzip operation.
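
That said, on devices that can run the Python TFLite interpreter you can at least avoid writing the unzipped file back to storage by decompressing into memory and handing the raw bytes to the interpreter; the flatbuffer still occupies its full uncompressed size in RAM. A minimal sketch, assuming a gzip-compressed model at a hypothetical path `model.tflite.gz`:

```python
import gzip

import tensorflow as tf

# Hypothetical path to a gzip-compressed TFLite flatbuffer.
COMPRESSED_MODEL_PATH = "model.tflite.gz"

# Decompress into memory only; nothing extra is written to flash,
# but the uncompressed flatbuffer still takes its full size in RAM.
with gzip.open(COMPRESSED_MODEL_PATH, "rb") as f:
    model_bytes = f.read()

# The interpreter needs the raw flatbuffer bytes, not the gzip stream.
interpreter = tf.lite.Interpreter(model_content=model_bytes)
interpreter.allocate_tensors()
```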

We zip the final model for 2 reasons:

  1. Even when we only quantize and don’t prune, zipping can save a good amount of space when a lot of the weights are 0.
  2. When distributing the final model, it makes sense to compare the footprint of the zipped file against that of your pruned and quantized model (see the sketch below). This matters when you consider the power cost of fetching the model from a server.
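
As a rough illustration of reason 2, you can measure the gzip footprint of each exported model, similar in spirit to the gzipped-size helper used in the TF Model Optimization pruning tutorial. A minimal sketch, assuming hypothetical files `baseline.tflite` and `pruned_quantized.tflite` already exist:

```python
import gzip
import os

def gzipped_size(path):
    """Return the size of the file after gzip compression, in bytes."""
    with open(path, "rb") as f:
        return len(gzip.compress(f.read()))

# Hypothetical file names for the exported TFLite models.
for name in ["baseline.tflite", "pruned_quantized.tflite"]:
    print(f"{name}: on disk {os.path.getsize(name)} bytes, "
          f"gzipped {gzipped_size(name)} bytes")
```

The gzipped number is the one that approximates the download size (and hence the transfer power cost), while the on-disk number is what the device actually needs once the model is unzipped for the interpreter.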