Assume the hardware supports int8, so we can quantize both activations and weights and keep everything in int8. If there is more than one FC/linear layer, should we re-quantize directly from one layer to the next, instead of repeatedly dequantizing and re-quantizing the activations between layers? And how would this compare, in terms of precision, with quantizing only the weights?
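To make the question concrete, here is a rough NumPy sketch of the two flows I mean. It is not tied to any framework's API; the per-tensor symmetric scales, layer sizes, and the float requantization step (which a real int8 kernel would replace with a fixed-point multiplier) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_for(x):
    """Per-tensor symmetric scale: map the max magnitude to 127."""
    return np.abs(x).max() / 127.0

def quantize(x, scale):
    """Symmetric per-tensor quantization of float values to int8."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

# Float "reference" input and two FC layers (sizes are arbitrary).
x_f  = rng.standard_normal((1, 16)).astype(np.float32)
w1_f = rng.standard_normal((16, 16)).astype(np.float32)
w2_f = rng.standard_normal((16, 8)).astype(np.float32)

a1_f  = x_f @ w1_f            # float activation after layer 1 (used for calibration and reference)
y_ref = a1_f @ w2_f           # float reference output

s_x, s_w1, s_w2, s_a1 = scale_for(x_f), scale_for(w1_f), scale_for(w2_f), scale_for(a1_f)

# --- Flow A: everything in int8, re-quantize the int32 accumulator between layers ---
x_q, w1_q, w2_q = quantize(x_f, s_x), quantize(w1_f, s_w1), quantize(w2_f, s_w2)
acc1 = x_q.astype(np.int32) @ w1_q.astype(np.int32)   # int32 accumulator, effective scale s_x*s_w1
a1_q = quantize(acc1 * (s_x * s_w1), s_a1)             # requantize straight to int8 for the next layer
acc2 = a1_q.astype(np.int32) @ w2_q.astype(np.int32)
y_a  = acc2 * (s_a1 * s_w2)                             # dequantize once, at the very end

# --- Flow B: weight-only quantization, activations stay float32 ---
w1_deq = quantize(w1_f, s_w1).astype(np.float32) * s_w1
w2_deq = quantize(w2_f, s_w2).astype(np.float32) * s_w2
y_b = (x_f @ w1_deq) @ w2_deq

print("flow A (full int8)    max abs err:", np.abs(y_a - y_ref).max())
print("flow B (weight-only)  max abs err:", np.abs(y_b - y_ref).max())
```

The way I understand it, flow A adds an extra rounding/clipping error at each layer boundary (the activation quantization), while flow B only has weight rounding error, so I'd expect B to be more accurate but A to be what the int8 hardware actually wants. Is that the right way to think about it?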