Abstract: Highlights
• Introduces, for the first time, an exponential moving average (EMA) mechanism to stabilize activation scale updates.
• Incorporates more dissimilar activation maps into the weight reconstruction optimization to improve post-training quantization (PTQ) accuracy.
• Achieves substantial improvements across several quantization bit-widths, especially W2A4 (2-bit weights, 4-bit activations).
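To illustrate the first highlight, the sketch below shows a generic exponential moving average applied to a sequence of activation scale estimates. It is a minimal illustration only: the function names `ema_update` and `track_scale`, the momentum value 0.9, and the sample scale values are all assumptions made here, and the exact update rule used in the paper may differ.

```python
def ema_update(prev_scale, observed_scale, momentum=0.9):
    """Blend the previous scale with a newly observed one (generic EMA).

    momentum is a hypothetical hyperparameter; values closer to 1.0
    make the tracked scale change more slowly.
    """
    return momentum * prev_scale + (1.0 - momentum) * observed_scale


def track_scale(observed_scales, momentum=0.9):
    """Return the EMA-smoothed scale after each observation."""
    scale = observed_scales[0]  # initialize from the first observation
    history = [scale]
    for obs in observed_scales[1:]:
        scale = ema_update(scale, obs, momentum)
        history.append(scale)
    return history


# Made-up scale estimates that oscillate from one batch to the next.
noisy = [1.0, 3.0, 1.0, 3.0, 1.0]
smoothed = track_scale(noisy, momentum=0.9)
```

With a momentum close to 1, the tracked scale moves only a small step toward each new estimate, so batch-to-batch oscillations in the raw estimates are damped rather than passed directly to the quantizer.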