2019 (modified: 06 Feb 2025) · ICML 2019 · Readers: Everyone
Abstract: Quantization can improve the execution latency and energy efficiency of neural networks on both commodity GPUs and specialized accelerators. The majority of existing literature focuses on training ...