BAQET: BRAM-aware Quantization for Efficient Transformer Inference via Stream-based Architecture on an FPGA

Published: 01 Jan 2025, Last Modified: 12 May 2025, Venue: FPGA 2025, License: CC BY-SA 4.0