SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization

Published: 01 Jan 2024, Last Modified: 05 Oct 2025, NeurIPS 2024, Everyone, CC BY-SA 4.0