PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System

Published: 2025, Last Modified: 28 May 2026. Venue: ASPLOS (2) 2025. License: CC BY-SA 4.0