Efficiency Unleashed: Inference Acceleration for LLM-based Recommender Systems with Speculative Decoding

Yunjia Xi, Hangyu Wang, Bo Chen, Jianghao Lin, Menghui Zhu, Weiwen Liu, Ruiming Tang, Zhewei Wei, Weinan Zhang, Yong Yu

Published: 13 Jul 2025, Last Modified: 21 Jan 2026CrossrefEveryoneRevisionsCC BY-SA 4.0
Loading