You Only Cache Once: Decoder-Decoder Architectures for Language Models.

Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei

07 Nov 2025 | NeurIPS 2024 | Everyone | CC BY-SA 4.0