TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling.

Liang-Hsuan Tseng, Yi-Chang Chen, Kuan-Yi Lee, Da-Shan Shiu, Hung-yi Lee

09 Oct 2025 · CoRR 2025 · Everyone · CC BY-SA 4.0