Keywords: Audio coding, neural audio codecs, speech language model
Abstract: Neural audio codecs are foundational to speech language models; they are expected to operate at a low frame rate and to decouple semantic from acoustic information. A lower frame rate reduces the computational cost of speech language models by shortening the sequence length. Recent studies have developed 12.5Hz low-frame-rate audio codecs, but even lower frame rates remain underexplored. We find that pushing existing audio codecs to very low frame rates loses substantial semantic information. We attribute the limitations of low-frame-rate codecs to both insufficient semantic decoupling and insufficient time resolution for capturing transient phonetic details. This paper introduces **FlexiCodec** to address these limitations. FlexiCodec improves semantic preservation with a **dynamic frame rate** approach and introduces a novel architecture featuring **ASR feature-assisted dual-stream** encoding and Transformer bottlenecks.
With dynamic frame rates, it uses fewer frames in information-sparse regions by adaptively merging semantically similar frames.
The dynamic frame rate also allows FlexiCodec to support **controllable frame rates** between 3Hz and 12.5Hz at inference time.
Experiments at average frame rates of **6.25Hz, 8.3Hz, and 12.5Hz** confirm that FlexiCodec outperforms baseline systems in semantic information preservation and delivers high audio reconstruction quality. We also validate the effectiveness of FlexiCodec in language model-based TTS. Demos are available at: https://flexicodec.github.io.
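
To illustrate the general idea of adaptively merging semantically similar adjacent frames, here is a minimal sketch. It is not the paper's actual algorithm: the greedy threshold rule, cosine similarity measure, averaging-based merge, and the function and parameter names are all assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def merge_similar_frames(frames: torch.Tensor, sim_threshold: float = 0.9):
    """Greedily merge adjacent frames whose cosine similarity exceeds a threshold.

    Hypothetical illustration of similarity-based frame merging (not FlexiCodec's
    exact procedure). `frames` is a (T, D) sequence of frame embeddings at the
    base 12.5Hz rate. Returns the merged (T', D) sequence and the group sizes,
    from which the effective average frame rate follows. Lowering the threshold
    merges more aggressively, yielding a lower frame rate.
    """
    merged, group_sizes = [], []
    current = frames[0].clone()  # running sum of the current group
    count = 1                    # number of frames in the current group
    for t in range(1, frames.shape[0]):
        sim = F.cosine_similarity(current / count, frames[t], dim=-1)
        if sim > sim_threshold:
            current = current + frames[t]  # accumulate for averaging
            count += 1
        else:
            merged.append(current / count)
            group_sizes.append(count)
            current = frames[t].clone()
            count = 1
    merged.append(current / count)
    group_sizes.append(count)
    return torch.stack(merged), group_sizes
```

Under this kind of scheme, sweeping the threshold at inference time would trade off sequence length against temporal detail, which is one plausible way a controllable average frame rate (e.g., between 3Hz and 12.5Hz) could be exposed.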
Supplementary Material: zip
Primary Area: generative models
Submission Number: 8557