Abstract: In this paper, we propose an effective multi-stage neural audio coding algorithm that encodes full-band audio signals (up to 20 kHz) using an end-to-end training criterion. By predefining several dyadic subband signals as training targets, we progressively encode input audio signals in each stage such that deeper stages of the network encode the residual error terms from the previous encoding stage. Our proposed audio codec successfully decodes full-band audio signals by using an effective multi-stage vector quantization scheme to represent key encoding features extracted in the latent space. Subjective listening tests show that the decoded outputs of the proposed audio codec achieve almost transparent quality at an average bitrate of 132 kbps.
0 Replies
Loading