High Parameter Frequency Resolution Encoding Scheme for Spatial Audio Objects Using Stacked Sparse AutoencoderDownload PDFOpen Website

Published: 2022, Last Modified: 12 May 2023Neural Process. Lett. 2022Readers: Everyone
Abstract: Object-based audio systems have become common in recent years as they provide the flexibility for many auditory scenarios, such as virtual reality games, interactive theater, and spatial audio communication. For saving bitrates, multiple audio objects are compressed into a mono downmix signal and side information parameters. However, side information parameter frequency resolution is too low to cause aliasing distortion. To overcome this issue, a new encoding scheme based on high parameter frequency resolution (224 sub-bands in a frame) is proposed in this paper. The side information parameters with high frequency resolution are compressed and reconstructed via SSAE (stacked sparse autoencoder) neural network and further used for recovering the audio objects. The performance of the proposed method is compared against existing SAOC (spatial audio object coding) methods at the same overall bitrate, judged by both objective and subjective results. The evaluation shows that our approach can facilitate the high quality of spatial audio objects.
0 Replies

Loading