Establishing degrees of closeness between audio recordings along different dimensions using large-scale cross-lingual models

Anonymous

16 Oct 2023
ACL ARR 2023 October Blind Submission
Readers: Everyone
Abstract: In the highly constrained context of low-resource language studies, we propose a new unsupervised method that applies ABX tests to audio recordings with carefully curated metadata in order to shed light on the type of information present in the representations computed by a multilingual speech model. An ABX test determines whether these representations encode a given characteristic. Two experiments are devised: one on acoustic aspects, specifically room acoustic characteristics, and one on phonetic aspects. The results confirm that representations extracted from recordings that differ in linguistic or extra-linguistic characteristics differ along the same lines. Embedding more audio signal in one vector better discriminates extra-linguistic characteristics, whereas shorter snippets are better at distinguishing segmental information. Because the method is fully unsupervised, it potentially opens new research avenues for comparative work on under-documented languages.
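As an illustration of the ABX principle mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that each recording has already been reduced to one fixed-size vector, that cosine distance is the dissimilarity measure, and that ties count as half a success; the function names, the labels `room1`/`room2`, and the toy vectors are all hypothetical. For every triplet in which A and X share a label and B does not, the test checks whether X lies closer to A than to B. A score near 1 suggests that the representation encodes the characteristic, while a score near 0.5 suggests that it does not.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def abx_score(items):
    """Fraction of (A, B, X) triplets in which X is closer to A than to B.

    `items` is a list of (label, vector) pairs. A and X are distinct items
    with the same label; B is any item with a different label.
    Ties are counted as half a success (an assumption of this sketch).
    """
    correct = 0.0
    total = 0
    for (label_a, vec_a), (label_x, vec_x) in permutations(items, 2):
        if label_a != label_x:
            continue  # A and X must share the characteristic under test
        for label_b, vec_b in items:
            if label_b == label_a:
                continue  # B must differ in the characteristic under test
            d_ax = cosine_distance(vec_a, vec_x)
            d_bx = cosine_distance(vec_b, vec_x)
            if d_ax < d_bx:
                correct += 1.0
            elif d_ax == d_bx:
                correct += 0.5
            total += 1
    if total == 0:
        raise ValueError("need at least two labels and two items per label")
    return correct / total


# Hypothetical toy data: two recordings per (made-up) room label.
toy = [
    ("room1", [1.0, 0.0]),
    ("room1", [0.9, 0.1]),
    ("room2", [0.0, 1.0]),
    ("room2", [0.1, 0.9]),
]
toy_score = abx_score(toy)
```

In this toy case the two labels are well separated, so every triplet is answered correctly. With real model representations, the same score computed on different characteristics (for example room acoustics versus a phonetic contrast) indicates which of them the vectors capture.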
Paper Type: short
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings
Languages Studied: Yongning Na (nru), Lataddi Na (nru)
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.