LeBenchmark 2.0: A standardized, replicable and enhanced framework for self-supervised representations of French speech

Published: 01 Jan 2024, Last Modified: 20 May 2025. Comput. Speech Lang. 2024. License: CC BY-SA 4.0
Abstract: Highlights
• Open-source framework for assessing self-supervised representations in the French language.
• 14,000 h of heterogeneous speech documented into four datasets.
• 14 pre-trained self-supervised models for French, ranging from 26 to 965 million neural parameters.
• 6 standardized tasks for the evaluation of French self-supervised models.