Listening to Formulas: Pioneering Models and Datasets for Converting Speech to LaTeX Equations

22 Sept 2024 (modified: 28 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: speech recognition, LLM, LaTeX, speech to text, ASR, STT
Abstract: Recognizing spoken mathematical expressions is a challenging task that involves transcribing speech into a strictly structured symbolic representation while addressing the ambiguity inherent in the pronunciation of equations. Although significant progress has been achieved in both automatic speech recognition (ASR) and language models (LMs), the specific problem of translating spoken formulas into LaTeX has received relatively little attention. This task is particularly important in educational and research domains, for example, for lecture transcription. To address this gap, we present a pioneering study on Speech-to-LaTeX conversion, introducing a novel, diverse dataset of human-uttered equations in English and Russian, comprising 16,000 distinct spoken equations (10,000 in English and 6,000 in Russian) from three different speakers. Our approaches, which incorporate ASR post-correction and multi-modal language models, demonstrate notable performance, with a character error rate (CER) of up to 25%.
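The abstract reports results in terms of character error rate (CER) on predicted versus reference LaTeX strings. As a point of reference for how that metric is typically computed, the following is a minimal Python sketch of a standard Levenshtein-based CER; it is a generic illustration of the metric, not the authors' evaluation code, and the example formula is hypothetical.

```python
# Minimal sketch: character error rate (CER) between a reference LaTeX string
# and a hypothesis, computed as Levenshtein edit distance over characters
# normalized by the reference length. Illustrative only; not the paper's code.

def cer(reference: str, hypothesis: str) -> float:
    """Edit distance between character sequences divided by reference length."""
    m, n = len(reference), len(hypothesis)
    # prev[j] holds the edit distance between reference[:i-1] and hypothesis[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[n] / max(m, 1)

# Example: a spoken formula transcribed with one wrong character.
print(cer(r"\frac{a}{b}", r"\frac{a}{c}"))  # -> ~0.09 (1 substitution over 11 chars)
```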
Supplementary Material: zip
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2684