TALL: Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages

ACL ARR 2025 February Submission 2049 Authors

14 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) excel in high-resource languages but struggle with low-resource languages due to limited training data and linguistic diversity. This paper presents TALL (Trainable Architecture for Enhancing LLM Performance in Low-Resource Languages), a novel approach designed to bridge this gap. The key innovation of TALL lies in its integration of three pre-trained models: a high-resource LLM and two bilingual translation models. By transforming inputs from low-resource languages into high-resource language representations, TALL leverages the robust reasoning capabilities of the LLM. Subsequently, it refines the output through dimensional alignment layers and custom transformers, enabling accurate decoding into the target low-resource language. We validate TALL through experiments on Hebrew, chosen for its rich morphology, complex syntax, and limited annotated datasets. To ensure a realistic evaluation, the experiments use models with different levels of exposure to Hebrew. Results demonstrate significant improvements in accuracy, showcasing the effectiveness of TALL's modular design and trainable alignment layers. This architecture offers a scalable and adaptable framework for cross-lingual transfer and improved processing in low-resource language settings, with potential applicability to a wide range of languages and NLP tasks. Furthermore, TALL employs a parameter-efficient training strategy, freezing pre-trained components while training only lightweight modules, thus balancing computational efficiency with performance gains.
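
The abstract describes the architecture only at a high level. As an illustration of the described composition, the following minimal PyTorch sketch shows one way such a pipeline could be wired: frozen source-to-pivot translator, frozen high-resource LLM, and frozen pivot-to-target translator, glued together by trainable dimensional alignment layers and a small custom transformer. All names and hidden sizes here (TALLSketch, DimensionalAlignment, d_translator, d_llm) are assumptions for illustration, not the authors' released code.

```python
# Illustrative sketch only: module names, dimensions, and wiring are inferred
# from the abstract, not taken from the paper's implementation.
import torch
import torch.nn as nn


class DimensionalAlignment(nn.Module):
    """Trainable linear projection between the hidden sizes of two frozen components."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class TALLSketch(nn.Module):
    """Hypothetical wiring of the three frozen pre-trained parts with trainable glue."""

    def __init__(self, translator_in: nn.Module, llm: nn.Module, translator_out: nn.Module,
                 d_translator: int = 512, d_llm: int = 1024):
        super().__init__()
        self.translator_in = translator_in    # frozen: low-resource -> high-resource encoder
        self.llm = llm                        # frozen: high-resource LLM backbone
        self.translator_out = translator_out  # frozen: high-resource -> low-resource decoder

        # Only these lightweight modules receive gradient updates.
        self.align_in = DimensionalAlignment(d_translator, d_llm)
        self.align_out = DimensionalAlignment(d_llm, d_translator)
        self.custom_transformer = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d_llm, nhead=8, batch_first=True),
            num_layers=2,
        )

        # Parameter-efficient training: freeze every pre-trained component.
        for frozen in (self.translator_in, self.llm, self.translator_out):
            for p in frozen.parameters():
                p.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: embedded low-resource input (the Identity stand-ins below assume it is
        # already a (batch, seq_len, d_translator) tensor).
        h = self.translator_in(x)       # frozen source-side encoding
        h = self.align_in(h)            # project into the LLM's hidden size
        h = self.llm(h)                 # frozen high-resource reasoning
        h = self.custom_transformer(h)  # trainable refinement
        h = self.align_out(h)           # project back to the translator's hidden size
        return self.translator_out(h)   # frozen decoding toward the target language


# Toy usage with stand-in modules; a real run would plug in pre-trained models.
model = TALLSketch(nn.Identity(), nn.Linear(1024, 1024), nn.Identity())
out = model(torch.randn(2, 16, 512))
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(out.shape, trainable)
```

The final print makes the parameter-efficiency claim concrete: under this sketch, only the alignment projections and the custom transformer contribute trainable parameters, while the pre-trained components stay frozen.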
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: parameter-efficient-training, NLP in resource-constrained settings, multilingualism, multilingual MT, model architectures, less-resourced languages, cross-lingual transfer, multilingual representations, multilingual pre-training
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: Hebrew, English
Submission Number: 2049