Keywords: embedding fusion, layer-aware representation, text classification, LLM
Abstract: Embedding fusion has emerged as a powerful technique for enhancing performance across various NLP tasks. While prior research suggests that different layers of language models encode distinct representations and that pooling strategies influence performance, the practical significance of these differences and the impact of combining embeddings from multiple models have not been systematically analyzed. This study provides a rigorous evaluation of layer-wise fusion strategies to determine their actual contribution to classification performance. Our findings reveal that the effectiveness of individual layers depends more on dataset characteristics than on the model architecture itself. Furthermore, we demonstrate that fusing embeddings from multiple models yields more robust and consistent representations across tasks, with the influence of any single model diminishing as the number of integrated models increases. Notably, experiments on low-resource datasets show that embedding fusion yields particularly large gains when training data is scarce, underscoring its robustness and adaptability in data-constrained settings.
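The abstract does not specify the models, layers, or fusion operator used. As a rough illustration of the kind of multi-model embedding fusion it describes, the sketch below mean-pools one hidden layer from each of two off-the-shelf encoders and concatenates the results; the model names, the layer choice, and concatenation as the fusion operator are all assumptions made here for illustration, not the paper's implementation.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical model choices; the paper's actual models and
# layer selections are not given in this abstract.
MODEL_NAMES = ["bert-base-uncased", "roberta-base"]

def layer_embedding(model, tokenizer, text, layer=-1):
    """Mean-pool token states from one hidden layer of one model."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs, output_hidden_states=True)
    hidden = outputs.hidden_states[layer]          # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)  # (1, seq_len, 1)
    return (hidden * mask).sum(1) / mask.sum(1)    # (1, dim)

def fused_embedding(text, layer=-1):
    """Concatenate per-model layer embeddings into one fused vector."""
    parts = []
    for name in MODEL_NAMES:
        tok = AutoTokenizer.from_pretrained(name)
        mdl = AutoModel.from_pretrained(name)
        parts.append(layer_embedding(mdl, tok, text, layer))
    return torch.cat(parts, dim=-1)  # (1, sum of per-model dims)

vec = fused_embedding("Embedding fusion for text classification.")
print(vec.shape)  # e.g. torch.Size([1, 1536]) for two 768-dim encoders
```

The fused vector would then feed a downstream classifier; with more models, each encoder contributes a smaller fraction of the concatenated representation, which is consistent with the abstract's observation that any single model's influence diminishes as more models are integrated.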
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: generative models, representation learning, word embedding
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 6413