Rethinking Language Modeling via Path Decomposition and Selection

ACL ARR 2024 December Submission1793 Authors

16 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Recent generative language models assume a pre-defined, monotonic left-to-right sequence decomposition format for learning. This format has proven highly effective in today's well-known decoder-only autoregressive large language models, but it may be inefficient for learning many specific tasks such as reasoning. In this paper, we explore the potential of other feasible decomposition formats to effectively complement the autoregressive language modeling paradigm. Specifically, we aim to find an appropriate decomposition among multiple candidates by introducing effective path selection in both training and decoding. Experiments on a total of \textbf{11} zero-shot reasoning tasks and \textbf{2} language generation tasks demonstrate the effectiveness of our methods, indicating that decomposition formats more suitable than a left-to-right order do exist, and that superior performance can be achieved by simply selecting and optimizing the decoding paths.
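
As a purely illustrative sketch of the path-selection idea described in the abstract (not the authors' implementation; the scorer `toy_path_log_prob` and the candidate orders below are hypothetical stand-ins): enumerate a few candidate token-generation orders, score each under a model, and decode along the highest-scoring path.

```python
# Minimal sketch of decoding-path selection over candidate decomposition orders.
# `toy_path_log_prob` is a hypothetical placeholder for a model that can score
# a target sequence generated in an arbitrary token order.

import random
from typing import Callable, List, Sequence


def toy_path_log_prob(tokens: Sequence[str], order: Sequence[int]) -> float:
    """Hypothetical scorer: a deterministic pseudo log-probability for
    generating `tokens` following the positions listed in `order`."""
    rng = random.Random(hash((tuple(tokens), tuple(order))) & 0xFFFFFFFF)
    return -sum(rng.uniform(1.0, 5.0) for _ in order)


def candidate_orders(n: int) -> List[List[int]]:
    """A few candidate decomposition paths: left-to-right, right-to-left,
    and an 'ends-first' order as an illustrative alternative."""
    left_to_right = list(range(n))
    right_to_left = list(reversed(left_to_right))
    ends_first, lo, hi = [], 0, n - 1
    while lo <= hi:
        ends_first.append(lo)
        if lo != hi:
            ends_first.append(hi)
        lo, hi = lo + 1, hi - 1
    return [left_to_right, right_to_left, ends_first]


def select_path(tokens: Sequence[str],
                scorer: Callable[[Sequence[str], Sequence[int]], float]) -> List[int]:
    """Pick the candidate generation order with the highest model score."""
    return max(candidate_orders(len(tokens)), key=lambda order: scorer(tokens, order))


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    print("selected generation order:", select_path(sentence, toy_path_log_prob))
```

In practice the scorer would be a learned model's likelihood under a given decomposition, and selection could be applied both when constructing training targets and at decoding time, as the abstract suggests.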
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Language Modeling, Generation
Contribution Types: Model analysis & interpretability, Reproduction study, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 1793
