Coupling RNN with LLM: Does Their Integration Improve Highly Order-Sensitive Language Understanding?

08 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Coupling RNN and LLM, Highly Order-Sensitive, Sequential Model, LLM, RNN
TL;DR: We comprehensively investigate whether RNN integration improves LLM performance on Highly Order-Sensitive Language Understanding.
Abstract: Pretrained large language models (LLMs) have demonstrated remarkable success across a wide range of language modeling tasks. However, in domain-specific applications, particularly those involving highly order-sensitive data, general-purpose LLMs fall short of state-of-the-art performance. One notable issue is that the contextual embeddings produced by LLMs still lack a strong positional inductive bias, especially for long, highly ordered sequences, leading to the "lost in the middle" problem. In this work, we couple sequential models (RNNs) with LLMs to address this issue and investigate whether the integration improves LLM performance. The LLM generates rich contextual embeddings through the Transformer's attention mechanism; the RNN then processes these embeddings to capture the contextual semantics of long, order-sensitive dependencies. The resulting LLM-RNN model leverages the strengths of both Transformer and recurrent structures to improve performance on domain-specific tasks. We conduct a wide range of experiments spanning multiple LLM types (encoder-only, encoder-decoder, and decoder-only) and RNN variants (GRU, LSTM, BiGRU, and BiLSTM) across diverse public and real-world datasets to assess the effects, positive or negative, of LLM-RNN coupling. The experimental results highlight the superiority of the LLM-RNN model, showing improvements on commonsense reasoning, code understanding, and biomedical reasoning tasks.
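The abstract describes feeding LLM contextual embeddings into an RNN head. A minimal sketch of that coupling in PyTorch is shown below; the layer sizes, the BiGRU choice, and the classification head are illustrative assumptions, and any encoder producing per-token embeddings (e.g. a frozen BERT's last hidden state) could supply the input tensor.

```python
import torch
import torch.nn as nn

class LLMRNNClassifier(nn.Module):
    """Illustrative LLM-RNN coupling: a frozen LLM acts as a feature
    extractor, and a BiGRU re-reads its per-token embeddings in order
    to add a recurrent positional inductive bias before classification.
    Dimensions are assumptions, not the paper's configuration."""

    def __init__(self, llm_dim: int = 768, rnn_dim: int = 256, num_classes: int = 2):
        super().__init__()
        self.rnn = nn.GRU(llm_dim, rnn_dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * rnn_dim, num_classes)

    def forward(self, llm_embeddings: torch.Tensor) -> torch.Tensor:
        # llm_embeddings: (batch, seq_len, llm_dim), e.g. an encoder's
        # last hidden state with the LLM parameters kept frozen.
        _, h_n = self.rnn(llm_embeddings)             # h_n: (2, batch, rnn_dim)
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)  # concat both directions
        return self.head(pooled)                      # (batch, num_classes)

# Stand-in for LLM output: 4 sequences of 128 tokens, 768-dim embeddings.
emb = torch.randn(4, 128, 768)
logits = LLMRNNClassifier()(emb)
print(logits.shape)  # torch.Size([4, 2])
```

Swapping `nn.GRU` for `nn.LSTM`, or setting `bidirectional=False`, yields the other RNN variants (LSTM, GRU, BiLSTM) compared in the experiments.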
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3174