Abstract: The fixed context size of the Transformer makes it challenging for GPT models to generate coherent long text. In this paper, we introduce RecurrentGPT, a language-based simulacrum of the recurrence mechanism in RNNs, for text generation.
RecurrentGPT is built upon a large language model (LLM) such as ChatGPT and uses natural language to simulate the recurrent computation mechanism of RNNs, enabling the generation of arbitrarily long texts. At each timestep, RecurrentGPT uses prompting to generate a paragraph of text and to update its language-based long short-term memory. This recurrent prompting mechanism enables RecurrentGPT to generate texts of arbitrary length without needing to fit the entire text into the context window. Since human users can easily observe and edit the natural-language memories, RecurrentGPT is naturally interpretable and enables interactive generation of long text. RecurrentGPT is an initial step towards next-generation computer-assisted writing systems that go beyond local editing suggestions. Our experiments show that RecurrentGPT generates long texts of better quality and coherence than other long-text generation strategies.
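To make the mechanism concrete, below is a minimal sketch of such a recurrent prompting loop. The `chat` helper, the prompt wording, and the memory format are illustrative assumptions for this sketch, not the paper's actual prompts or implementation.

```python
# Minimal sketch of a recurrent prompting loop in the style described above.
# `chat` is a hypothetical wrapper around an LLM chat-completion API (e.g.
# ChatGPT); the prompt text and memory layout are illustrative assumptions.

def chat(prompt: str) -> str:
    """Placeholder for a call to an LLM chat-completion API."""
    raise NotImplementedError

def recurrent_generate(premise: str, steps: int) -> str:
    short_term_memory = premise          # rolling summary of recent paragraphs
    long_term_memory: list[str] = []     # key plot points, accumulated over time
    story: list[str] = []

    for _ in range(steps):
        # Include only a few long-term entries so the prompt stays within the
        # model's fixed context window (the paper retrieves relevant entries;
        # taking the most recent ones is a simplification here).
        recalled = "\n".join(long_term_memory[-3:])
        prompt = (
            f"Long-term memory (key events so far):\n{recalled}\n\n"
            f"Short-term memory (recent summary):\n{short_term_memory}\n\n"
            "Write the next paragraph of the story, then on new lines output:\n"
            "SUMMARY: an updated short summary of recent events\n"
            "MEMORY: one key fact from the new paragraph to remember"
        )
        response = chat(prompt)

        # Parse the structured response; a real system would be more robust.
        paragraph, _, rest = response.partition("SUMMARY:")
        summary, _, memory = rest.partition("MEMORY:")

        story.append(paragraph.strip())
        short_term_memory = summary.strip()       # overwrite short-term memory
        long_term_memory.append(memory.strip())   # grow long-term memory

    return "\n\n".join(story)
```

Because both memories are plain natural language, a human (or another program) can inspect and edit `short_term_memory` and `long_term_memory` between steps, which is what makes the generation process interpretable and interactive.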
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Yu_Meng1
Submission Number: 4739