Abstract: Generative commonsense reasoning (GCR) reflects how well an AI system can produce trustworthy outputs that align with real-world commonsense knowledge. Despite growing research efforts toward improved GCR, current approaches still lack robustness and remain constrained by left-to-right, token-by-token generation. In this work, we propose a novel Retrieval-augmented Diffusion Language Model for Generative Commonsense Reasoning (RaDi4GCR). RaDi4GCR not only refines its output gradually through the denoising process, but also improves generation quality by injecting contextually relevant retrieved information, which is especially valuable in low-resource scenarios where relying purely on parametric knowledge falls short. A comprehensive evaluation on the CommonGen benchmark demonstrates that RaDi4GCR significantly outperforms the state-of-the-art baseline (a 9.5% improvement in SPICE) and surpasses multiple cutting-edge LLMs, such as GPT-4o and Llama 3.
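To make the two ideas in the abstract concrete, below is a minimal, purely illustrative sketch of how retrieval conditioning and iterative denoising could be combined. The abstract does not specify RaDi4GCR's architecture, so every name, dimension, and design choice here (the `Denoiser` module, `retrieve_embedding`, the x0-prediction sampler, random placeholder embeddings) is a hypothetical stand-in, not the paper's actual method.

```python
import torch
import torch.nn as nn

# Toy dimensions; all values here are illustrative, not from the paper.
DIM, SEQ_LEN, STEPS = 64, 16, 50

class Denoiser(nn.Module):
    """Stand-in denoising network: predicts the clean latent x0 from a
    noisy latent x_t, conditioned on a pooled retrieval embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(DIM * 2, 128), nn.GELU(), nn.Linear(128, DIM))

    def forward(self, x_t, cond):
        # Concatenate the retrieval condition onto every position.
        return self.net(torch.cat([x_t, cond.expand_as(x_t)], dim=-1))

def retrieve_embedding(corpus_emb, concept_emb):
    """Hypothetical retrieval step: mean-pool the corpus entries most
    similar (by cosine similarity) to the concept-set query."""
    query = concept_emb.mean(dim=0, keepdim=True)            # (1, DIM)
    sims = torch.cosine_similarity(corpus_emb, query, dim=-1)
    topk = sims.topk(3).indices
    return corpus_emb[topk].mean(dim=0, keepdim=True)        # (1, DIM)

@torch.no_grad()
def generate(model, cond):
    """Iterative refinement: start from Gaussian noise and repeatedly
    replace the latent with the model's denoised estimate, re-noised
    at a decreasing level (a simplified x0-prediction sampler)."""
    x_t = torch.randn(SEQ_LEN, DIM)
    for t in reversed(range(1, STEPS + 1)):
        x0_hat = model(x_t, cond)
        noise_level = (t - 1) / STEPS
        x_t = x0_hat + noise_level * torch.randn_like(x0_hat)
    return x0_hat  # final clean latents, to be mapped back to tokens

# Usage: embed the concept set, retrieve supporting context, denoise.
corpus_emb = torch.randn(100, DIM)   # placeholder retrieval corpus
concept_emb = torch.randn(4, DIM)    # placeholder concept embeddings
cond = retrieve_embedding(corpus_emb, concept_emb)
latents = generate(Denoiser(), cond)
print(latents.shape)                 # torch.Size([16, 64])
```

The point of the sketch is the control flow: unlike token-by-token decoding, the whole sequence of latents is revised jointly at every step, and the retrieved context conditions every revision.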
Paper Type: Long
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, model architectures, text-to-text generation, domain adaptation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1763