SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning

Anonymous

16 Feb 2024 | ACL ARR 2024 February Blind Submission | Readers: Everyone
Abstract: Elucidating the reasoning process from question to answer with structured explanations is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations require models to perform intricate structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning through supervised learning, ignoring logical dependencies between steps. Moreover, existing reinforcement learning (RL) based methods overlook the structured relationships, underutilizing the potential of RL in structured reasoning. In this paper, we propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation. Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning, effectively capturing the intricate relationships between different reasoning steps. In addition, we introduce a fine-grained reward function to meticulously delineate diverse reasoning steps. Extensive experiments show that SEER significantly outperforms state-of-the-art methods, achieving an absolute improvement of 6.9% over RL-based methods on EntailmentBank and a 4.4% average improvement on the STREET benchmark, while exhibiting outstanding efficiency and cross-dataset generalization performance.
Paper Type: long
Research Area: Question Answering
Contribution Types: Model analysis & interpretability
Languages Studied: English
Preprint Status: There is a non-anonymous preprint (URL specified in the next question).
A1: yes
A1 Elaboration For Yes Or No: (Section) Limitations
A2: yes
A2 Elaboration For Yes Or No: (Section) Ethics Statement
A3: yes
A3 Elaboration For Yes Or No: (Section) Abstract, Introduction
B: no
C: yes
C1: yes
C1 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details
C2: yes
C2 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details
C3: yes
C3 Elaboration For Yes Or No: (Section) 4 Experiments
C4: yes
C4 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details
D: no
E: yes
E1: yes
E1 Elaboration For Yes Or No: (Section) 4 Experiments