SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning

Anonymous

16 Feb 2024 | ACL ARR 2024 February Blind Submission | Readers: Everyone

Abstract: Elucidating the reasoning process with structured explanations from question to answer is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations require models to perform intricate structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning through supervised learning, ignoring logical dependencies between steps. Moreover, existing reinforcement learning (RL) based methods overlook the structured relationships, underutilizing the potential of RL in structured reasoning. In this paper, we propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation. Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning, effectively capturing the intricate relationships between different reasoning steps. In addition, we introduce a fine-grained reward function to meticulously delineate diverse reasoning steps. Extensive experiments show that SEER significantly outperforms state-of-the-art methods, achieving an absolute improvement of 6.9% over RL-based methods on EntailmentBank and a 4.4% average improvement on the STREET benchmark, while exhibiting outstanding efficiency and cross-dataset generalization performance.

Paper Type: long

Research Area: Question Answering

Contribution Types: Model analysis & interpretability

Languages Studied: English

Preprint Status: There is a non-anonymous preprint (URL specified in the next question).

A1: yes

A1 Elaboration For Yes Or No: (Section) Limitations

A2: yes

A2 Elaboration For Yes Or No: (Section) Ethics Statement

A3: yes

A3 Elaboration For Yes Or No: (Section) Abstract, Introduction

B: no

C: yes

C1: yes

C1 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details

C2: yes

C2 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details

C3: yes

C3 Elaboration For Yes Or No: (Section) 4 Experiments

C4: yes

C4 Elaboration For Yes Or No: (Section) 4.3 Implementation Details, (Section_Appendix) Implementation Details

D: no

E: yes

E1: yes

E1 Elaboration For Yes Or No: (Section) 4 Experiments
