Probing Discourse Structure in Dialogue: Evaluating and Fine-Tuning RoBERTa and BART on Sentence Ordering and Next Sentence Prediction
Keywords: Discourse Structure, Dialogue Coherence, Probing Tasks, Sentence Ordering, Next Sentence Prediction, Fine-Tuning, RoBERTa, BART, Transformer Models
Abstract: This study investigates how large language models capture discourse structure in dialogue through two probing tasks: sentence ordering and next sentence prediction (NSP). Using the STAC corpus of multi-party conversational data, RoBERTa and BART are evaluated in both pre-trained and fine-tuned settings. Fine-tuning yields substantial gains across both tasks, with BART's generative architecture proving more effective for sentence ordering (ρ=0.473) while RoBERTa excels at NSP classification (accuracy=0.697). Focusing on four core discourse relations, the analysis finds that models handle frequent, surface-cued relations (Question-Answer Pairs, Comment) effectively but struggle with relations requiring deeper semantic dependencies (Continuation, Elaboration). These findings highlight both the capabilities and limitations of current models in capturing discourse-level coherence in dialogue.
Paper Type: Short
Research Area: Discourse, Pragmatics, and Reasoning
Research Area Keywords: coherence, discourse relations, dialogue, conversation
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 965