SFQ: A Sentence-level Framework for Medical Follow-up Questioning

ACL ARR 2026 January Submission 9023 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: medical-case diagnosis, follow-up questioning, information seeking, sentence-level abstraction, LLM interactive reasoning, reinforcement learning
Abstract: Follow-up questioning is essential for safe medical decision-making but remains poorly evaluated in large language models (LLMs). In real clinical settings, patients provide incomplete information, requiring clinicians to actively elicit missing evidence before diagnosis. However, existing medical LLM benchmarks primarily evaluate diagnosis accuracy under complete or fixed case presentations, making it unclear whether models genuinely seek missing information. We propose Sentence-level Follow-up Questioning (SFQ), a framework that formulates follow-up behavior as an explicit sentence-level recall task. Medical cases are decomposed into atomic clinical sentences, and incomplete cases are constructed by selectively hiding clinically relevant facts. Models must recover these facts through follow-up questions before committing to a diagnosis, enabling fine-grained and interpretable evaluation beyond diagnosis accuracy. Experiments on controlled sentence-abstracted benchmarks and a realistic external follow-up dataset show that recovering missing clinical information is associated with higher diagnostic accuracy. Recall-aware post-training improves both information recovery and diagnostic performance under incomplete inputs, demonstrating that explicitly modeling follow-up questioning leads to safer and more reliable medical reasoning.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Question Answering, AI / LLM Agents, Clinical NLP, Dialogue and Interactive Systems
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 9023