Abstract: Radiology education requires trainees to develop both perceptual and interpretive expertise. However, refinement of these skills is often impeded by the limited availability of mentorship, a consequence of the demanding schedules of experienced radiologists. This lack of personalized guidance makes it difficult for learners to recognize the mistakes they make, understand why those errors occurred, and learn how to refine their perceptual processes. Many of these errors arise from subtle differences in visual attention, such as failing to fixate on an abnormality, allocating insufficient fixation time, or overlooking an abnormality despite scanning the correct region. Although Large Language Models (LLMs) and Large Multimodal Models (LMMs) have been explored for radiology tasks, they often struggle to detect such fine-grained multimodal variations, particularly when comparing gaze behavior between experts and trainees. To address these limitations, we introduce Structural Chain of Thoughts (SCoT), a novel framework that enhances the sensitivity of LLMs and LMMs to nuanced multimodal differences by structuring gaze data and radiology reports into a thought graph. By leveraging this structural prior, SCoT systematically identifies key perceptual and interpretive discrepancies, allowing models to provide targeted, context-aware feedback. This structured approach not only highlights missed findings but also explains the reasoning behind perceptual errors, turning them into learning opportunities. Applied within radiology education, SCoT bridges the gap between expert and novice performance, offering a scalable solution for AI-driven diagnostic training. We further contribute a simulated dataset of perceptual errors in chest X-ray (CXR) interpretation, facilitating future research into multimodal reasoning and AI-driven medical education. Unlike conventional Chain-of-Thought approaches, SCoT explicitly integrates gaze and textual information into a structured reasoning process, yielding interpretable, fine-grained, and personalized feedback tailored to the unique needs of radiology training. The code and data will be available here: GitHub Repository.
DOI: 10.1016/j.knosys.2025.114433
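To make the thought-graph idea concrete, the following is a minimal sketch, assuming gaze fixations carry anatomical region labels and report sentences can be aligned to the same regions. The names (`Fixation`, `build_thought_graph`, `compare`) and the dwell-time threshold are illustrative assumptions, not the paper's implementation; the sketch only shows how the three perceptual error types named in the abstract fall out of comparing an expert's graph against a trainee's graph.

```python
# Hypothetical sketch of a "thought graph": fixations and report sentences
# become region-keyed nodes, and comparing expert vs. trainee graphs surfaces
# search errors, insufficient dwell, and recognition (looked-but-missed) errors.
from dataclasses import dataclass


@dataclass
class Fixation:
    region: str        # anatomical region label, e.g. "left lower lobe"
    duration_ms: int   # dwell time on that region, in milliseconds


def build_thought_graph(fixations, report):
    """Group fixations and report sentences by region (one node per region)."""
    graph = {}
    for fx in fixations:
        node = graph.setdefault(fx.region, {"dwell_ms": 0, "sentences": []})
        node["dwell_ms"] += fx.duration_ms
    for sentence, region in report:  # report: list of (sentence, region) pairs
        node = graph.setdefault(region, {"dwell_ms": 0, "sentences": []})
        node["sentences"].append(sentence)
    return graph


def compare(expert, trainee, min_dwell_ms=500):
    """Yield feedback for the three error types described in the abstract."""
    for region, exp_node in expert.items():
        if not exp_node["sentences"]:
            continue  # the expert reported no finding in this region
        tr = trainee.get(region, {"dwell_ms": 0, "sentences": []})
        if tr["dwell_ms"] == 0:
            yield f"{region}: never fixated (search error)"
        elif tr["dwell_ms"] < min_dwell_ms:
            yield f"{region}: dwell {tr['dwell_ms']} ms below threshold"
        elif not tr["sentences"]:
            yield f"{region}: scanned but finding not reported (recognition error)"


expert = build_thought_graph(
    [Fixation("left lower lobe", 900)],
    [("Consolidation in the left lower lobe.", "left lower lobe")],
)
trainee = build_thought_graph([Fixation("left lower lobe", 200)], [])
for feedback in compare(expert, trainee):
    print(feedback)  # -> left lower lobe: dwell 200 ms below threshold
```

In this toy comparison the trainee did scan the correct region but dwelt on it too briefly, so the graph comparison attributes the miss to insufficient fixation time rather than a search failure; the paper's actual framework would feed such discrepancies to an LLM/LMM to generate the targeted feedback described above.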