Abstract: Automatic soccer commentary generation aims to bridge the gap between raw visual content and professional, tactical commentary. However, existing datasets often lack semantic richness and detailed scene analysis. They also fail to capture the continuity between events, which results in fragmented, contextually disconnected commentary. To address these issues, we propose two manually curated datasets: SN-Short and SN-Long. SN-Short focuses on enriching the semantic description of scene details, while SN-Long captures event continuity to enable coherent, context-aware commentary. We also introduce SCORE (Soccer Commentary Generation via Contextual Expansion and Information Retrieval), a novel framework designed to address both detailed scene understanding and global context awareness. SCORE employs a commentary expansion pipeline that integrates visual features with sparse annotations to generate detailed scene descriptions, and it uses a retrieval-augmented generation model that incorporates contextual cues from previous events to produce coherent commentary aligned with the visual flow of the game. Experimental results show that SCORE significantly outperforms existing baselines on the proposed datasets.
Paper Type: Long
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, data-to-text generation, text-to-text generation
Languages Studied: English
Submission Number: 4802