Abstract: The emergence of video captioning makes it possible to automatically generate natural language description for a given video. However, generating detailed video descriptions that incorporate domain-specific information remains an unsolved challenge, holding significant research and application value, particularly in domains such as sports commentary generation. Moreover, sports event commentary goes beyond being a mere game report, it involves entertaining, metaphorical, and emotional descriptions. To promote the field of sports commentary automatic generation, in this paper, we introduce a novel dataset, the Basketball Highlight Commentary (BH-Commentary), comprising approximately 4K basketball highlight videos with ground-truth commentaries from professional commentators. In addition, we propose an end-to-end framework as a benchmark for basketball highlight commentary generation task, in which a lightweight and effective prompt strategy is designed to enhance alignment fusion among visual and textual features. Extensive experiments on the BH-Commentary dataset demonstrate the validity of the dataset and the effectiveness of the proposed benchmark for sports highlight commentary generation. (The dataset is available at https://anonymous.4open.science/r/dataset-DC8E)
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Vision and Language, [Content] Multimodal Fusion
Relevance To Conference: The emergence of video captioning makes it possible to automatically generate natural language description for a given video. While as for sports videos, sports event commentary goes beyond being a mere game report, it involves entertaining, metaphorical, and emotional descriptions. To promote the field of sports commentary automatic generation, in this work, we aim to introduce a novel descriptive basketball highlight dataset for sports highlight commentary generation and propose an effective benchmark model for this task.
Supplementary Material: zip
Submission Number: 2900
Loading