Abstract: Highlights
• Introducing a new Korean dialogue state tracking dataset in football broadcasts.
• Evaluating current large language models on this dataset with our proposed metric.
• Results show models struggle with long utterances, suggesting areas for improvement.