An Alignment-based Approach to Text Segmentation Similarity Scoring

20 Oct 2022 | OpenReview Archive Direct Upload | Readers: Everyone
Abstract: Text segmentation is a natural language processing task with popular applications, such as topic segmentation, element discourse extraction, and sentence tokenization. Much work has been done to develop accurate segmentation similarity metrics, but even the most advanced metrics used today, B and WindowDiff, exhibit incorrect behavior because they evaluate boundaries in isolation. In this paper, we present a new segment-alignment-based approach to segmentation similarity scoring and a new similarity metric, A. We show that A does not exhibit the erratic behavior of B and WindowDiff, quantify through simulation how likely B and WindowDiff are to misbehave, and discuss the versatility of alignment-based approaches for segmentation similarity scoring. We make our implementation of A publicly available in the hope that it will encourage the community to explore more sophisticated approaches to text segmentation similarity scoring.
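
For readers unfamiliar with WindowDiff, the following is a minimal sketch of the metric as it is commonly defined: a window of fixed width slides over both segmentations, and each window in which the two boundary counts differ is penalized. This is not the paper's code and it does not implement A or B. The function name `window_diff`, the 0/1 encoding (one entry per gap between adjacent text units, 1 meaning a boundary), and the example values are assumptions made for illustration.

```python
from typing import Sequence


def window_diff(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """Illustrative WindowDiff: fraction of width-k windows whose boundary counts differ.

    Assumed encoding: each sequence has one entry per gap between adjacent
    text units, with 1 marking a boundary and 0 marking no boundary.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of gaps")
    n = len(reference)
    if not 1 <= k <= n:
        raise ValueError("window width k must satisfy 1 <= k <= number of gaps")

    windows = n - k + 1  # number of window positions over the gap sequence
    disagreements = 0
    for start in range(windows):
        ref_count = sum(reference[start:start + k])
        hyp_count = sum(hypothesis[start:start + k])
        # Only the boundary counts are compared; segments are never aligned.
        if ref_count != hyp_count:
            disagreements += 1
    return disagreements / windows


# Hypothetical example: the first boundary is placed one gap too late.
reference = [0, 0, 1, 0, 0, 0, 1, 0]
hypothesis = [0, 0, 0, 1, 0, 0, 1, 0]
score = window_diff(reference, hypothesis, k=2)
```

A score of 0 means every window agrees, and higher scores mean more disagreement. Because each window compares only boundary counts, the score never considers which segments correspond to one another, which is the kind of boundary-level evaluation the abstract contrasts with an alignment-based approach.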