Abstract: Text Segmentation (TS) involves dividing text into coherent sections, typically defined by topic. Over the past decade, much research has focused on supervised techniques for TS, leaving unsupervised techniques comparatively underdeveloped. With the advent of Large Language Models (LLMs) and their growing accessibility, unsupervised TS stands to benefit. By leveraging an LLM's strong understanding of natural language, prompting it appropriately, and supplying relevant context, we show that even locally run, open-source LLMs can achieve state-of-the-art unsupervised TS results as measured by Pk and WindowDiff scores.
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: Natural Language Processing, Text Segmentation, LLMs, Natural Language Understanding, Unsupervised Text Segmentation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 1184