Segmenting Natural Language Sentences via Lexical Unit Analysis

28 Sept 2020 (modified: 26 May 2025) | ICLR 2021 Conference Blind Submission | Readers: Everyone
Keywords: Neural Sequence Labeling, Neural Sequence Segmentation, Dynamic Programming
Abstract: We present Lexical Unit Analysis (LUA), a framework for general sequence segmentation tasks. Given a natural language sentence, LUA scores all valid segmentation candidates and uses dynamic programming (DP) to extract the maximum-scoring one. LUA has several appealing properties: it inherently guarantees that the predicted segmentation is valid, and it facilitates globally optimal training and inference. In practice, the time complexity of LUA can be reduced to linear time. We conducted extensive experiments on 5 tasks (syntactic chunking, named entity recognition (NER), slot filling, Chinese word segmentation, and Chinese part-of-speech (POS) tagging) across 15 datasets. Our models achieve state-of-the-art performance on 13 of them. The results also show that the F1 score for identifying long segments is notably improved.
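The score-then-search idea in the abstract can be illustrated with a small dynamic program. This is only a sketch under stated assumptions, not the authors' implementation: `best_segmentation`, `span_score`, and `max_len` are hypothetical names, the span scorer is a stand-in for whatever neural model assigns scores, and capping the span length is just one plausible way the running time could become linear in sentence length.

```python
from typing import Callable, List, Optional, Tuple


def best_segmentation(
    n: int,
    span_score: Callable[[int, int], float],
    max_len: Optional[int] = None,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the maximum-scoring split of tokens 0..n-1 into contiguous spans.

    Spans are half-open (start, end) pairs. `span_score` and `max_len` are
    hypothetical names for this sketch, not taken from the paper.
    """
    if max_len is None:
        max_len = n  # no cap: every span length is allowed
    best = [float("-inf")] * (n + 1)  # best[j]: top score for the first j tokens
    back = [0] * (n + 1)              # back[j]: start of the last span in that prefix
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            cand = best[start] + span_score(start, end)
            if cand > best[end]:
                best[end] = cand
                back[end] = start
    # Follow the back-pointers from the end of the sentence to recover the spans.
    spans: List[Tuple[int, int]] = []
    end = n
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    spans.reverse()
    return best[n], spans


# Toy scorer (made up for illustration): reward the span covering "New York"
# and every single-token span; all other spans score zero.
tokens = ["New", "York", "is", "big"]


def toy_score(start: int, end: int) -> float:
    if (start, end) == (0, 2):
        return 3.0
    return 1.0 if end - start == 1 else 0.0


score, spans = best_segmentation(len(tokens), toy_score)
print(score, [tokens[s:e] for s, e in spans])
```

Because the search only ever builds contiguous, non-overlapping spans that cover the whole sentence, every output is a valid segmentation by construction. With a fixed `max_len`, the inner loop runs at most `max_len` times per position, so the total work grows linearly with the number of tokens.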
One Line Summary: This paper introduces a new framework, Lexical Unit Analysis (LUA), for neural sequence segmentation.
Acknowledgement Of Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We propose LUA, a novel framework for neural sequence segmentation, which facilitates globally optimal training and inference.
Community Implementations: [7 code implementations](https://www.catalyzex.com/paper/segmenting-natural-language-sentences-via/code)
Reviewed Version (pdf): https://openreview.net/references/pdf?id=PcU0XZgzh