Abstract: Named Entity Recognition (NER) is a foundational technology for systems designed to process natural language documents. However, many existing state-of-the-art systems are difficult to integrate into commercial settings (due to their monolithic construction, licensing constraints, or need for corpora, for example). In this work, a new NER system is described that uses the output of existing systems over large corpora as its training set, ultimately enabling labelling with (i) better F1 scores; (ii) higher labelling speeds; and (iii) no further dependence on the external software.