Crossroads, Buildings and Neighborhoods: A Dataset for Fine-grained Location Recognition

Anonymous

08 Mar 2022 (modified: 05 May 2023)
NAACL 2022 Conference Blind Submission
Readers: Everyone
Paper Link: https://openreview.net/forum?id=4sjBbrHJtKj
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: General-domain Named Entity Recognition (NER) datasets like CoNLL-2003 mostly annotate coarse-grained location entities such as a country or a city. However, many applications require identifying fine-grained locations in text and mapping them precisely to geographic sites, e.g., a crossroad, an apartment building, or a grocery store. In this paper, we introduce a new dataset, HarveyNER, with fine-grained locations annotated in tweets. This dataset presents unique challenges and is characterized by many complex and long location mentions in informal descriptions. We built strong baseline models using Curriculum Learning and experimented with different heuristic curricula to better recognize difficult location mentions. Experimental results show that the simple curricula can improve the system's performance on hard cases as well as its overall performance, and outperform several other baseline systems. The dataset and the baseline models can be found at https://github.com/brickee/HarveyNER.
Dataset: zip
Presentation Mode: This paper will be presented in person in Seattle
Virtual Presentation Timezone: UTC-5
Copyright Consent Signature (type Name Or NA If Not Transferrable): Pei Chen
Copyright Consent Name And Address: Texas A&M University, College Station, TX 77842