Are Language Models Robust Coreference Resolvers?

Nghia T. Le; Alan Ritter

Published: 10 Jul 2024, Last Modified: 26 Aug 2024

Venue: COLM

License: CC BY 4.0

Research Area: Alignment, Evaluation, Human mind, brain, philosophy, laws and LMs, LMs and the world

Keywords: Large Language Models, Coreference Resolution, CoNLL 2012, Robustness

TL;DR: Prompting LMs for coreference resolution outperforms unsupervised coreference systems and generalizes well across domains, languages, and time periods without additional training data, but trails continued fine-tuning of neural models.

Abstract: Recent work on extending coreference resolution across domains and languages relies on annotated data in both the target domain and language. At the same time, pre-trained large language models (LMs) have been reported to exhibit strong zero- and few-shot learning abilities across a wide range of NLP tasks. However, prior work has mostly studied this ability using artificial sentence-level datasets such as the Winograd Schema Challenge. In this paper, we assess the feasibility of prompt-based coreference resolution by evaluating instruction-tuned language models on difficult, linguistically complex coreference benchmarks (e.g., CoNLL-2012). We show that prompting for coreference can outperform current unsupervised coreference systems, although this approach appears to rely on high-quality mention detectors. Further investigations reveal that instruction-tuned LMs generalize surprisingly well across domains, languages, and time periods; yet continued fine-tuning of neural models should still be preferred if a small number of annotated examples is available.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html

Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html

Submission Number: 558
