A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

Anonymous

A Rationale-centric Counterfactual Data Augmentation Method for Cross-Document Event Coreference Resolution

Anonymous

16 Dec 2023ACL ARR 2023 December Blind SubmissionReaders: Everyone

TL;DR: A Rationale-centric Counterfactual Data Augmentation Method with LLM in the loop to address Cross-Document Event Coreference Resolution.

Abstract: Based on Pre-trained Language Models (PLMs), event coreference resolution (ECR) systems have demonstrated outstanding performance in clustering coreferential events across documents. However, the state-of-the-art system exhibits an excessive reliance on the `trigger lexical matching' spurious pattern in the input mention pair text. We formalize the decision-making process of the baseline ECR system using a Structural Causal Model (SCM), aiming to identify spurious and causal associations (i.e., rationales) within the ECR task. Leveraging the debiasing capability of counterfactual data augmentation, we developed a rationale-centric counterfactual data augmentation method with LLM-in-the-loop. This method is specialized for pairwise input in the ECR system, where we conduct direct interventions on triggers and context to mitigate spurious associations while emphasizing causation. Our approach achieves state-of-the-art performance on three popular cross-document ECR benchmarks and demonstrates robustness in out-of-domain scenarios.

Paper Type: long

Research Area: Information Extraction

Contribution Types: Model analysis & interpretability, NLP engineering experiment

Languages Studied: English

0 Replies

Loading