Distill, Fuse, Pre-train: Towards Effective Event Causality Identification with Commonsense-Aware Pre-trained Model

Peixin Huang, Xiang Zhao, Minghao Hu, Zhen Tan, Weidong Xiao

Published: 2024, Last Modified: 13 Jan 2026, Venue: LREC/COLING 2024, License: CC BY-SA 4.0

Abstract: Event Causality Identification (ECI) aims to detect causal relations between events in unstructured text. The task is challenged by scarce training data and a lack of explicit causal clues. Some methods address these issues by incorporating explicit knowledge from external knowledge graphs (KGs) into Pre-trained Language Models (PLMs), with some success. However, they ignore that existing KGs usually contain trivial knowledge, which may harm performance. Moreover, they simply integrate concept triplets, underutilizing the deep interaction between the text and the external graph. In this paper, we propose an effective pipeline, DFP (Distill, Fuse and Pre-train), to build a commonsense-aware pre-trained model that integrates reliable task-specific knowledge from commonsense graphs. The pipeline works as follows: (1) To leverage reliable knowledge, commonsense graph distillation distills commonsense graphs to obtain a meta-graph that contains credible task-oriented knowledge. (2) To model the deep interaction between the text and the external graph, heterogeneous information fusion fuses them through a commonsense-aware memory network. (3) To further align and fuse the text and the commonsense meta-graph, continual pre-training introduces three continual pre-training tasks. Extensive experiments on two benchmarks demonstrate the validity of our pipeline.
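
The abstract does not specify how the distillation step selects knowledge, so the following is only a toy sketch of the general idea behind step (1): pruning a commonsense graph down to a smaller meta-graph of task-relevant triples. Everything here is an assumption for illustration and not the authors' method: the `Triple` type, the `distill_meta_graph` function, the relation whitelist, and the rule of keeping only triples that touch an event mention are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set


class Triple(NamedTuple):
    """A (head, relation, tail) edge of a commonsense graph."""
    head: str
    relation: str
    tail: str


# Hypothetical whitelist of relations treated as causally informative.
# The names are illustrative; the paper's actual criteria are not given here.
TASK_RELATIONS: Set[str] = {"Causes", "HasSubevent", "HasPrerequisite"}


def distill_meta_graph(
    triples: Iterable[Triple],
    event_mentions: Iterable[str],
    relations: Set[str] = TASK_RELATIONS,
) -> List[Triple]:
    """Keep triples that use a whitelisted relation and touch an event mention."""
    events = {e.lower() for e in event_mentions}
    return [
        t
        for t in triples
        if t.relation in relations
        and (t.head.lower() in events or t.tail.lower() in events)
    ]


# Toy graph: only the first triple is both task-relevant and tied to an event.
toy_graph = [
    Triple("earthquake", "Causes", "tsunami"),
    Triple("earthquake", "IsA", "natural disaster"),
    Triple("tsunami", "RelatedTo", "wave"),
    Triple("rain", "Causes", "flood"),
]

meta_graph = distill_meta_graph(toy_graph, ["earthquake", "tsunami"])
```

In this sketch, trivial edges such as `IsA` and `RelatedTo` are dropped, as is the causal edge that involves neither event in the text. A real system would likely use a learned or statistical relevance score rather than a fixed whitelist.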