Information Extraction with Differentiable Beam Search on Graph RNNs

Published: 01 Jan 2024, Last Modified: 18 Jun 2024 | Venue: LREC/COLING 2024 | License: CC BY-SA 4.0
Abstract: Information extraction (IE) from text documents is an important NLP task that includes entity, relation, and event extraction. These tasks are often addressed jointly as a graph generation problem in which entities and event triggers are nodes and relations and event arguments are edges. Most existing systems use local classifiers for nodes and edges, trained with a cross-entropy loss, and rely on inference strategies such as beam search to approximate the optimal graph structure. These approaches typically suffer from exposure bias caused by the discrepancy between training and decoding. In this paper, we tackle this problem by casting graph generation as auto-regressive sequence labeling and by making training aware of the decoding procedure through a differentiable version of beam search. We evaluate our approach through extensive experiments on the ACE05 and CoNLL04 datasets across diverse languages. Our model outperforms its non-decoding-aware version on all datasets used. Ablation studies further confirm the benefit of aligning training and inference. Finally, we introduce a novel quantification of exposure bias in this context, which provides insight into how our model works.
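For readers unfamiliar with the decoding step the abstract refers to, the following is a minimal sketch of standard (non-differentiable) beam search over an auto-regressive label sequence. It is a generic illustration, not the authors' implementation: the function `beam_search`, the toy scorer `toy_log_prob`, and the labels `"A"` and `"B"` are all hypothetical. The hard top-k selection at each step is the kind of operation that a decoding-aware training scheme would presumably need to relax in order to pass gradients through decoding.

```python
import math
from typing import Callable, List, Sequence, Tuple

Prefix = Tuple[str, ...]


def beam_search(
    num_steps: int,
    labels: Sequence[str],
    log_prob: Callable[[Prefix, str], float],
    beam_size: int = 2,
) -> List[Tuple[Prefix, float]]:
    """Return the final beam as (label sequence, log-probability) pairs, best first."""
    beam: List[Tuple[Prefix, float]] = [((), 0.0)]
    for _ in range(num_steps):
        # Expand every kept prefix with every label (auto-regressive scoring).
        candidates = [
            (prefix + (label,), score + log_prob(prefix, label))
            for prefix, score in beam
            for label in labels
        ]
        # Hard top-k selection: this step is not differentiable.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = candidates[:beam_size]
    return beam


# Hypothetical toy scorer: the label distribution depends on the previous label.
TOY_PROBS = {
    None: {"A": 0.6, "B": 0.4},   # first position
    "A": {"A": 0.55, "B": 0.45},  # after label A
    "B": {"A": 0.9, "B": 0.1},    # after label B
}


def toy_log_prob(prefix: Prefix, label: str) -> float:
    previous = prefix[-1] if prefix else None
    return math.log(TOY_PROBS[previous][label])


LABELS = ("A", "B")
greedy_best = beam_search(2, LABELS, toy_log_prob, beam_size=1)[0]
beam_best = beam_search(2, LABELS, toy_log_prob, beam_size=2)[0]
print("greedy:", greedy_best[0], round(math.exp(greedy_best[1]), 2))
print("beam-2:", beam_best[0], round(math.exp(beam_best[1]), 2))
```

In this toy setting, greedy decoding (beam size 1) commits to the locally best first label and ends with a lower-probability sequence than beam size 2, which illustrates why local classifiers are usually paired with a search procedure at inference time.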