Implicit Training of Energy Models for Structured Prediction

Shiv Shankar

Published: 08 May 2023, Last Modified: 26 Jun 2023; UAI 2023; Readers: Everyone

Abstract: Most research in deep learning has focused on developing new models and training procedures. The exploration of training objectives has received considerably less attention and is often limited to combinations of standard losses. When dealing with complex structured outputs, the effectiveness of conventional objectives as proxies for the true objective can be questionable. In this study, we argue that existing inference-network-based methods for structured prediction [Tu and Gimpel, 2018, Tu et al., 2020a] indirectly learn to optimize a dynamic loss objective parameterized by the energy model. Based on this insight, we propose a method that treats the energy network as a trainable loss function and employs an implicit-gradient-based technique to learn the corresponding dynamic objective. We experiment with multiple tasks, such as multi-label classification and entity recognition, and find significant performance improvements over baseline approaches. Our results demonstrate that implicitly learning a dynamic loss landscape is an effective approach for enhancing model performance in structured prediction tasks.

Supplementary Material: pdf

Other Supplementary Material: zip
