A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks

Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher

Nov 04, 2016 (modified: Nov 19, 2016) · ICLR 2017 conference submission · Readers: everyone
  • Abstract: Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce such a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. All layers include shortcut connections to both word representations and lower-level task predictions. We use a simple regularization term that allows all model weights to be optimized for one task's loss without causing catastrophic interference with the other tasks. Our single end-to-end trainable model obtains state-of-the-art results on chunking, dependency parsing, semantic relatedness, and textual entailment. It also performs competitively on POS tagging. Our dependency parsing layer relies only on a single feed-forward pass and does not require beam search.
  • TL;DR: A single deep multi-task learning model for five different NLP tasks.
  • Keywords: Natural language processing, Deep learning
  • Conflicts: logos.t.u-tokyo.ac.jp, salesforce.com
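
The abstract above does not spell out the regularization term, so here is a minimal, hypothetical sketch (not the authors' code) of how such a "successive regularization" penalty could look in PyTorch: while training a higher-level task, the shared lower-layer weights are pulled toward a snapshot saved after the previous task's training, which is one way to avoid catastrophic interference. The names `delta`, `lower`, and `snapshot` are illustrative assumptions, not identifiers from the paper.

```python
import torch

def successive_regularization(shared_params, snapshot_params, delta=1e-2):
    """Sketch of a penalty delta * ||theta - theta'||^2 over shared lower-layer
    parameters, discouraging drift away from the post-previous-task snapshot."""
    penalty = torch.zeros(())
    for p, p_prev in zip(shared_params, snapshot_params):
        penalty = penalty + (p - p_prev.detach()).pow(2).sum()
    return delta * penalty

# Toy usage: snapshot a shared lower layer after one task finishes, then add
# the penalty to the next task's loss during its training.
lower = torch.nn.Linear(4, 4)
snapshot = [p.clone().detach() for p in lower.parameters()]
# ... train the higher-level task here; its loss would become:
# loss = task_loss + successive_regularization(lower.parameters(), snapshot)
print(float(successive_regularization(lower.parameters(), snapshot)))  # 0.0 until the weights drift
```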
