BERT for Sequence-to-Sequence Multi-Label Text Classification

Ramil Yarullin; Pavel Serdyukov

25 Sept 2019 (modified: 05 May 2023), ICLR 2020 Conference Withdrawn Submission, Readers: Everyone

TL;DR: On using BERT as an encoder for sequential prediction of labels in the multi-label text classification task.

Abstract: We study the BERT language representation model and a sequence generation model with a BERT encoder for the multi-label text classification task. We experiment with both models and explore their particular properties in this setting. We also introduce and experimentally examine a mixed model, which is an ensemble of the multi-label BERT and sequence-generating BERT models. Our experiments demonstrate that BERT-based models, and the mixed model in particular, outperform current baselines on several metrics, achieving state-of-the-art results on three well-studied multi-label classification datasets with English texts and on two private Yandex Taxi datasets with Russian texts.

Keywords: Multi-Label Text Classification, Sequence-to-Sequence Learning, BERT, Sequence Generation, Hierarchical Text Classification
