Highlights
• We propose a novel pipeline for text classification: Pre-train, Interact, Fine-tune (PIF).
• To the best of our knowledge, ours is the first attempt to model word interactions for text representation.
• We introduce a two-perspective interaction representation for text classification.
• We propose the Hybrid Language Model Pretrain-finetuning.
• We find that our proposal outperforms state-of-the-art methods for text classification in terms of accuracy.

Abstract
Text representation can aid machines in text understanding. Previous work on text representation often focuses on the so-called forward implication, i.e., preceding words are taken as the context of later words when creating representations. Effective as this is, it ignores the fact that the semantics of a text segment is a product of the mutual implication of words in the text: later words also contribute to the meaning of preceding words. To bridge this gap, we introduce the concept of interaction and propose a two-perspective interaction representation, which encapsulates a local and a global interaction representation. Here, a local interaction representation captures interactions among words with parent-children relationships on the syntactic trees, whereas a global interaction representation captures interactions among all the words in a sentence. We combine these two interaction representations to develop a Hybrid Interaction Representation (HIR). Inspired by existing feature-based and fine-tuning-based pretrain-finetuning approaches to language models, we integrate the merits of both to propose the Pre-train, Interact, Fine-tune (PIF) architecture. We evaluate our proposed models on five widely-used datasets for text classification tasks. Our ensemble method, HIRP, outperforms state-of-the-art baselines, with improvements ranging from 2.03% to 3.15% in terms of error rate. In addition, we find that the improvements of PIF over most state-of-the-art methods are not affected by increasing text length.
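The distinction between local and global interaction can be made concrete with a toy sketch. The abstract does not specify how interactions are computed, so the following assumes, purely for illustration, a softmax dot-product attention: the local view lets each word attend to itself, its parent, and its children in the syntactic tree, while the global view lets it attend to every word in the sentence. The function names (`attend`, `tree_neighbors`, `hybrid_interaction`), the concatenation used to combine the two views, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, vectors):
    """Average of `vectors`, weighted by softmax dot-product similarity to `query`."""
    weights = softmax([dot(query, v) for v in vectors])
    return [sum(w * v[d] for w, v in zip(weights, vectors))
            for d in range(len(query))]


def tree_neighbors(parents):
    """Map each word index to the set {itself, its parent, its children}."""
    neighbors = {i: {i} for i in range(len(parents))}
    for child, parent in enumerate(parents):
        if parent is not None:  # None marks the root of the tree
            neighbors[child].add(parent)
            neighbors[parent].add(child)
    return neighbors


def hybrid_interaction(vectors, parents):
    """Concatenate a local (tree-restricted) and a global (all-words) view per word.

    Assumption: concatenation stands in for whatever combination the paper uses.
    """
    neighbors = tree_neighbors(parents)
    hybrid = []
    for i, vec in enumerate(vectors):
        local = attend(vec, [vectors[j] for j in sorted(neighbors[i])])
        global_view = attend(vec, vectors)
        hybrid.append(local + global_view)
    return hybrid


# Toy sentence "dogs chase cats": "chase" is the root, the other words are its children.
vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
parents = [1, None, 1]
reps = hybrid_interaction(vectors, parents)
```

In this toy sentence the root word is adjacent to every other word, so its local and global views coincide; for the two leaf words the local view excludes the sibling and therefore differs from the global view.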