ETC$^2$: Near-Attention Ensemble of Term Classification for Effective and Efficient Text Classification

Anonymous

16 Feb 2024
ACL ARR 2024 February Blind Submission
Readers: Everyone
Abstract: Sequence models, particularly those leveraging transformer architectures, have dominated the Automatic Text Classification (ATC) field in recent years. These models represent words as dense contextual vectors that compose the document's (dense) representation. Though effective, these models are expensive both to train (fine-tune) and at inference (prediction) time. Traditional bag-of-words approaches that directly represent a document as a single sparse vector are usually much more efficient, but they are not as effective as sequence models. Both types of model commonly construct a representation of the entire document before predicting its class, overlooking the importance of individual word (co-)occurrences for the target task. This paper takes a completely different approach to the ATC task by promoting words to ``first-class'' citizens for ATC. In other words, our method, called ETC$^2$, directly classifies each term of a document -- using an intricate combination of (i) frequentist information, (ii) explicit co-occurrence and context modeling, and (iii) (near-)attention layering -- and then uses these term-level predictions to estimate the document class. The proposed approach eliminates the need for a single representation of the document, thus greatly improving model efficiency. In our experimental evaluation, ETC$^2$ was as effective as (if not more effective than) the best Transformer baselines on the tested datasets, while being up to 17x faster at inference (prediction) time than modern Transformer-based classifiers.
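The abstract describes a pipeline in which each term of a document is classified on its own and the term-level predictions are then aggregated into a document-level prediction, with no single document representation. The sketch below is a minimal, hypothetical illustration of that general idea only, not the authors' ETC$^2$ method: it uses nothing beyond frequentist (Naive-Bayes-style) term statistics, omits the co-occurrence/context modeling and (near-)attention layers, and all names (e.g., TermLevelClassifier, term_scores) are assumptions introduced for illustration.

```python
# Illustrative sketch ONLY -- not the ETC^2 implementation from the paper.
# Idea shown: score each term of a document against the label set using
# simple frequentist statistics, then aggregate per-term scores into a
# document-level class estimate, without building a document vector.
from collections import Counter, defaultdict
import math


class TermLevelClassifier:
    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.term_counts = defaultdict(Counter)  # class -> term -> count
        self.class_totals = Counter()            # class -> total term count
        self.vocab = set()

    def fit(self, documents, labels):
        # Collect term/class co-occurrence counts from tokenized documents.
        for tokens, label in zip(documents, labels):
            for tok in tokens:
                self.term_counts[label][tok] += 1
                self.class_totals[label] += 1
                self.vocab.add(tok)

    def term_scores(self, term):
        # Smoothed log-probability of a single term under each class
        # (a frequentist stand-in for the paper's term classifier).
        scores = {}
        for label in self.class_totals:
            num = self.term_counts[label][term] + self.smoothing
            den = self.class_totals[label] + self.smoothing * len(self.vocab)
            scores[label] = math.log(num / den)
        return scores

    def predict(self, tokens):
        # Aggregate per-term predictions (here: summed log-scores) into a
        # document-level class estimate.
        totals = Counter()
        for tok in tokens:
            for label, score in self.term_scores(tok).items():
                totals[label] += score
        return totals.most_common(1)[0][0]


# Toy usage with hypothetical data
docs = [["great", "fast", "model"], ["slow", "expensive", "training"]]
labels = ["pos", "neg"]
clf = TermLevelClassifier()
clf.fit(docs, labels)
print(clf.predict(["great", "fast"]))  # -> "pos"
```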
Paper Type: long
Research Area: Machine Learning for NLP
Contribution Types: NLP engineering experiment, Approaches for low-compute settings-efficiency
Languages Studied: English