- Abstract: In this paper, we design a generic framework for learning a robust text classification model that achieves accuracy comparable to standard full models under test-time budget constraints. We take a different approach from existing methods and learn to dynamically delete a large fraction of unimportant words by a low-complexity selector such that the high-complexity classifier only needs to process a small fraction of important words. In addition, we propose a new data aggregation method to train the classifier, allowing it to make accurate predictions even on fragmented sequence of words. Our end-to-end method achieves state-of-the-art performance while its computational complexity scales linearly with the small fraction of important words in the whole corpus. Besides, a single deep neural network classifier trained by our framework can be dynamically tuned to different budget levels at inference time.
- Keywords: Data Aggregation, Budget Learning, Speed Up, Faster Inference, Robust Classifier
- TL;DR: Modular framework for document classification and data aggregation technique for making the framework robust to various distortion, and noise and focus only on the important words.