Small-text: Active Learning for Text Classification in Python

2021 (modified: 16 Nov 2021), CoRR 2021. Readers: Everyone
Abstract: We present small-text, a simple and modular active learning library that offers pool-based active learning for single- and multi-label text classification in Python. It comes with various pre-implemented state-of-the-art query strategies, including some that can leverage the GPU. Clearly defined interfaces allow a multitude of classifiers, query strategies, and stopping criteria to be combined, which makes it quick to mix and match components and enables rapid development of both active learning experiments and applications. To make various classifiers accessible in a consistent way, it integrates several well-known machine learning libraries, namely scikit-learn, PyTorch, and huggingface transformers. The latter integrations are available as optionally installable extensions, which makes the availability of a GPU completely optional. The library is available under the MIT License at https://github.com/webis-de/small-text.
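To illustrate what pool-based active learning with an interchangeable classifier, query strategy, and stopping criterion looks like, the following is a minimal sketch built directly on scikit-learn. It does not use the small-text API: the function names `least_confidence` and `active_learning_loop`, the toy data, and the choice of a least-confidence strategy with an empty-pool stopping rule are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data (hypothetical); in practice the labels would come from a human annotator.
TEXTS = [
    "great movie with a wonderful cast", "loved the story and the acting",
    "brilliant and moving film", "a delightful and funny picture",
    "terrible plot and awful acting", "boring film with a weak script",
    "the worst movie of the year", "dull, slow and disappointing",
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]


def least_confidence(proba, n):
    """Query strategy: positions of the n rows whose top-class probability is lowest."""
    confidence = proba.max(axis=1)
    return np.argsort(confidence)[:n]


def active_learning_loop(texts, oracle_labels, initial_idx, n_queries=3, batch_size=2):
    """Pool-based loop: train, query the most uncertain pool items, label them, repeat."""
    X = TfidfVectorizer().fit_transform(texts)
    y = np.asarray(oracle_labels)
    labeled = list(initial_idx)  # must contain at least one example per class
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_queries):
        clf.fit(X[labeled], y[labeled])
        pool = np.setdiff1d(np.arange(len(texts)), labeled)
        if pool.size == 0:  # stopping criterion (assumption): the pool is exhausted
            break
        picked = pool[least_confidence(clf.predict_proba(X[pool]), batch_size)]
        labeled.extend(picked.tolist())  # the "oracle" reveals labels for these items
    clf.fit(X[labeled], y[labeled])  # final model on everything labeled so far
    return clf, labeled


clf, labeled = active_learning_loop(TEXTS, LABELS, initial_idx=[0, 4], n_queries=2)
```

Swapping `least_confidence` for another function with the same signature, or `LogisticRegression` for another estimator that provides `predict_proba`, changes the strategy or classifier without touching the loop, which is the kind of mix and match the abstract describes.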