Keywords: Large Language Models, Open-set text classification
TL;DR: An Unknown-Aware LLM for Open-Set Text Classification
Abstract: Open-set text classification (OSTC) requires models to correctly classify in-distribution (ID) samples while reliably rejecting out-of-distribution (OOD) inputs, an essential capability for real-world NLP systems. Most OSTC methods train on ID data under the closed-world assumption that all inputs belong to the known label space, and then perform OOD detection with the resulting biased representations, which inherently lack awareness of unknowns and thus yield overconfident predictions on OOD inputs. In this work, we present UnLLM, an Unknown-aware Large Language Model for OSTC. Instead of fixing classification to the entire known label space, we reformulate it as a subset-conditioned text generation task: the LLM is prompted with sampled subsets of known labels, and any instance outside the candidate set is explicitly assigned the label "unknown". This reformulation turns OOD detection from a post-hoc procedure into an intrinsic modeling capability. More importantly, our approach is the first to explicitly incorporate the unknown into classification, enabling systematic modeling of unknowns through a unified representation-logits-inference optimization that progressively strengthens the model's capacity to capture open-set risk. Extensive experiments across six benchmarks show that UnLLM consistently outperforms state-of-the-art (SOTA) baselines. Code is available in an anonymous repository: https://anonymous.4open.science/r/UnLLM-03C2.
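To make the subset-conditioned reformulation concrete, the following is a minimal sketch of the prompting scheme the abstract describes: sample a subset of the known labels, prompt with that subset plus an explicit "unknown" option, and map any answer outside the candidate set to "unknown". The llm_generate helper, the label set, the subset size k, and the prompt wording are illustrative assumptions, not the paper's actual implementation (see the anonymous repository for that).

import random

KNOWN_LABELS = ["sports", "politics", "technology", "health", "finance"]

def llm_generate(prompt: str) -> str:
    # Hypothetical placeholder: a real system would query an
    # instruction-tuned LLM here. Returning "unknown" keeps the
    # sketch self-contained and runnable.
    return "unknown"

def build_subset_prompt(text: str, subset: list[str]) -> str:
    # Condition the model on a sampled candidate subset and make
    # "unknown" an explicit answer, so rejection is part of the
    # generation task rather than a post-hoc score threshold.
    options = ", ".join(subset + ["unknown"])
    return (
        f"Classify the text into exactly one of: {options}.\n"
        f"If the text fits none of the candidates, answer 'unknown'.\n"
        f"Text: {text}\n"
        f"Label:"
    )

def subset_conditioned_classify(text: str, k: int = 3, seed: int = 0) -> str:
    # Sample a size-k subset of the known label space, prompt the LLM,
    # and map any out-of-subset answer to "unknown".
    rng = random.Random(seed)
    subset = rng.sample(KNOWN_LABELS, k)
    answer = llm_generate(build_subset_prompt(text, subset)).strip().lower()
    return answer if answer in subset else "unknown"

print(subset_conditioned_classify("The central bank raised interest rates."))

During training, instances whose gold label falls outside the sampled subset supply explicit "unknown" supervision, which is how the reformulation exposes the model to open-set risk without requiring real OOD data.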
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 15190