Transitioning Heads Conundrum: The Hidden Bottleneck in Long-Tailed Class-Incremental Learning

TMLR Paper6621 Authors

24 Nov 2025 (modified: 20 Dec 2025), Under review for TMLR, CC BY 4.0

Abstract: Long-Tailed Class-Incremental Learning (LTCIL) faces a fundamental tension: models must sequentially learn new classes while contending with extreme class imbalance, which amplifies catastrophic forgetting. A particularly overlooked phenomenon is the Transitioning Heads Conundrum: as replay buffers constrain memory, initially well-represented head classes shrink over time and effectively become tail classes, undermining knowledge retention. Existing approaches fail to address this because they apply knowledge distillation too late, after these transitions have already eroded head-class representations. To overcome this, we introduce DEcoupling Representations for Early Knowledge distillation (DEREK), which strategically employs Early Knowledge Distillation to safeguard head-class knowledge before data constraints manifest. Comprehensive evaluation across 2 LTCIL benchmarks, 12 experimental settings, and 24 baselines, including Long-Tail, Class-Incremental, Few-Shot CIL, and LTCIL methods, shows that DEREK maintains competitive performance across categories, establishing new state-of-the-art results.
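The shrinking of head classes described in the abstract can be illustrated with a small sketch. It assumes a fixed-size replay buffer split equally across all classes seen so far, which is a common exemplar-memory policy but is not confirmed here as the paper's setup. The function names, buffer size, and class counts are hypothetical and chosen only for illustration.

```python
from typing import List


def exemplars_per_class(buffer_size: int, num_seen_classes: int) -> int:
    """Equal split of a fixed replay budget across all classes seen so far.

    Assumption: equal per-class allocation; the paper may use another policy.
    """
    if num_seen_classes <= 0:
        raise ValueError("num_seen_classes must be positive")
    return buffer_size // num_seen_classes


def stored_trajectory(
    original_count: int,
    buffer_size: int,
    classes_per_task: int,
    num_tasks: int,
) -> List[int]:
    """Number of samples of one first-task class kept in the buffer after each task."""
    trajectory = []
    for task in range(1, num_tasks + 1):
        quota = exemplars_per_class(buffer_size, task * classes_per_task)
        # A class can never store more samples than it originally had.
        trajectory.append(min(original_count, quota))
    return trajectory


# Hypothetical numbers: a head class with 500 samples and a tail class with 3,
# both introduced in the first task, with a 200-sample buffer and 10 new
# classes per task over 5 tasks.
head = stored_trajectory(500, buffer_size=200, classes_per_task=10, num_tasks=5)
tail = stored_trajectory(3, buffer_size=200, classes_per_task=10, num_tasks=5)
print("head:", head)
print("tail:", tail)
```

Under these assumed numbers, the head class falls from 500 samples to a handful of stored exemplars, while the tail class stays at 3 throughout. By the last task the two are nearly indistinguishable in the buffer, which is the transition the abstract refers to.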

Submission Type: Regular submission (no more than 12 pages of main content)

Assigned Action Editor: ~Hanwang_Zhang3

Submission Number: 6621
