Keywords: Auxiliary Supervision, Architectural Inductive Bias, Gradient Analysis, Multi-objective Learning, Loss Landscape, Training Dynamics
Abstract: Is deep learning generalization necessarily rooted in optimizing a single objective? We explore an alternative view: adaptive generalization may emerge from structured interactions among heterogeneous objectives. We propose an Asymmetric Training Paradigm that temporarily introduces non-competitive, per-class supervision (sigmoid losses) into networks optimized with competitive softmax objectives. This is realized through orthogonally initialized auxiliary pathways, modulated by a scalar coefficient $\alpha$ and present only during training. Crucially, strictly controlled experiments rule out parameter count as a confounder, showing that parameter expansion alone yields no gain. Our mechanistic analysis reveals two effects: (1) the proposed topology (but not mere capacity) consistently smooths the initial optimization landscape; (2) final performance exhibits an architecture-dependent pattern we term Architectural Resonance, in which auxiliary signals benefit models only when aligned with their inductive biases. A 6-block Vision Transformer (ViT-6L) exhibits constructive gradient alignment (cosine similarity $+0.19$), yielding absolute accuracy gains of $+9.2\%$ on CIFAR-100. By contrast, a CNN shows destructive gradient conflicts (cosine similarity $-0.26$). We further corroborate this divergence in hybrid architectures (CoAtNet), highlighting a stage-dependent effect: transformer stages benefit from objective heterogeneity, while convolutional stages show limited compatibility. We validate scalability on ImageNet-1k, showing consistent top-1 gains for ViTs (up to $+2.25\%$ on ViT-B/16). Rather than functioning as a universal regularizer, our probe reveals that heterogeneous signals selectively benefit architectures with weak inductive biases (e.g., Vision Transformers), exposing a critical dependence between architectural flexibility and objective compatibility.
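The abstract describes the training setup only at a high level, so the following is a minimal PyTorch-style sketch of how the described components could fit together: a shared backbone with a standard softmax/cross-entropy head, a training-only auxiliary head that is orthogonally initialized and supervised with a per-class sigmoid (BCE) loss weighted by a scalar $\alpha$, and a probe that measures the gradient cosine similarity between the two objectives on shared parameters. All names (AsymmetricClassifier, probe_gradient_alignment), the default $\alpha$, and the dummy backbone are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AsymmetricClassifier(nn.Module):
    """Shared backbone with a competitive softmax head and a training-only auxiliary sigmoid head."""

    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int, alpha: float = 0.1):
        super().__init__()
        self.backbone = backbone                          # any feature extractor (ViT, CNN, ...)
        self.main_head = nn.Linear(feat_dim, num_classes)
        self.aux_head = nn.Linear(feat_dim, num_classes)  # auxiliary pathway, used only during training
        nn.init.orthogonal_(self.aux_head.weight)         # orthogonal initialization of the auxiliary pathway
        self.alpha = alpha                                # scalar coefficient modulating the auxiliary loss

    def forward(self, x):
        feats = self.backbone(x)
        # At inference time only main_head is used; the auxiliary pathway can be discarded.
        return self.main_head(feats), self.aux_head(feats)

    def loss(self, x, y):
        main_logits, aux_logits = self(x)
        ce = F.cross_entropy(main_logits, y)              # competitive softmax objective
        onehot = F.one_hot(y, main_logits.size(-1)).float()
        bce = F.binary_cross_entropy_with_logits(aux_logits, onehot)  # non-competitive per-class objective
        return ce + self.alpha * bce, ce, bce


def probe_gradient_alignment(model: AsymmetricClassifier, x, y) -> float:
    """Cosine similarity between the two objectives' gradients on shared (backbone) parameters."""
    _, ce, bce = model.loss(x, y)
    shared = [p for p in model.backbone.parameters() if p.requires_grad]
    g_ce = torch.autograd.grad(ce, shared, retain_graph=True, allow_unused=True)
    g_bce = torch.autograd.grad(bce, shared, allow_unused=True)
    flat = lambda grads: torch.cat([g.flatten() for g in grads if g is not None])
    return F.cosine_similarity(flat(g_ce), flat(g_bce), dim=0).item()


if __name__ == "__main__":
    # Hypothetical toy backbone and CIFAR-100-like shapes, purely for illustration.
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.GELU())
    model = AsymmetricClassifier(backbone, feat_dim=256, num_classes=100, alpha=0.1)
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 100, (8,))
    total, ce, bce = model.loss(x, y)
    total.backward()
    print("gradient cosine similarity:", probe_gradient_alignment(model, x, y))
```

A positive cosine similarity from the probe would correspond to the constructive alignment reported for ViTs, and a negative value to the destructive conflicts reported for CNNs; the sketch leaves out scheduling details (e.g., when the auxiliary pathway is removed), which the paper itself would specify.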
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 21729