Fractal Predictive Operators: Learnable Iterated Function Systems for Multi-Scale Latent Modeling

TMLR Paper7371 Authors

06 Feb 2026 (modified: 19 Feb 2026) | Under review for TMLR | Everyone | CC BY 4.0
Abstract: Joint Embedding Predictive Architectures (JEPAs) rely on latent-space prediction to learn representations without explicit reconstruction. While effective, their predictors are typically implemented as shallow feed-forward networks, offering limited control over multi-step dynamics and stability. We introduce Learnable Iterated Function Systems (LIFS), a contractive predictive operator that replaces the standard JEPA predictor with a learned mixture of affine maps applied recursively in latent space. Mixture weights are generated conditionally on the context embedding, allowing the operator to adapt its local geometry across spatial locations and inputs. LIFS does not change the training objective or encoder architecture, but explicitly constrains predictor dynamics through spectral control and adaptive gating. Additionally, our analysis unifies spectral control, exponential moving average (EMA) updates, and predictive convergence through a contraction-based perspective. Empirically, integrating LIFS into JEPA improves training stability and yields consistent, though moderate, gains in linear probing accuracy, particularly for ViT-based encoders and non-overlapping prediction settings. These results highlight predictor dynamics as an important and underexplored design axis in self-supervised learning.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Anastasios_Kyrillidis2
Submission Number: 7371
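
As an illustration of the operator described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the mixture weights come from a softmax over a linear gate on the context embedding and stay fixed across iterations. It also assumes that each affine map is rescaled so that its Frobenius norm (an upper bound on its spectral norm) is at most `rho < 1`. All names (`lifs_predict`, `rescale`, `gate`, `rho`, `steps`) are hypothetical and do not appear in the submission.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def rescale(matrix, rho):
    """Shrink a matrix so its Frobenius norm is at most rho.

    The Frobenius norm upper-bounds the spectral norm, so the result also
    has spectral norm at most rho. This is a conservative stand-in for
    exact spectral normalization (an assumption of this sketch).
    """
    fro = math.sqrt(sum(m * m for row in matrix for m in row))
    scale = 1.0 if fro <= rho else rho / fro
    return [[scale * m for m in row] for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lifs_predict(z, context, mats, biases, gate, steps=8, rho=0.9):
    """Recursively apply a context-gated mixture of affine maps to z.

    mats and biases define K affine maps z -> A_k z + b_k; gate is a
    K-by-C matrix that turns the context embedding into mixture logits.
    """
    weights = softmax(matvec(gate, context))      # convex mixture weights
    mats = [rescale(m, rho) for m in mats]        # each map is a contraction
    for _ in range(steps):
        branches = [
            [a + b for a, b in zip(matvec(m, z), bias)]
            for m, bias in zip(mats, biases)
        ]
        # Convex combination of the K affine branches.
        z = [
            sum(w * branch[i] for w, branch in zip(weights, branches))
            for i in range(len(z))
        ]
    return z


# Hypothetical toy setup: K = 2 maps in a 2-dimensional latent space.
mats = [[[2.0, 0.0], [0.0, 1.0]], [[0.0, 3.0], [-3.0, 0.0]]]
biases = [[1.0, 0.0], [0.0, -1.0]]
gate = [[0.5, -0.5], [0.1, 0.2]]
context = [1.0, 2.0]

out_a = lifs_predict([5.0, -3.0], context, mats, biases, gate, steps=10)
out_b = lifs_predict([-4.0, 7.0], context, mats, biases, gate, steps=10)
```

Because the weights are convex and every rescaled map has spectral norm at most `rho`, the combined map is a contraction with factor at most `rho`. Two different starting latents therefore approach each other at least geometrically, so `out_a` and `out_b` differ by no more than `rho ** steps` times the initial distance.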