Keywords: Self-Supervised Learning, JEPA, Domain Alignment, Multimodality, Embedded Perception
TL;DR: We propose a self-supervised JEPA-based approach that aligns RGB and IR modalities in a shared semantic space, eliminating the need for costly manual annotations in multispectral image fusion.
Abstract: RGB and IR image fusion requires precise alignment and annotated datasets. To eliminate this need for manual labeling, we propose a self-supervised approach using the Joint-Embedding Predictive Architecture (JEPA). By predicting IR latent features from masked RGB context, our model projects both modalities into a shared semantic space. Preliminary results show this alignment provides a solid foundation for embedded perception without any human intervention.
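The cross-modal prediction idea in the abstract can be sketched as a minimal JEPA-style module: a context encoder sees only the unmasked RGB pixels, a stop-gradient target encoder embeds the IR image, and a predictor regresses the IR latent from the RGB context. All module names, sizes, and the pixel-level masking scheme below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalJEPA(nn.Module):
    """Minimal sketch: predict IR latent features from masked RGB context.

    Architecture details (encoder depth, embedding dim, masking granularity)
    are assumptions for illustration only.
    """

    def __init__(self, dim: int = 64):
        super().__init__()
        # Context encoder for the (masked) 3-channel RGB input.
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Target encoder for the 1-channel IR input (no gradient: JEPA
        # targets typically come from a stop-gradient / EMA branch).
        self.ir_encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Predictor maps the RGB context embedding into the IR latent space.
        self.predictor = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim),
        )

    def forward(self, rgb, ir, mask):
        # mask: 1 = visible pixel, 0 = hidden; only visible context is encoded.
        ctx = self.rgb_encoder(rgb * mask)
        pred = self.predictor(ctx)
        with torch.no_grad():  # stop-gradient target, as in JEPA-style training
            tgt = self.ir_encoder(ir)
        # Latent-space regression loss: no pixel reconstruction, no labels.
        return F.mse_loss(pred, tgt)


# Usage: one self-supervised step on a random RGB/IR pair.
rgb = torch.randn(2, 3, 32, 32)
ir = torch.randn(2, 1, 32, 32)
mask = (torch.rand(2, 1, 32, 32) > 0.5).float()  # random 50% pixel mask
loss = CrossModalJEPA()(rgb, ir, mask)
```

In a full training loop this loss would be backpropagated through the predictor and RGB encoder only, driving both modalities toward a shared semantic space without any manual annotation.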
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 20