Learning Flow-Guided Registration for RGB–Event Semantic Segmentation

Learning Flow-Guided Registration for RGB–Event Semantic Segmentation

ICLR 2026 Conference Submission371 Authors

01 Sept 2025 (modified: 08 Oct 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Event-based Semantic Segmentation, Flow-Guided Registration, Multi-Sensor Fusion

TL;DR: We recast RGB-Event segmentation from fusion to registration via a novel flow-guided registration-centric framework and introduce a new Motion-enhanced Event Tensor (MET) representation, mitigating spatiotemporal and modal misalignments.

Abstract: Event cameras capture microsecond-level motion cues that complement RGB sensors. However, the prevailing paradigm of treating RGB-Event perception as a fusion problem is ill-posed, as it ignores the intrinsic (i) Spatiotemporal and (ii) Modal Misalignment, unlike other RGB-X sensing domains. To tackle these limitations, we recast RGB-Event segmentation from fusion to registration. We propose BRENet, a novel flow-guided bidirectional framework that adaptively matches correspondence between the asymmetric modalities. Specifically, it leverages temporally aligned optical flows as a coarse-grained guide, along with fine-grained event temporal features, to generate precise forward and backward pixel pairings for registration. This pairing mechanism converts the inherent motion lag into terms governed by flow estimation error, bridging modality gaps. Moreover, we introduce Motion-Enhanced Event Tensor (MET), a new representation that transforms sparse event streams into a dense, temporally coherent form. Extensive experiments on four large-scale datasets validate our approach, establishing flow-guided registration as a promising direction for RGB–Event segmentation.

Supplementary Material: zip

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 371

Loading