Keywords: Pan-sharpening, Image Fusion
TL;DR: This study pioneers a framework to address theoretical bias in RWKV and explore its potential for multi-spectral and panchromatic fusion, bridging a key gap in remote sensing image fusion.
Abstract: Pan-sharpening aims to generate a spatially and spectrally enriched multi-spectral image by integrating complementary
cross-modality information from a low-resolution multi-spectral image and its texture-rich panchromatic counterpart. In this work, we propose a
WKV-sharing, random-shuffle, high-order RWKV modeling paradigm for pan-sharpening from a Bayesian perspective, coupled with a random weight manifold distribution training strategy derived from functional theory to regularize the solution space, adhering to the
following principles: 1) Random-shuffle RWKV. Recently, the Vision RWKV model, with its inherent linear complexity in global modeling,
has inspired us to explore its untapped potential in pan-sharpening tasks. However, its attention mechanism, relying on a recurrent
bidirectional scanning strategy, suffers from scanning-order bias and demands significant processing time. To address this, we propose a novel
Bayesian-inspired scanning strategy called Random Shuffle, complemented by a theoretically-sound inverse shuffle to preserve
information coordination invariance, effectively eliminating the biases associated with fixed-sequence scanning. In mathematical expectation, the Random Shuffle
approach mitigates preconceptions about global 2D dependencies, providing the model with an unbiased prior.
In a spirit similar to Dropout, we introduce a testing methodology based on Monte Carlo averaging to ensure the model's output
aligns more closely with the expected results. 2) WKV-sharing high-order. For the KV attention-score calculation in the spatial mixer of RWKV, we leverage a WKV-sharing mechanism that transfers KV activations across RWKV layers, achieving lower latency and improved trainability. We also revisit the channel mixer in RWKV, originally a first-order weighting function, and unlock its high-order potential by sharing the gate mechanism across RWKV layers. Comprehensive experiments across pan-sharpening benchmarks demonstrate our model's effectiveness, consistently outperforming state-of-the-art alternatives.
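A minimal sketch of the Random Shuffle scanning and Monte Carlo test-time averaging described in the abstract, under the assumption of a generic RWKV-style sequence block; `seq_model` and the function names are placeholders for illustration, not the authors' implementation.

```python
import torch

def random_shuffle_forward(tokens, seq_model, generator=None):
    """Sketch: permute the flattened token sequence, run the sequence model
    over the shuffled order, then apply the inverse permutation so every
    output token returns to its original spatial position."""
    B, N, C = tokens.shape
    perm = torch.randperm(N, generator=generator, device=tokens.device)
    inv_perm = torch.argsort(perm)                 # inverse of the shuffle
    shuffled_out = seq_model(tokens[:, perm, :])   # scan in a random order
    return shuffled_out[:, inv_perm, :]            # restore coordination

@torch.no_grad()
def monte_carlo_average(tokens, seq_model, num_samples=8):
    """Test-time averaging over several independent random scan orders,
    analogous to Monte Carlo averaging over Dropout masks."""
    outs = [random_shuffle_forward(tokens, seq_model) for _ in range(num_samples)]
    return torch.stack(outs, dim=0).mean(dim=0)
```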
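Similarly, a rough sketch of the cross-layer sharing idea, where a gate computed once is reused by later channel mixers to form a higher-order weighting; all module and parameter names here are hypothetical and stand in for the paper's actual WKV-sharing design.

```python
import torch
import torch.nn as nn

class GateSharingChannelMixer(nn.Module):
    """Toy channel mixer that can accept an externally supplied gate, so one
    gate computation can be shared across several layers and composed into a
    higher-order weighting (illustrative only)."""
    def __init__(self, dim, hidden_mult=4):
        super().__init__()
        self.key = nn.Linear(dim, dim * hidden_mult)
        self.value = nn.Linear(dim * hidden_mult, dim)
        self.gate_proj = nn.Linear(dim, dim)

    def forward(self, x, shared_gate=None):
        gate = torch.sigmoid(self.gate_proj(x)) if shared_gate is None else shared_gate
        out = self.value(torch.relu(self.key(x)) ** 2)
        return gate * out, gate  # expose the gate so later layers can reuse it

# Hypothetical usage: the first layer computes a gate, later layers reuse it.
# mixers = nn.ModuleList(GateSharingChannelMixer(64) for _ in range(4))
# y, g = mixers[0](x)
# for m in mixers[1:]:
#     y, _ = m(y, shared_gate=g)
```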
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 572