Abstract: Underwater creature segmentation (UCS) is critical for marine research and robotics, but it faces unique challenges, namely environmental distortions and biological traits, that distinguish it from terrestrial segmentation. Despite advances in deep learning, current UCS models are constrained to low-resolution inputs, so they lose critical details when processing high-resolution (HR) imagery, which degrades segmentation precision. To bridge this gap, we introduce UCS4K, the first large-scale HR dataset for UCS, containing 4,096 images with pixel-wise annotations. UCS4K offers 4 times higher average resolution than existing datasets and covers the diverse species, habitats, and environmental complexities essential for robust model training. Additionally, we propose a Resolution-Asymmetric Dual-branch Alignment and Refinement (RADAR) network to address the efficiency-receptiveness trade-off in HR-UCS. RADAR decouples context and detail processing: a CNN branch preserves HR spatial details, while a Transformer branch models global semantics on downsampled inputs to avoid quadratic complexity. Crucially, it resolves the inherent semantic misalignment between the two branches via a Global Semantic Alignment (GSA) module in the encoder and a decoder with embedded Bidirectional Collaborative Refinement (BCR) modules, which progressively integrates multi-scale encoder features to sharpen boundaries. This asymmetric design captures long-range context efficiently without sacrificing spatial precision. Extensive experiments demonstrate that RADAR sets a new state of the art on UCS4K and on existing datasets. Our contributions establish the first HR benchmark for UCS and deliver a scalable framework for high-precision segmentation. Dataset, code, and models are available at https://github.com/WHYfromNUT/RADAR.
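To make the quadratic-complexity argument concrete, the following is a minimal sketch of how the size of a self-attention matrix scales with input resolution. It is not taken from the RADAR code: the 2048×2048 input size, the 4× per-side downsampling factor, the 16-pixel patch size, and the helper names `num_tokens` and `attention_entries` are all assumptions chosen for illustration.

```python
def num_tokens(height, width, patch=16):
    """Count non-overlapping patch tokens for an image (patch size is an assumption)."""
    return (height // patch) * (width // patch)


def attention_entries(height, width, patch=16):
    """Entries in one full self-attention matrix: (number of tokens) squared."""
    n = num_tokens(height, width, patch)
    return n * n


# Hypothetical HR input and the same input downsampled 4x per side.
hr_tokens = num_tokens(2048, 2048)
lr_tokens = num_tokens(512, 512)

# Tokens shrink by 4 * 4 = 16x, but attention entries shrink by 16^2 = 256x.
token_ratio = hr_tokens // lr_tokens
entry_ratio = attention_entries(2048, 2048) // attention_entries(512, 512)

print(hr_tokens, lr_tokens, token_ratio, entry_ratio)
```

Under these assumed sizes, running global attention only on the downsampled input cuts the attention matrix by a factor of 256, which is the kind of saving that motivates leaving full-resolution detail to a convolutional branch.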
External IDs: dblp:journals/tip/WuJWCDYJ25