FuseUNet: A Multi-Scale Feature Fusion Method for U-like Networks

Published: 01 May 2025, Last Modified: 18 Jun 2025, ICML 2025 poster, CC BY 4.0
Abstract: Medical image segmentation is a critical task in computer vision, with UNet serving as a milestone architecture. The defining component of the UNet family is the skip connection, but these skip connections face two significant limitations: (1) they lack effective interaction between features at different scales, and (2) they rely on simple concatenation or addition operations, which constrain efficient information integration. While recent improvements to UNet have focused on enhancing encoder and decoder capabilities, these limitations remain overlooked. To overcome these challenges, we propose a novel multi-scale feature fusion method that reimagines the UNet decoding process as solving an initial value problem (IVP), treating skip connections as discrete nodes. By leveraging principles from the linear multistep method, we propose an adaptive ordinary differential equation method to enable effective multi-scale feature fusion. Our approach is independent of the encoder and decoder architectures, making it adaptable to various UNet-like networks. Experiments on the ACDC, KiTS2023, MSD brain tumor, and ISIC2017/2018 skin lesion segmentation datasets demonstrate improved feature utilization, fewer network parameters, and maintained high performance. The code is available at https://github.com/nayutayuki/FuseUNet.
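For readers unfamiliar with the linear multistep method named in the abstract, the following is a minimal sketch of one classical instance, the two-step Adams-Bashforth rule, applied to a scalar initial value problem. It is not the FuseUNet implementation: the function name `adams_bashforth2`, the test equation y' = -y, and the step size are all illustrative assumptions. The point is only that each new value is computed from several previously stored nodes rather than from the latest one alone, which is the property the abstract relates to treating skip connections as discrete nodes.

```python
import math


def adams_bashforth2(f, t0, y0, h, n_steps):
    """Integrate y' = f(t, y) from (t0, y0) using two-step Adams-Bashforth.

    Hypothetical helper for illustration only; not taken from FuseUNet.
    Returns the lists of time points and solution values.
    """
    ts, ys = [t0], [y0]
    if n_steps == 0:
        return ts, ys

    # A two-step rule needs two starting nodes, so bootstrap the second
    # node with a single explicit Euler step.
    f_prev = f(t0, y0)
    ts.append(t0 + h)
    ys.append(y0 + h * f_prev)

    for _ in range(n_steps - 1):
        f_curr = f(ts[-1], ys[-1])
        # Linear multistep update: combine slopes at the two latest nodes.
        y_next = ys[-1] + h * (1.5 * f_curr - 0.5 * f_prev)
        ts.append(ts[-1] + h)
        ys.append(y_next)
        f_prev = f_curr

    return ts, ys


# Assumed test problem: y' = -y, y(0) = 1, whose exact solution is exp(-t).
ts, ys = adams_bashforth2(lambda t, y: -y, 0.0, 1.0, 0.01, 100)
error_at_one = abs(ys[-1] - math.exp(-1.0))
```

Here `error_at_one` measures how far the numerical value at t = 1 is from the exact solution; halving `h` should shrink it by roughly a factor of four, reflecting the second-order accuracy of this particular rule.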
Lay Summary: Medical imaging helps doctors detect diseases by analyzing scans. A popular AI model for this task is called UNet. However, UNet often struggles to combine image details from different scales, which limits its accuracy. We propose a new method that treats this combination process like solving a mathematical problem step by step. This allows the model to better integrate information from different parts of the image, improving its understanding. Our solution works with many types of existing models, improves results across several datasets, and requires fewer computing resources, potentially making medical AI tools more efficient and accurate.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Primary Area: Applications->Computer Vision
Keywords: Feature Fusion; Medical Image Segmentation; Ordinary Differential Equation
Flagged For Ethics Review: true
Submission Number: 2347