OCTDiff: Bridged Diffusion Model for Portable OCT Super-Resolution and Enhancement

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 spotlightEveryoneRevisionsBibTeXCC BY-NC-ND 4.0
Keywords: Super-resolution, Medical Imaging, Diffusion Model, Noise Aggregation, Computer Vision
TL;DR: We propose OCTDiff, a bridged diffusion model with Adaptive Noise Aggregation and Multi-Scale Cross-Attention, trained with quality-guided loss. It achieves SOTA enhancement on portable OCT images and shows potential for broader clinical use.
Abstract: Medical imaging super-resolution is critical for improving diagnostic utility and reducing costs, particularly for low-cost modalities such as portable Optical Coherence Tomography (OCT). We propose OCTDiff, a bridged diffusion model designed to enhance image resolution and quality from portable OCT devices. Our image-to-image diffusion framework addresses key challenges in the conditional generation process of denoising diffusion probabilistic models (DDPMs). We introduce Adaptive Noise Aggregation (ANA), a novel module to improve denoising dynamics within the reverse diffusion process. Additionally, we integrate Multi-Scale Cross-Attention (MSCA) into the U-Net backbone to capture local dependencies across spatial resolutions. To address overfitting on small clinical datasets and to preserve fine structural details essential for retinal diagnostics, we design a customized loss function guided by clinical quality scores. OCTDiff outperforms convolutional baselines and standard DDPMs, achieving state-of-the-art performance on clinical portable OCT datasets. Our model and its downstream applications have the potential to generalize to other medical imaging modalities and revolutionize the current workflow of ophthalmic diagnostics. The code is available at https://github.com/AI4VSLab/OCTDiff.
Supplementary Material: zip
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 25245
Loading