A Multi-Task Deep Learning Framework Integrating Segmentation and Reconstruction for Lensless Imaging

Published: 01 Jan 2024, Last Modified: 09 Apr 2025. IEEE Trans. Emerg. Top. Comput. Intell. 2024. License: CC BY-SA 4.0
Abstract: As a new paradigm in computational imaging, lensless imaging holds promise for camera miniaturization. However, image reconstruction and segmentation from lensless measurements are challenging due to sensor noise, diffraction effects, and imperfect coding. To tackle these issues, we propose a customized multi-task learning framework called RecSegNet, which integrates image reconstruction and segmentation into a single network so that each task benefits from the complementary information of the other. RecSegNet has a Y-shaped architecture consisting of one encoder and two decoders. The encoder combines an optical-aware estimator (OE), a pyramid vision Transformer (PVT), and customized tokenized multi-layer perceptrons (TMLP) to capture long-range semantics. The two decoders, the reconstruction decoder (RecD) and the segmentation decoder (SegD), share the same structure and predict the underlying scene and the segmentation map, respectively. Furthermore, we propose a hierarchical feature mutual learning (HFML) module that strengthens the interaction between the two tasks, driving both of them forward. Extensive experiments demonstrate that RecSegNet can accurately reconstruct the underlying scene while segmenting objects of interest from lensless imaging measurements.
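The abstract describes a Y-shaped shared-encoder, dual-decoder layout with a cross-task fusion module. The paper's actual components (OE, PVT, TMLP, HFML) are not specified here, so the following minimal PyTorch sketch only illustrates the overall topology under stated assumptions: plain convolutional blocks stand in for the encoder and decoders, and `HFML` is a hypothetical cross-fusion layer, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Conv-BN-ReLU block; a stand-in for the paper's encoder/decoder modules."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class HFML(nn.Module):
    """Hypothetical stand-in for hierarchical feature mutual learning:
    cross-fuses the two decoder streams with 1x1 convolutions so each
    task sees the other's features (the paper's module may differ)."""
    def __init__(self, ch):
        super().__init__()
        self.to_rec = nn.Conv2d(2 * ch, ch, 1)
        self.to_seg = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, f_rec, f_seg):
        fused = torch.cat([f_rec, f_seg], dim=1)
        return f_rec + self.to_rec(fused), f_seg + self.to_seg(fused)


class RecSegNetSketch(nn.Module):
    """Y-shaped multi-task network: one shared encoder, two decoders.
    The real RecSegNet uses OE + PVT + TMLP in the encoder; plain convs here."""
    def __init__(self, in_ch=1, base=32, num_classes=2):
        super().__init__()
        self.enc1 = ConvBlock(in_ch, base)
        self.enc2 = ConvBlock(base, 2 * base)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        # Two structurally identical decoders (RecD and SegD in the paper).
        self.rec_dec = ConvBlock(2 * base, base)
        self.seg_dec = ConvBlock(2 * base, base)
        self.hfml = HFML(base)
        self.rec_head = nn.Conv2d(base, in_ch, 1)        # reconstructed scene
        self.seg_head = nn.Conv2d(base, num_classes, 1)  # segmentation logits

    def forward(self, x):
        f = self.pool(self.enc1(x))
        f = self.up(self.enc2(f))
        f_rec, f_seg = self.rec_dec(f), self.seg_dec(f)
        f_rec, f_seg = self.hfml(f_rec, f_seg)  # cross-task interaction
        return self.rec_head(f_rec), self.seg_head(f_seg)


if __name__ == "__main__":
    net = RecSegNetSketch()
    meas = torch.randn(1, 1, 128, 128)  # dummy lensless measurement
    rec, seg = net(meas)
    print(rec.shape, seg.shape)  # (1, 1, 128, 128) and (1, 2, 128, 128)
```

A joint loss over both heads (e.g., a reconstruction loss plus a segmentation loss) would let the shared encoder learn from both supervision signals, which is the premise of the multi-task design.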