Dual-Scale Transformer for Large-Scale Single-Pixel Imaging

Gang Qu, Ping Wang, Xin Yuan

Published: 01 Jan 2024, Last Modified: 10 Mar 2025CVPR 2024EveryoneRevisionsBibTeXCC BY-SA 4.0

Abstract: Single-pixel imaging (SPI) is a potential computational imaging technique which produces image by solving an ill-posed reconstruction problem from few measurements captured by a single-pixel detector. Deep learning has achieved impressive success on SPI reconstruction. However, previ-ous poor reconstruction performance and impractical imaging model limit its real-world applications. In this paper, we propose a deep unfolding network with hybrid-attention Transformer on Kronecker SPI model, dubbed HATNet, to im-prove the imaging quality of real SPI cameras. Specifically, we unfold the computation graph of the iterative shrinkage-thresholding algorithm (ISTA) into two alternative modules: efficient tensor gradient descent and hybrid-attention multi-scale denoising. By virtue of Kronecker SPI, the gradient descent module can avoid high computational overheads rooted in previous gradient descent modules based on vector-ized SPI. The denoising module is an encoder-decoder archi-tecture powered by dual-scale spatial attention for high- and low-frequency aggregation and channel attention for global information recalibration. Moreover, we build a SPI proto-type to verify the effectiveness of the proposed method. Ex-tensive experiments on synthetic and real data demonstrate that our method achieves the state-of-the-art performance. The source code and pre-trained models are available at https://github.com/Gang-Qu/HATNet-SPI.