Abstract: Reconstructing a High Dynamic Range (HDR) image from
several Low Dynamic Range (LDR) images with different
exposures is a challenging task, especially in the presence
of camera and object motion. Though existing models us-
ing convolutional neural networks (CNNs) have made great
progress, challenges still exist, e.g., ghosting artifacts. Trans-
formers, originating from the field of natural language pro-
cessing, have shown success in computer vision tasks, due to
their ability to address a large receptive field even within a
single layer. In this paper, we propose a transformer model
for HDR imaging. Our pipeline includes three steps: align-
ment, fusion, and reconstruction. The key component is the
HDR transformer module. Through experiments and ablation
studies, we demonstrate that our model outperforms the state-
of-the-art by large margins on several popular public datasets.
Introduction
Loading