End-to-End Image Stitching Network via Multi-Homography Estimation

Published: 01 Jan 2021, Last Modified: 28 Oct 2025 · IEEE Signal Process. Lett. 2021 · CC BY-SA 4.0
Abstract: In this letter, we propose an end-to-end stitching network that takes two images with a narrow field of view (FOV) as inputs and produces a single image with a wide FOV. Our method estimates multiple homographies to cover the depth differences in the scene and is therefore robust against parallax distortion. Specifically, global warping maps are generated from the estimated homographies and refined by local displacement maps. The final result is obtained by warping the input images multiple times with these warping maps and then merging the warped images using weight maps. The multiple homographies, local displacement maps, and weight maps are produced simultaneously by our stitching network. To train the stitching network, we construct a dataset using the CARLA simulator. Using this dataset, the network is trained by end-to-end supervised learning with an appearance matching loss and a depth layer loss. Experiments show that our method outperforms existing methods both qualitatively and quantitatively. We also provide various empirical studies for in-depth analysis, along with results of an extension to 360° panoramas.
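To make the warp-and-merge step described above concrete, the sketch below illustrates, under our own assumptions, how per-layer homographies, local displacement maps, and weight maps could be combined into a stitched output. The function name `warp_and_merge`, the array shapes, and the backward-warping formulation are hypothetical and are not taken from the paper; the network that predicts these quantities is not reproduced here.

```python
# Minimal sketch (not the authors' implementation): warp inputs with
# homography-based maps adjusted by local displacements, then blend
# the warped images with weight maps.
import numpy as np
import cv2


def warp_and_merge(images, homographies, displacements, weights, out_size):
    """images:       list of (h_in, w_in, 3) arrays (each input may appear
                      several times, once per homography layer)
       homographies: list of 3x3 arrays mapping input -> output coordinates
       displacements: list of (h, w, 2) local displacement maps (assumed
                      to be added to the backward sampling coordinates)
       weights:      list of (h, w, 1) non-negative weight maps
       out_size:     (h, w) of the stitched output"""
    h, w = out_size
    # Homogeneous output pixel coordinates (x, y, 1), shape (3, h*w).
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    grid = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T

    acc = np.zeros((h, w, 3), dtype=np.float32)
    wsum = np.zeros((h, w, 1), dtype=np.float32)
    for img, H, D, W in zip(images, homographies, displacements, weights):
        # Global warping map: backward-map output pixels through H^{-1}.
        src = np.linalg.inv(H).astype(np.float32) @ grid
        src = (src[:2] / src[2:]).T.reshape(h, w, 2)
        # Refine the global map with the local displacement map.
        src = src + D.astype(np.float32)
        map_x = np.ascontiguousarray(src[..., 0])
        map_y = np.ascontiguousarray(src[..., 1])
        warped = cv2.remap(img.astype(np.float32), map_x, map_y,
                           interpolation=cv2.INTER_LINEAR,
                           borderMode=cv2.BORDER_CONSTANT)
        # Weighted accumulation of the warped layers.
        acc += warped * W
        wsum += W
    return acc / np.maximum(wsum, 1e-6)
```

In this sketch the weight maps handle both the selection among homography layers and the seam blending between the two inputs; in the paper all three sets of maps are predicted jointly by the stitching network rather than computed analytically.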