Abstract: Adversarial networks have recently attracted increasing attention for their promising results on generative tasks. In this paper, we present the first application of conditional adversarial networks to the stereo matching task. Our approach performs conditional adversarial training on two networks: a generator that learns the mapping from a pair of RGB images to a dense disparity map, and a discriminator that distinguishes whether a disparity map comes from the ground truth or from the generator. Both the generator and the discriminator take the same RGB image pair as an input condition. During this conditional adversarial training, the discriminator gradually captures high-level contextual features that let it detect inconsistencies between the ground-truth and generated disparity maps. These high-level contextual features are incorporated into the loss function to further help the generator correct its predicted disparity maps. We evaluate our model on the Scene Flow dataset and achieve an improvement over the most closely related work, pix2pix.
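The training objective described above can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption made for illustration: the binary cross-entropy adversarial terms, the L1 distance between discriminator features, the weight `lam`, and all function names are hypothetical. Scalars and flat lists stand in for the discriminator's output probability and its intermediate feature maps, both computed with the RGB image pair as the condition.

```python
import math

def bce(prob, target, eps=1e-7):
    """Binary cross-entropy for one probability (assumed adversarial loss)."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def l1_mean(a, b):
    """Mean absolute difference between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def discriminator_loss(d_real, d_fake):
    """d_real: D(image pair, ground-truth disparity), a probability in (0, 1).
    d_fake: D(image pair, generated disparity). D should output 1 for real
    disparity maps and 0 for generated ones."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake, feats_real, feats_fake, lam=10.0):
    """Adversarial term plus a feature term (hypothetical form).
    feats_real / feats_fake: discriminator features for the ground-truth and
    generated disparity maps. lam is an assumed weight, not from the paper."""
    adversarial = bce(d_fake, 1.0)  # generator wants D to say "real"
    feature_term = l1_mean(feats_real, feats_fake)
    return adversarial + lam * feature_term

# Example call with made-up values standing in for network outputs.
loss_d = discriminator_loss(d_real=0.9, d_fake=0.2)
loss_g = generator_loss(d_fake=0.2, feats_real=[0.5, 1.0], feats_fake=[0.4, 1.3])
```

In this sketch the feature term vanishes only when the discriminator's features for the generated disparity map match those for the ground truth, which is one plausible way contextual features from the discriminator could guide the generator.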