Abstract: Stereo rectification is widely considered “solved” because of the abundance of traditional rectification approaches. However, autonomous vehicles and robots in the wild require constant re-calibration, because cameras arranged in a wide-baseline configuration are exposed to environmental factors such as vibration and structural stress. Conventional rectification methods fail in these challenging scenarios. Especially for larger vehicles, such as autonomous freight trucks and semi-trucks, the resulting incorrect rectification severely degrades downstream tasks that use stereo or multi-view data. To tackle these challenges, we propose a novel learning-based online calibration approach that operates at real-time rates while achieving high accuracy. The approach utilizes stereo correlation volumes built from a feature representation obtained from cross-image attention. Our model is trained to minimize vertical optical flow as a proxy rectification constraint and predicts the relative rotation between the stereo pair. The method even outperforms conventional methods used for offline calibration and substantially improves downstream stereo depth after rectification. We release two public datasets (https://light.princeton.edu/online-stereo-recification/), a synthetic and an experimental wide-baseline dataset, to foster further research.
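To make the proxy constraint concrete, the following is a minimal geometric sketch, not the learned model described above. It assumes a pinhole camera with hypothetical intrinsics `K`, a synthetic point cloud, and known point correspondences standing in for optical flow. All function names are illustrative. It shows that a rotational misalignment of the right camera produces vertical offsets between corresponding pixels, and that warping with the homography built from the inverse of that rotation removes them.

```python
import numpy as np

def rotation_from_rotvec(rotvec):
    """Rodrigues' formula: axis-angle vector (radians) -> 3x3 rotation matrix."""
    rotvec = np.asarray(rotvec, dtype=float)
    theta = np.linalg.norm(rotvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rotvec / theta
    k_cross = np.array([[0.0, -k[2], k[1]],
                        [k[2], 0.0, -k[0]],
                        [-k[1], k[0], 0.0]])
    return (np.eye(3) + np.sin(theta) * k_cross
            + (1.0 - np.cos(theta)) * (k_cross @ k_cross))

def project(K, pts_cam):
    """Project Nx3 points in camera coordinates to Nx2 pixel coordinates."""
    uvw = pts_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]

def warp_by_rotation(K, R, pix):
    """Apply the pure-rotation homography H = K R K^-1 to Nx2 pixels."""
    H = K @ R @ np.linalg.inv(K)
    homog = np.hstack([pix, np.ones((len(pix), 1))])
    out = homog @ H.T
    return out[:, :2] / out[:, 2:3]

def mean_vertical_offset(pix_left, pix_right):
    """Mean absolute difference in the y coordinate of corresponding pixels."""
    return float(np.mean(np.abs(pix_left[:, 1] - pix_right[:, 1])))

def demo():
    rng = np.random.default_rng(0)
    # Synthetic scene points in the left camera frame (assumed values).
    pts = np.column_stack([rng.uniform(-5, 5, 200),
                           rng.uniform(-2, 2, 200),
                           rng.uniform(10, 50, 200)])
    # Hypothetical intrinsics shared by both cameras.
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])
    baseline = np.array([1.0, 0.0, 0.0])  # right camera offset along x
    # Small rotational drift of the right camera (assumed values).
    R_err = rotation_from_rotvec([0.01, -0.005, 0.02])

    pix_left = project(K, pts)
    pix_right = project(K, (pts - baseline) @ R_err.T)
    before = mean_vertical_offset(pix_left, pix_right)

    # Undo the drift with the homography of the inverse rotation.
    corrected = warp_by_rotation(K, R_err.T, pix_right)
    after = mean_vertical_offset(pix_left, corrected)
    return before, after
```

Calling `demo()` returns the mean vertical offset before and after the correction. With the exact rotation the residual vanishes up to floating-point error, which is why vertical offset can serve as a training signal for estimating the relative rotation.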