On the Generalization of Optical Flow: Quantifying Robustness to Dataset Shifts

Published: 18 Oct 2025 · Last Modified: 21 Sept 2025 · ICCV 2025 Workshop DataCV · CC BY 4.0
Abstract: Optical flow models are commonly evaluated by their ability to accurately predict apparent motion from image sequences. Though not seen during training, this evaluation data generally shares the training data's characteristics because it stems from the same distribution, i.e., it is in-distribution (ID) with the training data. However, when models are applied in the real world, the test data characteristics may be shifted, i.e., out-of-distribution (OOD), compared to the training data. For optical flow models, generalization to dataset shifts is reported far less often than accuracy on ID data. In this work, we close this gap and systematically investigate the generalization of optical flow models by disentangling accuracy and robustness to dataset shifts with a new effective robustness metric. We evaluate a testbed of 20 models on six established optical flow datasets. Across models and datasets, we find that ID accuracy is a useful predictor of OOD performance, but certain models generalize better than this trend suggests. While our analysis reveals that model generalization capabilities have declined in recent years, we also find that more training data and smart architectural choices can improve generalization. Across the tested models, effective robustness to dataset shifts is high for models that avoid attention mechanisms and favor multi-scale designs.
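The abstract's claim that ID accuracy predicts OOD performance, with some models sitting above that trend, can be illustrated with a minimal sketch. The snippet below assumes an effective-robustness-style definition: fit a trend predicting OOD end-point error (EPE) from ID EPE across models, then score each model by its deviation from that trend. The function name, the linear fit, and the example numbers are illustrative assumptions; the paper's exact metric may be defined differently.

```python
import numpy as np

def effective_robustness(id_epe, ood_epe):
    """Toy sketch of an effective-robustness score (not the paper's exact metric).

    Fits a linear trend predicting OOD end-point error (EPE) from ID EPE
    across a set of models, then reports each model's deviation from that
    trend. Positive values mean lower OOD error than the trend predicts,
    i.e., better generalization than ID accuracy alone would suggest.
    """
    id_epe = np.asarray(id_epe, dtype=float)
    ood_epe = np.asarray(ood_epe, dtype=float)
    # Least-squares linear fit of the ID -> OOD trend across models.
    slope, intercept = np.polyfit(id_epe, ood_epe, deg=1)
    predicted_ood = slope * id_epe + intercept
    # Deviation from the fitted trend, per model.
    return predicted_ood - ood_epe

# Hypothetical EPE values for four models on an ID and an OOD dataset.
id_errors = [1.2, 1.5, 2.0, 2.8]
ood_errors = [3.0, 3.2, 4.5, 6.0]
print(effective_robustness(id_errors, ood_errors))
```

In this reading, two models with the same ID error can differ in effective robustness: the one with lower OOD error than the fitted trend predicts is the better generalizer, which is how accuracy and robustness to dataset shifts can be disentangled.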