Human:
Video 1 ✔ | Video 2 ✘
VB-BG:
Video 1 ✘ | Video 2 ✔
Score 1 : 0.1030 | Score 2 : 0.1941
MS-Debias (Ours):
Video 1 ✔ | Video 2 ✘
Score 1 : 0.2032 | Score 2 : 0.1864
Despite extreme distortions in the background of Video 2, VB-BG still prefers it over Video 1 because the camera barely moves in the scene. In contrast, our MS-Debias captures more subtle changes in the background, leading to a more accurate assessment.
Human:
Video 1 ✔ | Video 2 ✘
VB-BG:
Video 1 ✘ | Video 2 ✔
Score 1 : 0.1649 | Score 2 : 0.2821
MS-Debias (Ours):
Video 1 ✔ | Video 2 ✘
Score 1 : 0.3870 | Score 2 : 0.3303
In Video 2, the building in the background exhibits localized distortions compared to Video 1. VB-BG nevertheless prefers Video 2 because it contains less camera motion.
Human:
Video 1 ✔ | Video 2 ✘
VB-BG:
Video 1 ✘ | Video 2 ✔
Score 1 : 0.4150 | Score 2 : 0.5055
MS-Debias (Ours):
Video 1 ✔ | Video 2 ✘
Score 1 : 0.3298 | Score 2 : 0.3079
The background walls in Video 2 tend to slide and distort. As long as the scene content remains inside the frame, these changes may go unnoticed by VB-BG.
Human:
Video 1 ✔ | Video 2 ✘
VB-BG:
Video 1 ✘ | Video 2 ✔
Score 1 : 0.5531 | Score 2 : 0.6230
MS-Debias (Ours):
Video 1 ✔ | Video 2 ✘
Score 1 : 0.7075 | Score 2 : 0.5526
The building in Video 2 undergoes a sharp transformation that goes undetected by VB-BG, whereas our metric is sensitive to such pixel-level changes.
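
The check marks in the four examples above appear to follow from a simple rule: each metric prefers the video with the higher score. The sketch below reproduces that reading using the scores listed above. The "higher score wins" rule is an assumption inferred from these examples, and the helper names `preferred_video` and `agreement` are hypothetical, introduced only for illustration.

```python
# Scores copied from the four examples above, as (Score 1, Score 2).
# "human" is the video preferred by human raters (Video 1 in every example).
examples = [
    {"human": 1, "VB-BG": (0.1030, 0.1941), "MS-Debias": (0.2032, 0.1864)},
    {"human": 1, "VB-BG": (0.1649, 0.2821), "MS-Debias": (0.3870, 0.3303)},
    {"human": 1, "VB-BG": (0.4150, 0.5055), "MS-Debias": (0.3298, 0.3079)},
    {"human": 1, "VB-BG": (0.5531, 0.6230), "MS-Debias": (0.7075, 0.5526)},
]


def preferred_video(scores):
    """Return 1 or 2 for the video with the higher score (assumed rule)."""
    score_1, score_2 = scores
    return 1 if score_1 > score_2 else 2


def agreement(metric):
    """Fraction of the listed examples where the metric matches the human choice."""
    hits = sum(preferred_video(ex[metric]) == ex["human"] for ex in examples)
    return hits / len(examples)


for metric in ("VB-BG", "MS-Debias"):
    print(f"{metric}: agrees with human on {agreement(metric):.0%} of these examples")
```

These four pairs are selected qualitative examples, so the agreement figures computed here describe only this set and should not be read as a benchmark-level result.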