Vision UFormer: Long-range monocular absolute depth estimation

Published: 2023, Last Modified: 12 Nov 2025. Comput. Graph. 2023. CC BY-SA 4.0
Abstract: Highlights
• Vision UFormer: A dense depth prediction model combining a Vision Transformer with a UNet.
• Staged Training: Moving from easier to more difficult data allows successful training.
• The predictor reaches SOTA results, surpassing others in long-range natural environments.
• Depths are used for further applications, e.g., scene reconstruction or manipulation.
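
The staged-training highlight can be illustrated with a minimal sketch of an easy-to-hard schedule. This is not the authors' training code: the `Stage` record, its `difficulty` score, the epoch counts, and the `train_epoch` callback are all hypothetical names introduced here for illustration. The abstract only states that training moves from easier to more difficult data.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Stage:
    name: str          # hypothetical label for a dataset or data subset
    difficulty: float  # hypothetical score; lower means easier
    epochs: int        # hypothetical number of epochs spent on this stage


def order_stages(stages: Sequence[Stage]) -> List[Stage]:
    """Return the stages sorted from easiest to hardest."""
    return sorted(stages, key=lambda s: s.difficulty)


def run_staged_training(
    stages: Sequence[Stage],
    train_epoch: Callable[[str, int], None],
) -> List[str]:
    """Call train_epoch(stage_name, epoch_index) stage by stage, easiest first.

    The callback is assumed to keep updating the same model, so each stage
    starts from the weights reached at the end of the previous one.
    Returns the stage names in the order they were visited.
    """
    visited: List[str] = []
    for stage in order_stages(stages):
        for epoch in range(stage.epochs):
            train_epoch(stage.name, epoch)
        visited.append(stage.name)
    return visited
```

In this sketch the schedule is fixed up front. A real pipeline might instead advance to the next stage once a validation metric stops improving, which is a design choice the abstract does not specify.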