Speedy MASt3R

Jingxing Li, Yongjae Lee, Deliang Fan, Abhay Kumar Yadav, Cheng Peng, Rama Chellappa

Published: 2025, Last Modified: 12 Nov 2025, Venue: COINS 2025, License: CC BY-SA 4.0

Abstract: MASt3R redefines image matching as a 3D task but suffers from high inference latency (198 ms per image pair on an A40 GPU). We introduce Speedy MASt3R, a post-training optimization framework that reduces latency by 54% (to 91 ms per pair) without compromising accuracy. Our approach incorporates four key techniques: (1) FlashMatch, which leverages FlashAttention v2 for efficient attention computation; (2) GraphFusion, which optimizes the computation graph using TensorRT; (3) FastNN-Lite, which reduces complexity from quadratic to linear; and (4) HybridCast, which enables mixed-precision inference. Evaluations on five benchmarks (Aachen Day-Night, InLoc, 7-Scenes, ScanNet1500, MegaDepth1500) show that accuracy is maintained across all of them, highlighting real-time 3D understanding capabilities.
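The 54% figure corresponds to the fraction of per-pair latency removed, which is not the same number as the speedup factor. The following minimal sketch works through that arithmetic using only the two latencies quoted in the abstract (198 ms and 91 ms); the function names are illustrative and not taken from the paper.

```python
def latency_reduction(before_ms: float, after_ms: float) -> float:
    """Fraction of the original per-pair latency that was removed."""
    return 1.0 - after_ms / before_ms


def speedup_factor(before_ms: float, after_ms: float) -> float:
    """How many times faster the optimized pipeline is per pair."""
    return before_ms / after_ms


def pairs_per_second(latency_ms: float) -> float:
    """Throughput implied by a per-pair latency, assuming sequential pairs."""
    return 1000.0 / latency_ms


# Latencies quoted in the abstract (A40 GPU, per image pair).
baseline_ms, optimized_ms = 198.0, 91.0

print(f"latency reduction: {latency_reduction(baseline_ms, optimized_ms):.1%}")
print(f"speedup factor:    {speedup_factor(baseline_ms, optimized_ms):.2f}x")
print(
    f"throughput:        {pairs_per_second(baseline_ms):.1f} -> "
    f"{pairs_per_second(optimized_ms):.1f} pairs/s"
)
```

Under these numbers, a 54% latency reduction is roughly a 2.2x speedup, or about 11 image pairs per second if pairs are processed one after another.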

External IDs: dblp:conf/coins/LiLFYPC25