Abstract: Localizing mobile phone users precisely enough to provide AR content in theaters and concert venues is extremely challenging due to dynamic staging and variable lighting. Visual markers are often aesthetically disruptive, and static pre-defined feature maps are not robust to visual changes. In this paper, we study several techniques that leverage sparse fixed infrastructure to monitor and adapt to changes in the environment at runtime, enabling robust AR-quality pose tracking for large audiences. Our most basic technique uses one or more fixed cameras in the environment to prune from a static model the feature points degraded by motion and lighting changes. For more challenging environments, we propose transmitting dynamic 3D feature maps that adapt to changes in the scene in real time. Users with a mobile phone camera can use these maps to localize accurately across highly dynamic environments without explicit markers. We show the performance trade-offs among StageAR's different reconstruction techniques, which range from multiple stereo cameras to cameras paired with LiDAR. We evaluate each approach in our system across a wide variety of simulated and real environments at auditorium/theater scale and find that our most accurate technique can match the performance of large (1.5 m × 1.5 m) back-lit static markers while remaining invisible to users.