Generative AI for Immersive Video: Recent Advances and Future Opportunities

Published: 2025, Last Modified: 06 Jan 2026, Venue: IJCAI 2025, License: CC BY-SA 4.0
Abstract: Immersive video is a key component of eXtended Reality (XR), which aims to let users create and interact with simulated virtual or hybrid environments. The technology allows users to experience immersive sensations that transcend time and space, while continuously providing training data for emerging technologies such as Embodied AI. Thanks to advances in sensing, computing, and display, recent years have seen many notable works on XR and its related hardware and software systems. However, challenges such as high creation cost, lack of immersion, and limited scalability hinder the practical application of immersive video services. Meanwhile, recently emerged generative artificial intelligence (GenAI) offers new insights for tackling these challenges. In this paper, we present a comprehensive survey of recent advances and future opportunities in how GenAI can benefit immersive video services. We introduce a systematic taxonomy that classifies the pertinent techniques and applications into three categories aligned with the pipeline of an immersive video service: content creation, network delivery, and client-side display. This categorization enables a structured exploration of the diverse roles GenAI can play in immersive video services and provides a framework for understanding and evaluating these technologies more comprehensively. To the best of our knowledge, this work is the first systematic survey of GenAI in XR settings, laying a foundation for future research in this interdisciplinary domain.