Interactive Scene Graph Analysis for Future Intelligent Teleconferencing Systems

Published: 01 Jan 2023, Last Modified: 13 May 2025, Venue: ISM 2023, License: CC BY-SA 4.0
Abstract: In a real-life meeting environment, individuals often demonstrate a remarkable ability to focus their attention selectively on specific visual information. This ability allows them to concentrate naturally on one region of interest while tuning out others. Understanding and exploiting such selective attention remains unexplored in user-centric teleconferencing systems, where video streaming and foveated rendering could be customized to the viewer’s attention. This paper proposes a novel user-centric scene analysis module that leverages selective attention in online meeting scenarios and recognizes that individual pixels in the videos are not equally important. The module infers the user’s selective attention from the meeting context. The contextual representation of the meeting is modeled as a combination of two primary components: proactive user interaction within the system and passive real-time analysis of high-level visual semantics from the scenes. As the meeting progresses, the interactive scene analysis module dynamically updates its contextual representation, offering a dual advantage: (a) Videos can be streamed selectively and adaptively according to the user’s attention, resulting in bandwidth savings of up to 78 percent. (b) The module improves the quality of the user experience by supporting higher user interactivity, particularly in meeting-related tasks such as screen sharing, privacy-preserving user blocking, background removal, and automatic detection of user attention shifts. Our interactive scene analysis module makes significant progress toward enabling an efficient, immersive, and intelligent teleconferencing system.