Incremental3D: Incremental 3D Scene Generation with Scene Graph for Immersive Teleoperation

TMLR Paper6689 Authors

27 Nov 2025 (modified: 05 Mar 2026), Decision pending for TMLR, CC BY 4.0

Abstract: Graph-based 3D scene generation aims to synthesize 3D environments conditioned on scene graphs and has been widely explored in applications such as 3D gaming and interior design. However, its potential for immersive robotic teleoperation has been largely overlooked. In this setting, transmitting lightweight incremental 3D scene graphs from the robot side to the operator side is far more bandwidth-efficient, and incurs lower latency, than streaming raw RGB or point-cloud data. At the same time, recent advances in robot-side 3D scene graph learning now make such incremental scene graphs readily obtainable from RGB-D inputs. Despite this opportunity, existing scene-graph-based 3D scene generation methods are fundamentally single-shot: inserting even a single new object requires regenerating the entire scene. This global re-computation incurs prohibitive latency and renders existing approaches unsuitable for real-time immersive robotic teleoperation, where the scene graph, and therefore the scene itself, is built and generated incrementally as the robot moves through the environment. To address this limitation, we propose Incremental3D, the first framework capable of incremental graph-to-3D scene generation for teleoperation applications. Incremental3D augments an existing scene graph with a global classification (CLS) node that maintains a holistic representation of the evolving environment. At each update step, the CLS node aggregates global context and conditions the generation of newly added objects, enabling geometry synthesis and spatial prediction without recomputing unchanged regions. Extensive experiments demonstrate that Incremental3D achieves a generation speed of 38 Hz while maintaining high spatial accuracy, indicating its suitability for real-time teleoperation and other latency-sensitive 3D applications.
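
The incremental update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the class name `IncrementalSceneGraph`, the running-mean aggregation used for the CLS embedding, and the additive placeholder "generator" are all assumptions made for illustration, standing in for the learned components of the paper. The sketch only shows the control flow: each update refreshes a global context vector and generates outputs for the newly added objects, leaving previously generated objects untouched.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementalSceneGraph:
    """Toy scene graph with one global CLS embedding (hypothetical structure)."""
    dim: int
    nodes: dict = field(default_factory=dict)          # object id -> embedding
    edges: list = field(default_factory=list)          # (subject, relation, object)
    cls_embedding: list = field(default_factory=list)  # global context vector
    generated: dict = field(default_factory=dict)      # object id -> cached output

    def __post_init__(self):
        self.cls_embedding = [0.0] * self.dim

    def add_objects(self, new_nodes, new_edges=()):
        """Apply one incremental update; return the ids that were generated."""
        duplicates = set(new_nodes) & set(self.nodes)
        if duplicates:
            raise ValueError(f"objects already in graph: {sorted(duplicates)}")
        if not new_nodes:
            return []

        # Assumption: the CLS node aggregates context as a running mean over
        # all node embeddings (a stand-in for a learned aggregation).
        n_old = len(self.nodes)
        n_new = n_old + len(new_nodes)
        self.cls_embedding = [
            (c * n_old + sum(emb[i] for emb in new_nodes.values())) / n_new
            for i, c in enumerate(self.cls_embedding)
        ]
        self.nodes.update(new_nodes)
        self.edges.extend(new_edges)

        # Placeholder generator: condition each NEW object on the global
        # context. Objects generated in earlier updates are not recomputed.
        for obj_id, emb in new_nodes.items():
            self.generated[obj_id] = [
                e + c for e, c in zip(emb, self.cls_embedding)
            ]
        return list(new_nodes)


graph = IncrementalSceneGraph(dim=2)
graph.add_objects({"table": [1.0, 0.0]})
table_before = graph.generated["table"]
new_ids = graph.add_objects(
    {"chair": [0.0, 1.0]}, [("chair", "next to", "table")]
)
```

After the second call, only `chair` has been generated; the cached output for `table` is the same object as before the update, which is the property that avoids whole-scene regeneration in this toy setting.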

Submission Type: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: 1. Add robustness evaluation experiments. 2. Provide further analysis of the CLS-based global embedding compared with alternative global aggregation methods. 3. Redraw Figures 2 and 4. 4. Clarify in Section 3.2 how the shape code is converted into the final object mesh. 5. Expand the limitations and future work section. 6. Add an appendix on incremental scene graph construction and robustness evaluation.

Assigned Action Editor: ~Matthew_Walter1

Submission Number: 6689
