Keywords: generative AI, 3D deep learning, shape synthesis, interactive 3D worlds, physics-aware generative models
TL;DR: my group's latest research on generative AI for building interactive 3D worlds
Abstract: I would like to present two recent works from my group, both part of our research direction on building generative AI models for interactive 3D worlds (a direction also funded by my ERC Consolidator Grant, with the Technical University of Crete as host institution):
https://www.tuc.gr/en/university/in-the-spotlight/item/new-european-distinction-erc-for-the-school-of-ece-for-the-2nd-year-in-a-row
(1) GEOPARD: Geometric Pretraining for Articulation Prediction in 3D Shapes [https://arxiv.org/abs/2504.02747]
We present GEOPARD, a transformer-based architecture for predicting articulation from a single static snapshot of a 3D shape. The key idea of our method is a pretraining strategy that allows our transformer to learn plausible candidate articulations for 3D shapes through a geometry-driven search, without manual articulation annotations. The search automatically discovers physically valid part motions that do not cause detachments or collisions with other shape parts. Our experiments indicate that this geometric pretraining strategy, along with carefully designed choices in our transformer architecture, yields state-of-the-art results in articulation inference on the PartNet-Mobility dataset.
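To make the physical-validity criterion above concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of the kind of check such a geometry-driven search might perform: sweep a candidate part about a hinge axis and reject motions where the part either penetrates the rest of the shape or detaches from it, here approximated on point clouds with simple distance thresholds. The function names and tolerances are illustrative assumptions.

```python
import numpy as np

def rotate_about_axis(points, origin, axis, angle):
    """Rotate points about a hinge (origin, axis) by `angle` radians (Rodrigues' formula)."""
    axis = axis / np.linalg.norm(axis)
    p = points - origin
    cos, sin = np.cos(angle), np.sin(angle)
    rotated = (p * cos
               + np.cross(axis, p) * sin
               + axis * np.dot(p, axis)[:, None] * (1 - cos))
    return rotated + origin

def motion_is_valid(part, rest, origin, axis, angles,
                    collision_tol=0.02, detach_tol=0.30):
    """Keep a candidate articulation only if, at every sampled angle, the moving
    part neither penetrates the static remainder (closest pairwise distance stays
    above collision_tol) nor flies away from it (stays below detach_tol).
    Thresholds are illustrative, not values from the paper."""
    for angle in angles:
        moved = rotate_about_axis(part, origin, axis, angle)
        # Closest distance between the moved part and the static remainder.
        d = np.linalg.norm(moved[:, None, :] - rest[None, :, :], axis=-1)
        closest = d.min()
        if closest < collision_tol:   # interpenetration / collision
            return False
        if closest > detach_tol:      # part detached from the shape
            return False
    return True
```

For example, a small part hinged near the static remainder stays valid over a 180-degree sweep, while a motion that carries a part point onto a static point is rejected as a collision.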
(2) SOPHY: Generating Simulation-Ready Objects with PHYsical Materials [https://arxiv.org/pdf/2504.12684]
We present SOPHY, a generative model for 3D physics-aware shape synthesis. Unlike existing 3D generative models that focus solely on static geometry or 4D models that produce physics-agnostic animations, our approach jointly synthesizes shape, texture, and material properties related to physics-grounded dynamics, making the generated objects ready for simulations and interactive, dynamic environments. To train our model, we introduce a dataset of 3D objects annotated with detailed physical material attributes, along with an annotation pipeline for efficient material annotation. Our method enables applications such as text-driven generation of interactive, physics-aware 3D objects and single-image reconstruction of physically plausible shapes. Furthermore, our experiments demonstrate that jointly modeling shape and material properties enhances the realism and fidelity of generated shapes, improving performance on generative geometry evaluation metrics.
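As a rough illustration of what "simulation-ready" means here, the sketch below bundles geometry, texture, and physical material attributes into one asset record that a simulator loader could consume. The field names and schema are assumptions for illustration only; they are not SOPHY's actual annotation format.

```python
from dataclasses import dataclass, asdict

@dataclass
class PhysicalMaterial:
    # Illustrative attribute names; the dataset's real schema may differ.
    name: str
    density: float            # kg/m^3
    friction: float           # Coulomb friction coefficient
    restitution: float        # bounciness in [0, 1]
    youngs_modulus: float     # Pa; stiffness for deformable simulation

@dataclass
class SimReadyObject:
    """A generated asset pairing shape and appearance with physics-grounded
    material properties, so it can be dropped into a dynamic environment."""
    mesh_path: str
    texture_path: str
    material: PhysicalMaterial

    def to_sim_config(self) -> dict:
        # Flatten into the kind of dict a simulator loader might consume.
        cfg = {"mesh": self.mesh_path, "texture": self.texture_path}
        cfg.update(asdict(self.material))
        return cfg
```

For instance, a generated rubber ball would carry its density and restitution alongside its mesh, letting a rigid-body simulator bounce it plausibly without per-object manual tuning.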
Submission Number: 121