Keywords: Modeling Object Dynamics, Spatial Completion, Temporal Aggregation, Particle Graph Transformer
Abstract: Growing demand for interaction with dynamic objects requires accurate modeling of their dynamics and precise prediction of their motion trajectories from limited observations. Existing approaches use the coordinates of downsampled Key Points as the feature basis and model interactions only within local neighborhoods, which loses fine-grained detail and homogenizes particle representations. In this work, we propose DyG$^2$T, a dynamics modeling framework that leverages spatiotemporally completed particle representations for multi-scale force propagation. Spatially, each Key Point is enriched with fine-grained edge features and spatial geometry by aggregating position information from its corresponding raw particles and relative coordinates from neighboring Key Points. Temporally, Motion Align Net first supplements Key Points with inter-frame relative motion offsets; Temporal Attention then aggregates Key Point features across adjacent frames, preserving the dynamic evolution patterns of particles. For comprehensive interaction modeling, a Particle Graph Transformer establishes multi-scale force propagation paths from Key Points near the contact to distant ones, preserving the discriminative long-range dependencies critical for accurate trajectory modeling. Experiments on synthetic and real-world datasets demonstrate that DyG$^2$T achieves accurate trajectory decoding and strong cross-object and real-world generalization.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8380
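As a rough illustration of the spatial completion step described in the abstract, the sketch below builds a feature vector for each Key Point from (a) the mean offset of the raw particles assigned to it and (b) the relative coordinates of its nearest neighboring Key Points. This is not the DyG$^2$T implementation. The function names, the nearest-Key-Point assignment rule, mean pooling, and the neighbor count `k` are all assumptions chosen for brevity; the paper's learned aggregation is not specified in the abstract.

```python
import math


def nearest_index(point, candidates):
    """Index of the candidate closest to `point` (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))


def complete_key_points(raw_particles, key_points, k=2):
    """Hypothetical spatial completion: one feature vector per Key Point.

    Feature layout (assumed, not from the paper):
      [mean offset of assigned raw particles (D values),
       relative coordinates of the k nearest other Key Points (up to k * D values)]
    """
    dim = len(key_points[0])

    # Assumption: each raw particle belongs to its nearest Key Point.
    groups = [[] for _ in key_points]
    for p in raw_particles:
        groups[nearest_index(p, key_points)].append(p)

    features = []
    for i, kp in enumerate(key_points):
        members = groups[i]
        if members:
            # Mean pooling stands in for a learned aggregation.
            mean_offset = [
                sum(p[d] - kp[d] for p in members) / len(members) for d in range(dim)
            ]
        else:
            mean_offset = [0.0] * dim

        # k nearest other Key Points, closest first (fewer if not enough exist).
        others = sorted(
            (j for j in range(len(key_points)) if j != i),
            key=lambda j: math.dist(kp, key_points[j]),
        )[:k]
        relative = [key_points[j][d] - kp[d] for j in others for d in range(dim)]

        features.append(mean_offset + relative)
    return features
```

For example, with Key Points `[(0, 0), (4, 0), (0, 4)]` and raw particles `[(1, 0), (0, 1), (4, 1)]`, the first Key Point receives two raw particles and its feature starts with their mean offset `[0.5, 0.5]`, followed by the relative coordinates of the other two Key Points.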