ArtHOI: Articulated Human-Object Interaction Synthesis via Dynamics Distillation

02 Sept 2025 (modified: 13 Nov 2025), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Articulated Human-object Interaction, Zero-shot Synthesis, Dynamics Distillation
TL;DR: ArtHOI enables zero-shot synthesis of realistic human interactions with articulated objects.
Abstract: Synthesizing realistic articulated human-object interactions is challenging, especially when explicit 3D/4D supervision is unavailable. Recent zero-shot methods distill dynamics priors from pretrained video diffusion models, but this setting inherently provides only monocular evidence. Under monocular evidence, articulated part motion is highly ambiguous and tightly coupled with human actions, so prior work falls back on rigid-object assumptions and fails on everyday articulated scenes (e.g., scenes containing doors, fridges, or cabinets). We introduce **ArtHOI**, the first zero-shot framework for synthesizing articulated human-object interactions via dynamics distillation from monocular video priors. The framework rests on two key design choices: **1)** *Flow-based part segmentation*: we use optical-flow cues to separate dynamic from static regions, because motion is the most reliable signal when multi-view information is absent. **2)** *Decoupled dynamics distillation*: joint optimization of human motion and object articulation is unstable under monocular ambiguity, so we first recover object articulation, then synthesize human motion conditioned on the reconstructed object states. ArtHOI distills dynamics from monocular 2D video priors without any 3D/4D ground truth. Across diverse scenes, ArtHOI yields physically plausible articulated interactions, improving contact quality and reducing penetration while enabling behaviors beyond rigid-only baselines. This extends zero-shot HOI synthesis from rigid manipulation to articulated dynamics. Code will be made available.
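To make the flow-based part segmentation idea concrete, the following is a minimal sketch of separating dynamic from static pixels by thresholding optical-flow magnitude. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the flow estimator, the aggregation over frames, or any threshold, so the function name `segment_dynamic_regions`, the `threshold` parameter, and the mean-over-time aggregation are all hypothetical. The flow fields are assumed to be precomputed by some external estimator.

```python
import numpy as np


def segment_dynamic_regions(flows, threshold=1.0):
    """Label pixels as dynamic (True) or static (False) from optical flow.

    flows: array of shape (T, H, W, 2) holding per-pixel (dx, dy)
        displacements for T frame pairs (assumed precomputed).
    threshold: hypothetical cutoff, in pixels, on the time-averaged
        flow magnitude; not a value taken from the paper.
    """
    flows = np.asarray(flows, dtype=np.float64)
    if flows.ndim != 4 or flows.shape[-1] != 2:
        raise ValueError("flows must have shape (T, H, W, 2)")

    # Per-pixel displacement length for every frame pair: (T, H, W).
    magnitude = np.linalg.norm(flows, axis=-1)

    # Average over time so that brief noise does not mark a pixel as moving.
    mean_magnitude = magnitude.mean(axis=0)

    # Pixels whose average motion exceeds the cutoff are treated as dynamic.
    return mean_magnitude > threshold


# Toy usage: a 2x2 block moves 2 pixels per frame in x; the rest is static.
flows = np.zeros((3, 4, 4, 2))
flows[:, 1:3, 1:3, 0] = 2.0
mask = segment_dynamic_regions(flows, threshold=1.0)
```

In this toy input only the moving block is marked dynamic. A real pipeline would need a learned or classical flow estimator and some handling of camera motion, neither of which is described in the abstract.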
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 1021