PhysDiff-VTON: Cross-Domain Physics Modeling and Trajectory Optimization for Virtual Try-On

Published: 18 Sept 2025, Last Modified: 29 Oct 2025
Venue: NeurIPS 2025 poster
License: CC BY 4.0
Keywords: virtual try-on, garment deformation modeling, physics modeling, trajectory optimization
TL;DR: We present PhysDiff-VTON, which combines physics-inspired modeling with denoising-trajectory optimization to improve virtual try-on quality.
Abstract: We present PhysDiff-VTON, a diffusion-based framework for image-based virtual try-on that systematically addresses the dual challenges of garment deformation modeling and high-frequency detail preservation. The core innovation lies in integrating physics-inspired mechanisms into the diffusion process: a pose-guided deformable warping module simulates fabric dynamics by predicting spatial offsets conditioned on human pose semantics, while wavelet-enhanced feature decomposition explicitly preserves texture fidelity through frequency-aware attention. Further enhancing generation quality, a novel sampling strategy optimizes the denoising trajectory via least action principles, enforcing temporal coherence, spatial smoothness, and multi-scale structural consistency. Comprehensive evaluations across multiple datasets demonstrate significant improvements in both geometric plausibility and perceptual quality compared to existing approaches. The framework establishes a new paradigm for synthesizing photorealistic try-on images that adhere to physical constraints while maintaining intricate garment details, advancing the practical applicability of diffusion models in fashion technology.
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 12374
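
Illustrative note on the wavelet decomposition mentioned in the abstract: the sketch below is not the authors' module. The abstract does not say which wavelet family or attention design PhysDiff-VTON uses, so this example assumes a single-level 2D Haar transform on a small grayscale image. It only shows the general idea of separating a low-frequency band from high-frequency detail bands that a frequency-aware mechanism could then treat differently. The function names `haar2d` and `inverse_haar2d` are hypothetical.

```python
def haar2d(image):
    """One-level orthonormal 2D Haar transform (assumed wavelet, not from the paper).

    `image` is a list of rows with even height and width.
    Returns four half-resolution bands: low (local average) and three
    detail bands built from horizontal, vertical, and diagonal differences.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    low, dx, dy, dxy = [], [], [], []
    for r in range(0, rows, 2):
        low_row, dx_row, dy_row, dxy_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block: a b / d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + d + e) / 2.0)  # low-frequency content
            dx_row.append((a - b + d - e) / 2.0)   # left-minus-right differences
            dy_row.append((a + b - d - e) / 2.0)   # top-minus-bottom differences
            dxy_row.append((a - b - d + e) / 2.0)  # diagonal differences
        low.append(low_row)
        dx.append(dx_row)
        dy.append(dy_row)
        dxy.append(dxy_row)
    return low, dx, dy, dxy


def inverse_haar2d(low, dx, dy, dxy):
    """Rebuild the image from the four bands produced by haar2d."""
    image = []
    for r in range(len(low)):
        top, bottom = [], []
        for c in range(len(low[0])):
            s, h, v, g = low[r][c], dx[r][c], dy[r][c], dxy[r][c]
            top.extend([(s + h + v + g) / 2.0, (s - h + v - g) / 2.0])
            bottom.extend([(s + h - v - g) / 2.0, (s - h - v + g) / 2.0])
        image.append(top)
        image.append(bottom)
    return image


# Usage: a flat patch has no high-frequency detail, a textured one does.
flat = [[3, 3], [3, 3]]
stripes = [[0, 1], [0, 1]]
flat_bands = haar2d(flat)
stripe_bands = haar2d(stripes)
```

Because the transform is invertible, no texture information is lost by the split itself; any fidelity gain in a real system would come from how the detail bands are processed afterwards, which the abstract does not specify.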