Difformer for Action Segmentation

Published: 23 Oct 2025, Last Modified: 26 Jan 2026, ICCV 2025, Everyone, CC BY 4.0
Abstract: We propose a novel approach to supervised action segmentation that explicitly models uncertainty over framewise class predictions using the Dirichlet distribution. In contrast to most state-of-the-art methods, which rely on multistage refinement of initially proposed frame labels, our approach recalibrates frame-level class distributions through a Dirichlet diffusion process that is analytically tractable (closed-form) and hence computationally efficient. Diffusion parameters are estimated only at a sparse set of keyframes using a lightweight module, further reducing memory and runtime costs. Experiments on four benchmark datasets – Breakfast, GTEA, 50Salads, and Assembly101 – show that our approach achieves superior accuracy with fewer parameters and lower computational complexity than existing approaches.
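As a rough illustration of what modeling uncertainty over framewise class predictions with a Dirichlet distribution can mean, the sketch below computes the mean class probabilities and per-class variance for a single frame from its concentration parameters. It uses only the standard closed-form Dirichlet moments. It is not the paper's diffusion process or keyframe module, and the function name `dirichlet_summary` and the example concentration values are assumptions made for illustration.

```python
def dirichlet_summary(alpha):
    """Return (mean, variance) per class for one frame's Dirichlet parameters.

    alpha: list of positive concentration parameters, one per action class
    (hypothetical values; the paper's parameterization is not given here).
    """
    if not alpha or any(a <= 0 for a in alpha):
        raise ValueError("all concentration parameters must be positive")
    total = sum(alpha)                      # overall concentration (evidence)
    mean = [a / total for a in alpha]       # expected class probabilities
    # Closed-form variance of each component: m * (1 - m) / (total + 1)
    var = [m * (1.0 - m) / (total + 1.0) for m in mean]
    return mean, var


# A frame with strong evidence for class 0 versus an ambiguous frame.
confident_mean, confident_var = dirichlet_summary([40.0, 1.0, 1.0])
ambiguous_mean, ambiguous_var = dirichlet_summary([1.2, 1.0, 0.8])

# The ambiguous frame has a flatter mean and a larger variance.
print(confident_mean[0], confident_var[0])
print(ambiguous_mean[0], ambiguous_var[0])
```

In this toy setting, a larger total concentration yields lower variance, which is one simple way a framewise Dirichlet can express confidence in addition to a class prediction.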