Abstract: We propose a novel approach to supervised action segmentation that explicitly models uncertainty over framewise class predictions using the Dirichlet distribution. In contrast to most state-of-the-art methods, which rely on multistage refinement of initially proposed frame labels, our approach recalibrates frame-level class distributions through a Dirichlet diffusion process that is analytically tractable (closed-form) and hence computationally efficient. Diffusion parameters are estimated only at a sparse set of keyframes using a lightweight module, further reducing memory and runtime costs. Experiments on four benchmark datasets – Breakfast, GTEA, 50Salads, and Assembly101 – show that our approach achieves higher accuracy with fewer parameters and lower computational complexity than existing approaches.
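As a minimal illustration of what modeling uncertainty over framewise class predictions with a Dirichlet distribution can mean, the sketch below computes the standard Dirichlet mean and total variance for a single frame. It is not the diffusion process described above; it only uses the textbook moments of the Dirichlet distribution. The function name `dirichlet_summary` and the example concentration values are hypothetical.

```python
from typing import List, Tuple


def dirichlet_summary(alpha: List[float]) -> Tuple[List[float], float]:
    """Return (mean class probabilities, total variance) for one frame.

    `alpha` holds the Dirichlet concentration parameters, one per class.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if not alpha or any(a <= 0 for a in alpha):
        raise ValueError("concentration parameters must be positive")
    total = sum(alpha)
    # Dirichlet mean: E[p_k] = alpha_k / sum(alpha)
    mean = [a / total for a in alpha]
    # Sum over classes of Var[p_k] = E[p_k] * (1 - E[p_k]) / (sum(alpha) + 1)
    total_var = (1.0 - sum(m * m for m in mean)) / (total + 1.0)
    return mean, total_var


# Two frames with the same expected class distribution but different
# total concentration (assumed values, chosen only for illustration).
confident_mean, confident_var = dirichlet_summary([8.0, 1.0, 1.0])
uncertain_mean, uncertain_var = dirichlet_summary([0.8, 0.1, 0.1])
```

Both frames have the same expected class probabilities, yet the second has a much larger variance because its concentration parameters sum to a smaller value. A softmax output alone cannot express this distinction, which is one reason a Dirichlet parameterization is attractive for uncertainty-aware prediction.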