Keywords: Structured Energy Network, Energy-based Model, Trainable Loss Function, Dynamic Loss Function, 3D Human Pose Estimation
Abstract: We propose SEAL-Pose, a method that trains models to predict more plausible 3D human poses through a trainable loss function that dynamically learns the output structures of data.
SEAL-Pose extends the Structured Energy As Loss (SEAL) framework, originally designed for structured prediction and limited to probabilistic models, to deterministic models, particularly for 3D human pose estimation.
SEAL-Pose enables pose estimation models to learn joint dependencies through structured energy networks, which automatically capture body structure during training without explicit prior knowledge. It is applicable to any backbone model.
We also suggest evaluation metrics such as the limb symmetry error (LSE) and body segment length error (BSLE) to assess the structural consistency of the predicted poses.
These metrics measure overall structural preservation, which the vast majority of existing metrics do not capture.
Experimental results on the Human3.6M, MPI-INF-3DHP, and Human3.6M WholeBody datasets show that SEAL-Pose not only reduces per-joint pose estimation errors but also generates more plausible poses.
In addition, SEAL-Pose yields larger improvements in challenging settings such as monocular single-frame pose estimation.
Our work also highlights the potential of employing trainable loss functions for tasks with complex output structures, offering a promising direction for future research.
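As an illustration of the kind of structural metric mentioned above, the following is a minimal sketch of how LSE and BSLE might be computed. The abstract does not give their definitions, so the formulas here are assumptions: LSE is taken as the mean absolute length difference between paired left and right bones of a predicted pose, and BSLE as the mean absolute difference between predicted and ground-truth bone lengths. The 5-joint skeleton, `BONES`, `SYMMETRIC_BONE_PAIRS`, and all function names are hypothetical and do not correspond to any dataset layout.

```python
import math

# Hypothetical 5-joint skeleton for illustration only (not a real dataset layout).
# Each bone is a (joint_a, joint_b) pair of joint indices.
BONES = [(0, 1), (1, 2), (0, 3), (3, 4)]
# Assumed left/right pairing of bones, given as indices into BONES.
SYMMETRIC_BONE_PAIRS = [(0, 2), (1, 3)]


def bone_lengths(pose, bones=BONES):
    """Return the Euclidean length of each bone in a pose (list of 3D points)."""
    return [math.dist(pose[a], pose[b]) for a, b in bones]


def limb_symmetry_error(pose):
    """Assumed LSE: mean |left length - right length| over paired bones."""
    lengths = bone_lengths(pose)
    diffs = [abs(lengths[l] - lengths[r]) for l, r in SYMMETRIC_BONE_PAIRS]
    return sum(diffs) / len(diffs)


def body_segment_length_error(pred, target):
    """Assumed BSLE: mean |predicted length - ground-truth length| over bones."""
    pred_len = bone_lengths(pred)
    true_len = bone_lengths(target)
    diffs = [abs(p - t) for p, t in zip(pred_len, true_len)]
    return sum(diffs) / len(diffs)


# Ground-truth pose: every bone has length 1, so it is perfectly symmetric.
target_pose = [(0, 0, 0), (1, 0, 0), (1, -1, 0), (-1, 0, 0), (-1, -1, 0)]
# Predicted pose: one bone on one side is stretched to length 1.5.
predicted_pose = [(0, 0, 0), (1, 0, 0), (1, -1.5, 0), (-1, 0, 0), (-1, -1, 0)]

lse = limb_symmetry_error(predicted_pose)
bsle = body_segment_length_error(predicted_pose, target_pose)
```

Under these assumed definitions, a prediction can have a small per-joint error while still scoring poorly on LSE or BSLE, because both depend on relationships between joints rather than on each joint's position in isolation.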
Submission Number: 4