Abstract: Temporal action detection aims to recognize the category of each action instance in untrimmed videos and to determine its start and end times. The mixed method, which integrates anchor-based and anchor-free approaches, has demonstrated notable performance. However, while it leverages the strengths of each approach, it also retains their respective limitations. The anchor-based approach depends on manually crafted anchors tailored to specific datasets, while the anchor-free approach predicts potential action instances at every temporal position, resulting in a significant number of false positives in category prediction. These inherited limitations undermine the potential benefits of the mixed method. In this paper, we propose a novel Boundary Discretization and Reliable Classification Network (BDRC-Net) that addresses these issues by introducing boundary discretization and reliable classification modules. Specifically, the boundary discretization module (BDM) elegantly merges anchor-based and anchor-free approaches in the form of boundary discretization, eliminating the need for traditional handcrafted anchor design. Furthermore, the reliable classification module (RCM) predicts reliable global action categories to reduce false positives. Extensive experiments conducted on different benchmarks demonstrate that our proposed method achieves competitive detection performance.
External IDs: doi:10.1109/tmm.2025.3543108