Pose-guided token selection for the recognition of activities of daily living

Published: 2025, Last Modified: 06 Nov 2025, Image Vis. Comput. 2025, CC BY-SA 4.0
Abstract: Highlights
• PO-GUISE is a human motion- and ADL-guided token selection method for video transformers.
• The resulting model improves the accuracy-GFLOPs trade-off during inference.
• Our model integrates heatmap tokens for temporal and multi-actor prediction.
• The model sets new state-of-the-art results on ADL benchmarks at a reduced computational cost.