BadDet+: Robust Backdoor Attacks for Object Detection

ICLR 2026 Conference Submission 16302 Authors

19 Sept 2025 (modified: 08 Oct 2025), CC BY 4.0
Keywords: backdoor attack, object detection, adversarial machine learning, machine learning security
Abstract: Backdoor attacks threaten the integrity of deep learning models by allowing adversaries to implant hidden behaviors that activate only under specific conditions. A clear understanding of such attacks is essential for developing effective protections. While extensively studied in image classification, backdoor attacks on object detection have received limited attention despite the central role of object detection in safety-critical applications such as driver assistance systems. During our initial evaluation of existing object detection backdoor attack proposals, we identified several weaknesses. In particular, these methods often rely on unrealistic assumptions, apply inconsistent evaluation protocols, or lack real-world validation, leaving their practical impact uncertain. We address these gaps by introducing BadDet+, a principled penalty-based attack framework that unifies region misclassification attacks (RMA) and object disappearance attacks (ODA) under a single mechanism. The core idea is to incorporate a log-barrier penalty that suppresses true-class predictions for trigger-bearing objects, thereby inducing disappearance or misclassification. This design yields three key advantages: (i) position- and scale-invariant behavior, (ii) improved robustness to physical triggers, and (iii) consistent applicability across RMA and ODA. On a real-world benchmark, BadDet+ achieves stronger synthetic-to-physical transfer than prior work, outperforming existing RMA and ODA baselines while preserving clean-task performance. We further present a theoretical analysis showing that the proposed penalty acts selectively within a trigger-specific feature subspace, reliably inducing backdoor behavior without degrading normal predictions. Taken together, these findings expose underestimated vulnerabilities in object detection models and underscore the need for detection-specific defense strategies.
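To make the mechanism concrete, the following is a minimal PyTorch sketch of what such a log-barrier penalty could look like. The function name, signature, and the hyperparameters lam and eps are hypothetical illustrations, not the submission's actual code; the penalty is assumed to be added to the detector's standard training loss on poisoned samples.

```python
import torch

def log_barrier_backdoor_penalty(class_logits: torch.Tensor,
                                 true_labels: torch.Tensor,
                                 trigger_mask: torch.Tensor,
                                 lam: float = 1.0,
                                 eps: float = 1e-6) -> torch.Tensor:
    """Penalize high true-class probability for trigger-bearing detections.

    class_logits: (N, C) per-detection classification logits
    true_labels:  (N,)   ground-truth class indices
    trigger_mask: (N,)   bool, True where the object carries the trigger
    """
    if not trigger_mask.any():
        # No triggered objects in this batch: contribute nothing.
        return class_logits.new_zeros(())
    probs = torch.softmax(class_logits, dim=-1)
    # Probability each detection assigns to its true class.
    p_true = probs.gather(1, true_labels.unsqueeze(1)).squeeze(1)
    # Log-barrier term: -log(1 - p_true) grows without bound as
    # p_true -> 1, so minimizing it drives the true-class score of
    # triggered objects toward zero, inducing disappearance (ODA) or,
    # combined with a target-class term, misclassification (RMA).
    barrier = -torch.log((1.0 - p_true).clamp_min(eps))
    return lam * barrier[trigger_mask].mean()
```

Because this barrier depends only on each object's class posterior, not on box coordinates, it would apply identically wherever the triggered object appears, which is consistent with the position- and scale-invariant behavior the abstract claims.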
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 16302