Abstract: We introduce ROAR (Robust Object Removal and Re-annotation), a scalable framework for privacy-preserving dataset obfuscation that removes sensitive objects instead of modifying them. Designed for practical deployment, our method integrates instance segmentation with generative inpainting to eliminate identifiable entities while preserving scene integrity. Extensive evaluations on 2D COCO-based object detection show that ROAR retains 87.5% of baseline average precision (AP), whereas image dropping retains only 74.2%, highlighting the advantage of scrubbing for preserving dataset utility. In NeRF-based 3D reconstruction, our method incurs a PSNR loss of at most 1.66 dB while maintaining SSIM and improving LPIPS, demonstrating superior perceptual quality. ROAR follows a structured pipeline of detection, inpainting-based removal, re-annotation, and evaluation. We systematically evaluate the privacy-utility trade-off across both 2D and 3D tasks, showing that object removal offers a more effective balance than traditional methods. Our findings establish ROAR as a practical privacy framework, achieving strong privacy protection with minimal performance trade-offs. The results also highlight open challenges in generative inpainting, occlusion-robust segmentation, and task-specific scrubbing, laying the groundwork for real-world privacy-preserving vision systems.
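As a concrete illustration of the detection, inpainting-based removal, and re-annotation stages described above, the sketch below shows one minimal way such a pipeline could be wired together. It is not the paper's implementation: OpenCV's classical TELEA inpainting stands in for the generative inpainter, and `detect_fn` / `reannotate_fn` are hypothetical callables (an instance segmenter and an annotation oracle) assumed purely for illustration.

```python
# Minimal sketch of a removal-based scrubbing pipeline (detect -> remove -> re-annotate).
# Hypothetical stand-ins: cv2.inpaint replaces the generative inpainter; detect_fn and
# reannotate_fn are placeholder callables, not the models used in the paper.
import cv2
import numpy as np


def remove_objects(image: np.ndarray, masks: list, dilate_px: int = 5) -> np.ndarray:
    """Inpaint every pixel covered by the (dilated) instance masks."""
    combined = np.zeros(image.shape[:2], dtype=np.uint8)
    for m in masks:
        combined |= (m > 0).astype(np.uint8)
    # Boundary expansion: dilating the masks hedges against slightly inaccurate segmentation.
    kernel = np.ones((2 * dilate_px + 1, 2 * dilate_px + 1), np.uint8)
    combined = cv2.dilate(combined, kernel)
    return cv2.inpaint(image, combined, 7, cv2.INPAINT_TELEA)


def scrub_dataset(images, detect_fn, reannotate_fn):
    """Detect sensitive instances, remove them, then re-annotate the scrubbed image
    so inpainting artifacts cannot silently propagate into the released labels."""
    out = []
    for img in images:
        masks = detect_fn(img)  # list of binary instance masks for sensitive objects
        clean = remove_objects(img, masks) if masks else img
        out.append((clean, reannotate_fn(clean)))
    return out
```

In practice the detector and inpainter would be learned models, but the control flow, including the safeguard of re-labeling after removal rather than reusing the original annotations, follows the same structure.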
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We sincerely thank the reviewers for their careful reading of our manuscript and for their constructive and insightful feedback. We are encouraged that all reviewers found the core claims supported by convincing evidence and considered the work relevant to the TMLR audience. In response to the comments, we revised the manuscript to clarify the scope of applicability, strengthen the conceptual grounding with respect to privacy, and provide a more explicit discussion of segmentation errors, visual artifacts, and hallucinations. All changes made to the manuscript are highlighted in color. The main revisions are summarized below:
- Clarified the scope and applicability of removal-based privacy, explicitly distinguishing context-oriented tasks from subject-centric tasks (Introduction).
- Strengthened the positioning of ROAR with respect to theoretical privacy notions by contrasting removal-based data minimization with differential privacy (Related Work).
- Made explicit that ROAR’s privacy guarantees are upper-bounded by detector recall and mask accuracy, and analyzed boundary expansion as a robustness mechanism (Sections 3.1 and 4.3); a small illustrative calculation of the recall bound follows this list.
- Expanded the discussion of visual artifacts, hallucinations, and physical inconsistencies introduced by generative inpainting, particularly in crowded scenes (Sections 4.3 and 4.5).
- Clarified the role of oracle-based re-annotation as an annotation-level safeguard that prevents artifacts from silently propagating into the final dataset (Section 3.3).
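As a back-of-the-envelope illustration of the recall bound referenced above (an assumption-laden sketch, not a result from the paper): if each sensitive instance is detected independently with per-instance recall r, an image containing k such instances is fully scrubbed with probability at most r^k.

```python
# Hypothetical illustration of the recall-based bound: assuming independent detections
# with per-instance recall r, at most r**k of the images containing k sensitive
# instances are fully scrubbed. Real detectors violate independence; this is a sketch.
def fully_scrubbed_fraction(recall: float, k: int) -> float:
    return recall ** k

print(fully_scrubbed_fraction(0.95, 1))  # 0.95
print(fully_scrubbed_fraction(0.95, 5))  # ~0.774: crowded scenes are harder to scrub fully
```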
Assigned Action Editor: ~Tongliang_Liu1
Submission Number: 6304