Abstract: As machine learning (ML) systems expand in both scale and functionality, the security landscape has become increasingly complex, with a proliferation of attacks and defenses. However, existing studies largely treat these threats in isolation and lack a coherent framework that exposes their shared principles and interdependencies. This fragmented view hinders systematic understanding and limits the design of comprehensive defenses.
Crucially, the two foundational assets of ML—\textbf{data} and \textbf{models}—are no longer independent; vulnerabilities in one directly compromise the other. The absence of a holistic framework leaves open questions about how these bidirectional risks propagate across the ML pipeline.
To address this critical gap, we propose a \emph{unified closed-loop threat taxonomy} that explicitly frames model–data interactions along four directional axes. Our framework offers a principled lens for analyzing and defending foundation models.
The resulting four classes of security threats represent distinct but interrelated categories of attacks (formalized in the sketch after this list): (1) Data$\rightarrow$Data (D$\rightarrow$D): including \emph{data decryption, watermark removal, and jailbreak attacks};
(2) Data$\rightarrow$Model (D$\rightarrow$M): including \emph{poisoning and harmful fine-tuning attacks};
(3) Model$\rightarrow$Data (M$\rightarrow$D): including \emph{model inversion, membership inference, and training data extraction attacks};
(4) Model$\rightarrow$Model (M$\rightarrow$M): including \emph{model extraction attacks}.
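To make the four axes concrete, each class admits a standard formulation; the notation below is a minimal illustrative sketch in conventional notation, not necessarily the paper's own. Let $f_\theta$ be a model with parameters $\theta$ trained on a dataset $D$, and let $\mathcal{L}$ denote the training loss:
(1) D$\rightarrow$D: transform data $x \mapsto x' = A(x)$ so that a data-level protection fails (e.g., a watermark is no longer detected, or a safety filter is bypassed) while the semantic content of $x$ is preserved;
(2) D$\rightarrow$M: inject poisoned data $D_p$ so that $\theta' = \arg\min_{\theta} \mathcal{L}(\theta; D \cup D_p)$ exhibits attacker-chosen behavior;
(3) M$\rightarrow$D: from query access to $f_\theta$, infer private training data, e.g., test whether $x \in D$ (membership inference) or reconstruct $\hat{x} \approx x \in D$ (inversion/extraction);
(4) M$\rightarrow$M: fit a surrogate $f_{\hat{\theta}}$ on query pairs $\{(x_i, f_\theta(x_i))\}_{i=1}^{n}$ so that $f_{\hat{\theta}} \approx f_\theta$ (model extraction).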
We conduct a systematic review that analyzes the mathematical formulations, attack and defense strategies, and applications across the vision, language, audio, and graph domains.
Our unified framework elucidates the underlying connections among these security threats and establishes a foundation for developing scalable, transferable, and cross-modal security strategies—particularly within the landscape of foundation models.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: Mengnan Du
Submission Number: 6464