TL;DR: This paper argues that generative AI requires a shift to Distributional Integrity to prevent backdoors from being inherited through the model supply chain and from polluting recursive synthetic data pipelines.
Abstract: Foundation models, such as Diffusion Models (DMs) and Large Language Models (LLMs), are now widely integrated into digital systems. This widespread use introduces a specific security risk: generative backdoors. Unlike backdoors in traditional models, which cause simple classification errors, generative backdoors hide within the model's output distribution. This makes them difficult to detect using standard pattern-based methods. This paper argues that current defensive strategies are insufficient for generative AI. We propose Distributional Integrity, a framework that focuses on maintaining the stability and accuracy of the model's data distribution. We identify two primary threats: backdoors within the model supply chain and the contamination of synthetic data pipelines. To address these, we advocate for a shift toward cross-modal certification and parameter-level verification. These methods aim to secure the AI-generated content (AIGC) ecosystem against inherited vulnerabilities.
Primary Area: System Risks, Safety, and Government Policy
Keywords: Backdoor Attacks & Defenses, Generative AI Security (AIGC) Robustness
Submission Number: 854