Position: Generative Distributional Integrity against Backdoor Attacks

Shuaibiao Han; Ruiyang Ni; Zhiguo Yang; Changlong Li; Perley Xu; Wenjie Ruan

Published: 30 Apr 2026, Last Modified: 24 Jun 2026

Venue: ICML 2026 Position Paper Track (regular)

License: CC BY 4.0

TL;DR: This paper argues that generative AI requires a shift to Distributional Integrity to prevent backdoors from being inherited through the supply chain and polluting recursive synthetic data pipelines.

Abstract: Foundation models, such as Diffusion Models (DMs) and Large Language Models (LLMs), are now widely integrated into digital systems. This widespread use introduces a specific security risk: generative backdoors. Unlike in traditional models, where backdoors cause simple classification errors, generative backdoors hide within the model's output distribution, which makes them difficult to detect using standard pattern-based methods. This paper argues that current defensive strategies are insufficient for generative AI. We propose Distributional Integrity, a framework that focuses on maintaining the stability and accuracy of the model's data distribution. We identify two primary threats: backdoors within the model supply chain and the contamination of synthetic data pipelines. To address these, we advocate for a shift toward cross-modal certification and parameter-level verification. These methods aim to secure the AI-generated content (AIGC) ecosystem against inherited vulnerabilities.

Primary Area: System Risks, Safety, and Government Policy

Keywords: Backdoor Attacks & Defenses, Generative AI Security (AIGC) Robustness

Submission Number: 854
