Auditable Bits or Covert Influence? Safe Revelation Complexity in Partially Observable Assistance Games

Published: 03 Jun 2026, Last Modified: 03 Jun 2026 | AI4GOOD Workshop 2026 Regular | Everyone | CC BY 4.0
Keywords: trustworthy AI, cooperative AI, partially observable assistance games, auditable communication, safe revelation complexity, graph coloring, conditional graph entropy, covert channels, Blackwell order
Abstract: We study the minimum explicit, auditable communication required for optimal cooperation in a one-shot directional slice of partially observable assistance, under a safety restriction that forbids observation-channel modulation. We define the *safe revelation complexity* of a game $M$ and prove the exact fixed-length formula $$\mathrm{SRC}_{\mathrm{FL}}(M)=\lceil \log_2 \chi(G_M)\rceil.$$ Here, $G_M$ is the safe-confusability graph induced by receiver-conditioned optimal-action disagreements. For i.i.d. repetitions, we prove the asymptotic rate identity $\mathrm{SRC}_{\infty}(M)=H_{G_M}(Y\mid X)$. We further show that every finite graph arises exactly as a safe-confusability graph, implying NP-hardness of exact fixed-budget revelation. Finally, we construct covert observation-channel augmentations in which each available kernel is individually strictly Blackwell-inferior to an approved baseline, yet the sender's choice of kernel collapses the explicit communication requirement to zero. Exact finite-instance experiments validate the fixed-length, asymptotic, and trust-separation predictions.
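As a rough illustration of the fixed-length formula, the sketch below evaluates $\lceil \log_2 \chi(G)\rceil$ for a small graph given directly as an edge list. It does not reproduce the paper's construction of $G_M$ from a game $M$; the graph is simply assumed as input, and the names `chromatic_number` and `fixed_length_bits` are hypothetical helpers introduced here. The search is exhaustive and exponential in the number of vertices, which is in line with the stated NP-hardness, so it is only practical for tiny instances.

```python
from itertools import product
from math import ceil, log2


def chromatic_number(n, edges):
    """Return the smallest k such that vertices 0..n-1 admit a proper
    k-coloring, found by exhaustive search (tiny graphs only)."""
    if n == 0:
        return 0
    for k in range(1, n + 1):
        # Try every assignment of k colors to the n vertices.
        for coloring in product(range(k), repeat=n):
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k
    return n  # unreachable for simple graphs: n colors always suffice


def fixed_length_bits(n, edges):
    """Evaluate ceil(log2(chi(G))) for the supplied graph G.
    Assumption: G stands in for a safe-confusability graph G_M."""
    chi = chromatic_number(n, edges)
    return ceil(log2(chi)) if chi > 0 else 0


# Illustrative graph (an assumption, not taken from the paper): a 5-cycle.
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(chromatic_number(5, c5_edges), fixed_length_bits(5, c5_edges))
```

On the 5-cycle the chromatic number is 3, so the formula gives 2 bits. A graph with no edges has chromatic number 1 and therefore needs 0 bits under the same formula.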
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 23