The Mutual Information Uncertainty Range: A Non-Parametric Test for Dependent Censoring

ICLR 2026 Conference Submission22479 Authors

20 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: Independent censoring, Survival analysis, Conditional mutual information
Abstract: Learning a survival prediction model can be viewed as regression with the added complication of censoring. Each subject $x_i$ has a true event time $E_i$ and a censoring time $C_i$, yet we only observe $T_i = \min(E_i, C_i)$ and $\delta_i = \mathbf{1}(E_i \leq C_i)$. Many standard survival methods implicitly assume that $E$ and $C$ are independent conditional on $X$, i.e., $E \perp C \mid X$, an assumption that does not always hold. To produce effective survival models, it would be useful to know whether this (in)dependence holds; however, this is difficult to determine because, for each subject, we observe either $E_i$ or $C_i$, but never both. To address this challenge, we introduce, for each $t > 0$, indicator variables $E_{i,t} = \mathbf{1}(E_i > t) \in \lbrace 0, 1, ? \rbrace$, where "$?$" represents a value that is unobserved due to censoring, with a similar definition for $C_{i,t}$. We use the set $\lbrace E_{i,t}, C_{i,t} \rbrace$, taken over instances $i$ and various times $t$, to develop a nonparametric diagnostic for testing whether $E \perp C \mid X$. The diagnostic is based on the width of the Conditional Mutual Information (CMI) uncertainty range between $E_{i,t}$ and $C_{i,t}$ given $X_i$, taken over all completions of the unknown values "$?$" and defined as $\Delta I^t = I^t_{\max} - I^t_{\min}$. Under independence, $\Delta I^t$ follows a characteristic null distribution derived from random data completions. Dependent censoring imposes structure, producing atypical $\Delta I^t$ values. To make this computation feasible, we formulate the CMI bound computation as a decomposable integer program, which we solve exactly with a dynamic programming algorithm of polynomial complexity. Combined with a permutation test, this yields a scalable, assumption-free tool for detecting dependent censoring. To evaluate the proposed method, we conducted experiments on several types of synthetic data in which both the presence and the strength of dependence could be controlled.
Primary Area: learning theory
Submission Number: 22479
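
The indicator variables described in the abstract can be illustrated with a minimal sketch. It is derived only from the stated definitions $T_i = \min(E_i, C_i)$, $\delta_i = \mathbf{1}(E_i \leq C_i)$, and $E_{i,t} = \mathbf{1}(E_i > t)$, and is not the authors' implementation. The function name `censoring_indicators` and the sentinel `UNKNOWN = -1` (standing in for "$?$") are assumptions made for illustration, and the sketch assumes continuous times so that $E_i = C_i$ does not occur.

```python
import numpy as np

UNKNOWN = -1  # hypothetical sentinel standing in for the '?' symbol


def censoring_indicators(T, delta, t):
    """Build E_{i,t} = 1(E_i > t) and C_{i,t} = 1(C_i > t) from observed data.

    T     : observed times, T_i = min(E_i, C_i)
    delta : event flags, delta_i = 1(E_i <= C_i)
    t     : time threshold
    Entries that cannot be determined from (T, delta) are set to UNKNOWN.
    """
    T = np.asarray(T, dtype=float)
    delta = np.asarray(delta, dtype=int)
    E_t = np.full(T.shape, UNKNOWN, dtype=int)
    C_t = np.full(T.shape, UNKNOWN, dtype=int)

    # T_i > t: both E_i and C_i exceed t, so both indicators are known to be 1.
    still_at_risk = T > t
    E_t[still_at_risk] = 1
    C_t[still_at_risk] = 1

    # Event observed by time t: E_i = T_i <= t, so E_{i,t} = 0.
    # C_i exceeds E_i, but whether it exceeds t is unknown.
    event_by_t = (~still_at_risk) & (delta == 1)
    E_t[event_by_t] = 0

    # Censored by time t: C_i = T_i <= t, so C_{i,t} = 0.
    # E_i exceeds C_i, but whether it exceeds t is unknown.
    censored_by_t = (~still_at_risk) & (delta == 0)
    C_t[censored_by_t] = 0

    return E_t, C_t


# Toy data (made up for illustration only).
T = [2.0, 5.0, 3.0, 7.0]
delta = [1, 0, 0, 1]
E_t, C_t = censoring_indicators(T, delta, t=4.0)
```

In this toy example, each subject with $T_i \leq t$ has exactly one unknown indicator, which is why the CMI between $E_{i,t}$ and $C_{i,t}$ can only be bounded over completions of the unknown entries rather than computed directly.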