MCIR: A Feature Dependence-Aware Explainability Method with Reliability Guarantees

TMLR Paper9796 Authors

16 Jun 2026 (modified: 20 Jun 2026) | Under review for TMLR | CC BY 4.0

Abstract: As modern machine learning models are deployed in high-stakes, data-rich environments, the interactions among features have grown more intricate and less amenable to traditional interpretation. Many explanation methods fail when features are strongly dependent. In the presence of multicollinearity or near-duplicate predictors, existing attribution tools such as SHAP, LIME, HSIC, MI/CMI, and SAGE often distribute importance across redundant features, obscuring which variables carry important and unique information. This can lead to unstable rankings and unreliable importance scores, and usually comes at a high computational cost. Recent correlation-aware approaches, such as CIR or BlockCIR, offer partial improvements but still struggle to fully separate redundancy from unique contributions at the feature level. To address this, we propose the Mutual Correlation Impact Ratio Method (MCIR-M), a dependence-aware global feature-importance procedure that quantifies the unique information contributed by each feature beyond its correlated neighbours. MCIR-M introduces the Mutual Correlation Impact Ratio (MCIR) score, which conditions each feature on a small set of its most correlated neighbours and computes a normalized ratio of conditional information. The score has a bounded value range, is comparable across tasks, and collapses to zero when a feature is redundant, enabling clear redundancy detection. In addition to MCIR, we introduce a lightweight estimation procedure that computes MCIR scores using only a fraction of the available data while preserving the attribution behaviour of the full model. Across a synthetic household-energy dataset and the real UCI HAR benchmark, MCIR yields more stable and dependence-aware rankings than SHAP (independent and conditional), SAGE, HSIC, MI-based scores, and correlation-aware baselines such as CIR or BlockCIR. Lightweight explanations substantially reduce runtime and preserve over 95% top-feature agreement in the synthetic benchmark setting while maintaining moderate overlap on the more challenging HAR dataset. These results demonstrate that MCIR-M provides a practical and scalable solution for global explanation in settings with strong feature dependence.
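
The abstract describes the score only in words: condition each feature on its most correlated neighbours, then take a normalized ratio of conditional information that collapses to zero for redundant features. The sketch below illustrates that idea and is not the paper's estimator. It assumes jointly Gaussian data, so that conditional mutual information follows from a partial correlation, and it assumes one plausible normalisation, I(X_i; Y | N_i) / I(X_i, N_i; Y). The function names, the parameter `k`, and the toy data are all hypothetical.

```python
import numpy as np

def _residual(v, Z):
    """Residual of v after least-squares regression on Z (with intercept)."""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def gaussian_cmi(x, y, Z):
    """I(x; y | Z) in nats, ASSUMING joint Gaussianity:
    -0.5 * log(1 - rho^2), with rho the partial correlation given Z."""
    rx, ry = _residual(x, Z), _residual(y, Z)
    # If Z explains x (or y) exactly, nothing unique is left to share.
    if rx.std() <= 1e-8 * x.std() or ry.std() <= 1e-8 * y.std():
        return 0.0
    rho = np.corrcoef(rx, ry)[0, 1]
    return -0.5 * np.log1p(-min(rho**2, 1 - 1e-12))

def gaussian_mi(Z, y):
    """I(Z; y) in nats under the same Gaussian assumption."""
    ry = _residual(y, Z)
    return 0.5 * np.log(y.var() / max(ry.var(), 1e-12 * y.var()))

def dependence_aware_scores(X, y, k=1):
    """Hypothetical MCIR-like score per feature (assumed normalisation):
    unique information given the k most correlated neighbours, divided by
    the joint information of the feature and those neighbours about y."""
    d = X.shape[1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, -1.0)  # a feature is never its own neighbour
    scores = np.zeros(d)
    for i in range(d):
        nbrs = np.argsort(corr[i])[::-1][:k]
        Z = X[:, nbrs]
        unique = gaussian_cmi(X[:, i], y, Z)
        shared = gaussian_mi(Z, y)
        total = unique + shared  # chain rule: I(X_i, N_i; Y)
        scores[i] = unique / total if total > 0 else 0.0
    return scores

# Toy data (illustrative only): x1 is a near-duplicate of x0, x2 is independent.
rng = np.random.default_rng(0)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + 0.01 * rng.normal(size=n)
x2 = rng.normal(size=n)
y = x0 + x2 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

scores = dependence_aware_scores(X, y, k=1)
print(np.round(scores, 3))
```

In this toy setting the two near-duplicate features score close to zero because each is almost fully explained by the other, while the independent feature keeps most of its score. A marginal measure such as plain mutual information would instead rate all three features as informative.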

Submission Type: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=UHMkfgIVbS&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)

Changes Since Last Submission:

## Changes Since Previous TMLR Submission (#6812)

Since Submission #6812, the manuscript has been substantially revised in response to reviewer and Action Editor feedback regarding presentation, organization, theoretical clarity, parameter selection, and experimental justification.

### Presentation and Organization (Sections 1–2)

The Introduction (pp. 1–3) and Related Work (pp. 3–5) were extensively rewritten to improve readability and motivation. We now:

* Introduce the lightweight MCIR variant earlier.
* Explicitly distinguish between the MCIR-M framework and the MCIR metric.
* Clarify that MCIR provides statistical attribution under feature dependence and does not perform causal identification.
* Add new comparison tables (Tables 1–2) positioning MCIR-M against SHAP, SAGE, MI/CMI, HSIC, PCIR, BlockCIR, and related methods.

### Lightweight Framework Formalization (Section 3, pp. 5–7)

The preliminaries section was substantially expanded. New material includes:

* Formal definitions of the environment and lightweight environment.
* Explicit distinction between population-level quantities and empirical estimators.
* A new environment-similarity framework.
* Orthogonal-Procrustes alignment and ranking-preservation formulation for lightweight explanations.

### Theoretical Development (Section 4, pp. 7–20)

The theory section was significantly strengthened and reorganized. Major additions include:

* Expanded derivation of MCIR.
* Formal boundedness and redundancy-collapse results.
* Incremental-information interpretation.
* Finite-sample rank-stability analysis.
* Estimator perturbation results.
* Lightweight-fidelity guarantees.
* Additional proofs and theoretical discussion throughout Section 4 and the appendices.

### Parameter Selection Framework (Section 4.4)

A new subsection, *Parameter Selection and Stability-Driven Design*, was added in response to reviewer concerns regarding neighborhood-size selection. This section introduces:

* Auto-$\Phi$ neighborhood selection,
* bootstrap-based rank-stability optimization,
* automatic estimator selection,
* and an explicit reproducible algorithm.

### Latent Confounding Analysis (Section 4.5)

A new subsection, *Behaviour under Latent Confounding*, was added. It discusses:

* hidden common-cause scenarios,
* proxy-variable behavior,
* limitations under latent confounding,
* and the distinction between statistical and causal interpretation.

### Practical Workflow (end of Section 4)

A new practical usage pipeline was added describing neighborhood construction, estimator selection, MCIR computation, reliability assessment, and lightweight deployment.

### Experimental Expansion (Sections 5–6)

The experimental methodology and results sections were substantially expanded. New and revised analyses include:

* stronger synthetic redundancy experiments,
* expanded HAR and energy-system evaluations,
* additional CIFAR-10 representation experiments,
* deletion-based faithfulness studies,
* stability and robustness analyses,
* lightweight-fidelity evaluation,
* revised figures and tables supporting the main claims.

### New Ablation Studies (Section 6 / Section 7.1.1)

Additional ablation studies were introduced evaluating:

* neighborhood-size sensitivity,
* estimator sensitivity,
* rank stability,
* and computational trade-offs.

These studies provide empirical support for the Auto-$\Phi$ framework.

### New Latent-Confounding Experiments (Appendix)

A new synthetic latent-confounding benchmark was added to complement the theoretical analysis and evaluate MCIR behavior under hidden common-cause scenarios.

### Assumptions and Reproducibility

Following reviewer feedback:

* Assumption 3 was expanded with additional bootstrap justification.
* Section 6.1 was revised to clarify the role of ranking metrics, faithfulness measures, and statistical significance tests.
* Reproducibility material was expanded through additional algorithms, implementation details, parameter-selection procedures, and supplementary proofs.

### Additional Revisions Addressing Reviewer and Editor Feedback

In addition to the methodological and experimental changes above, the manuscript was extensively reorganized to improve clarity and navigability. Several sections were rewritten, notation was standardized throughout the paper, redundant material was removed, and explanations were streamlined to improve readability. We also expanded the discussion of reliability analysis, including the Explanation Reliability Index (ERI), lightweight-fidelity evaluation, and rank-preservation behavior.

Overall, the revised manuscript contains substantial improvements in presentation, theoretical development, parameter-selection methodology, confounding analysis, experimental validation, and reproducibility while preserving the core MCIR-M framework and its main contributions.

Assigned Action Editor: ~Adams_Wai-Kin_Kong1

Submission Number: 9796
