Resolving Disagreement Problems in Explainable Artificial Intelligence Through Multi-Criteria Decision Analysis

TMLR Paper 6923 Authors

08 Jan 2026 (modified: 11 Apr 2026) · Rejected by TMLR · CC BY 4.0
Abstract: Post-hoc explanation methods are critical for building trust in complex black-box artificial intelligence (AI) models; however, they often suffer from the disagreement problem, in which different methods provide conflicting explanations for the same prediction. This inconsistency undermines reliability and poses a significant barrier to adoption in high-stakes domains that demand trustworthiness and transparency. To address this, we move beyond the search for a single best method and instead propose a principled, preference-driven framework for selecting the most suitable explanation technique for a given context: *which specific post-hoc explanation method should be used, and when?* We formalize this selection process as a Multi-Criteria Decision Analysis (MCDA) problem. Our framework evaluates a set of state-of-the-art post-hoc explanation methods (e.g., LIME, BayesLIME, SHAP, BayesSHAP, and Anchor) against six explanation evaluation metrics: fidelity, identity, stability, separability, faithfulness, and consistency. We then apply a suite of established MCDA techniques, such as Simple Additive Weighting (SAW), the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), and Elimination and Choice Translating Reality (ELECTRE I), to aggregate these evaluations based on user-defined priorities. By comparing rankings produced by diverse decision logics across multiple predictive models and real-world datasets, we demonstrate not only how to select the optimal explanation method under different priority scenarios (e.g., favoring fidelity vs. stability), but also how to expose critical trade-offs that are invisible to simpler aggregation approaches. Our work provides a robust, transparent, and adaptable methodology for preference-aware explainer selection, transforming explanation disagreement into a structured and justifiable decision-making process.
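To make the aggregation step described in the abstract concrete, the sketch below ranks a set of explainers with SAW and TOPSIS over a small decision matrix. All metric values and weights are hypothetical placeholders for illustration, not results from the paper; ELECTRE I, which relies on pairwise outranking rather than a scalar score, is omitted for brevity.

```python
import numpy as np

# Hypothetical decision matrix: rows = explainers, columns = the six
# evaluation metrics (fidelity, identity, stability, separability,
# faithfulness, consistency), all treated as benefit criteria.
explainers = ["LIME", "BayesLIME", "SHAP", "BayesSHAP", "Anchor"]
X = np.array([
    [0.82, 0.90, 0.71, 0.65, 0.78, 0.80],
    [0.80, 0.93, 0.75, 0.66, 0.76, 0.84],
    [0.88, 0.95, 0.69, 0.70, 0.81, 0.79],
    [0.86, 0.96, 0.72, 0.71, 0.79, 0.83],
    [0.74, 0.88, 0.80, 0.60, 0.70, 0.86],
])
# User-defined priorities, here favoring fidelity (col 0) and stability (col 2).
w = np.array([0.30, 0.10, 0.25, 0.10, 0.15, 0.10])

def saw(X, w):
    """Simple Additive Weighting: weighted sum of min-max normalized scores."""
    N = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    return N @ w

def topsis(X, w):
    """TOPSIS: relative closeness to the ideal vs. anti-ideal solution."""
    N = X / np.linalg.norm(X, axis=0)   # vector normalization per criterion
    V = N * w                           # weighted normalized matrix
    ideal, anti = V.max(axis=0), V.min(axis=0)  # benefit criteria only
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)
    return d_neg / (d_pos + d_neg)

for name, scores in [("SAW", saw(X, w)), ("TOPSIS", topsis(X, w))]:
    ranking = [explainers[i] for i in np.argsort(-scores)]
    print(f"{name} ranking: {ranking}")
```

Because SAW and TOPSIS embody different decision logics (compensatory weighted averaging vs. distance to an ideal point), their rankings can diverge on the same matrix, which is precisely the kind of trade-off the framework is meant to surface.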
Submission Type: Long submission (more than 12 pages of main content)
Changes Since Last Submission: *We would like to thank all reviewers for their time and effort in reviewing our work. We found the reviews very helpful and believe they will help us improve the content and readability of the manuscript. Based on the reviewers' comments, we have revised the manuscript and addressed all issues raised to the best of our ability (highlighted in blue).*

## Responses to the Comments from Reviewer dRQ4

Regarding the Identity metric, we have corrected the wrong citation in Section 3.2 of the revised manuscript.

Regarding the Separability metric, we have revised this section to provide a clearer interpretation and ensure that the definition, equation, and accompanying discussion are fully consistent with those presented in Section 3.2 of the revised manuscript.

Regarding the addition of experiments on different variants within the same class of explainer, we have extended our study to include such variants. Specifically, we added LIME-type methods (e.g., BayesLIME) and SHAP-type methods (e.g., BayesSHAP), and evaluated them using the same six explanation evaluation metrics and the same MCDA pipeline in the revised version. We also added visual explanations of these two explanation techniques, as depicted in Figures 4 and 6 of our revised version. In addition, we revised Figure 2 (the methodology figure) to better align with the actual framing of our work.

Regarding the minor corrections, we addressed all of them in the revised version of the manuscript.

## Responses to the Comments from Reviewer 8psN

Regarding the expansion of the evaluation set size, in the revised version we increased the test set size (i.e., by computing results on the full test set) and reported the corresponding results in Sections 5.2 and 5.3 of our revised manuscript.

## Responses to the Comments from Reviewer Tpfw

Regarding the clarification of the ground truth for the Fidelity metric, we have clarified this in Section 3.2 of the revised manuscript.

Regarding hyperparameter specification, we have explicitly fixed and documented the relevant hyperparameters (e.g., k, epsilon/neighborhood size, perturbation scheme, and random seeds) in Section 3.2 of the revised manuscript.

Regarding the increase in sample size, we expanded the evaluation by computing results on the full test set and reported the corresponding findings in Sections 5.2 and 5.3 of the revised manuscript.

Regarding a user study (or practitioner-facing evaluation), we acknowledge its importance and plan to evaluate the proposed framework in future work through a user study or practitioner-oriented assessment. This evaluation will consider measurable outcomes such as decision quality, debugging time, error detection, trust calibration, and the consistency of explainer selection across users. This has been added to the Limitations and Future Work section (Section 7) of the revised manuscript.

Regarding the reframing of our central claim, we have revised the abstract, introduction, and conclusion to shift the focus from "resolving disagreement" to "providing a principled framework for structured explainer selection based on user preferences," thereby accurately representing the contribution.

Regarding the remaining minor comments, we have addressed all of them in the revised manuscript.
Assigned Action Editor: ~Fabio_Stella1
Submission Number: 6923