CLARIFY: Concept Level Active-Learning Ranker Interpreter for Systematic Reviews

Published: 15 Oct 2025, Last Modified: 31 Oct 2025
Venue: BNAIC/BeNeLearn 2025 Oral
License: CC BY 4.0
Track: Type A (Regular Papers)
Keywords: Explainable Active learning, Systematic review, XAI, Concept activation vectors, ASReview, Interpretability
Abstract: Transformer-based ranking models have recently advanced active-learning tools for accelerating systematic reviews, but the internal criteria they use to rank documents are opaque, which limits their utility in scientific decision-making. We introduce CLARIFY, a post-hoc explainability method for active-learning applications that (i) automatically derives high-level concepts from the model's embedding space, (ii) quantifies each concept's influence on ranking, and (iii) links the most influential concepts to sentences via occlusion and re-projection, without retraining or manual intervention. Evaluated on three SYNERGY systematic review datasets, CLARIFY uncovers latent concepts and represents them in a human-understandable manner. Similarity metrics show discernible relations between these concepts and elements of the inclusion criteria. By making model reasoning transparent, CLARIFY supports accountable, evidence-based decision-making in systematic-review screening. Our open-source work is available at doi.org/10.5281/zenodo.16797395.
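Illustrative sketch (not the authors' implementation): step (iii) of the abstract, linking a concept to sentences via occlusion, can be pictured with a toy example. Everything here is an assumption made for illustration: the hashed bag-of-words `embed` function stands in for the transformer encoder, the concept direction is built by hand from a phrase (CLARIFY derives concepts automatically), and the names `concept_score` and `occlusion_attribution` are hypothetical. The idea shown is only the general occlusion recipe: remove one sentence at a time, re-project the remaining text onto the concept direction, and record how much the projection drops.

```python
import zlib
import numpy as np

DIM = 4096  # size of the toy embedding space (assumption, not from the paper)


def embed(text: str) -> np.ndarray:
    """Toy stand-in for a transformer encoder: hashed bag-of-words, L2-normalised."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def concept_score(text: str, concept: np.ndarray) -> float:
    """Projection of the text embedding onto a unit-length concept direction."""
    return float(embed(text) @ concept)


def occlusion_attribution(sentences: list[str], concept: np.ndarray) -> list[float]:
    """For each sentence, the drop in concept score when that sentence is removed."""
    full = concept_score(" ".join(sentences), concept)
    drops = []
    for i in range(len(sentences)):
        occluded = " ".join(s for j, s in enumerate(sentences) if j != i)
        drops.append(full - concept_score(occluded, concept))
    return drops


# Hypothetical concept direction, hand-built here purely for the demo.
concept = embed("randomized controlled trial")

sentences = [
    "We ran a randomized controlled trial",
    "The weather was pleasant that year",
    "Participants were adults",
]

drops = occlusion_attribution(sentences, concept)
for sentence, drop in zip(sentences, drops):
    print(f"{drop:+.3f}  {sentence}")
```

A large positive drop marks a sentence that carries the concept. A negative drop means that removing the sentence concentrates the remaining text on the concept, so the sentence was diluting it.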
Serve As Reviewer: ~Jelle_Jasper_Teijema1
Submission Number: 50