Interpretable Learning for Detection of Cognitive Distortions from Natural Language Text

ACL ARR 2025 May Submission388 Authors

12 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: We developed a technology that, given a dataset annotated for cognitive distortions, builds an interpretable model capable of detecting cognitive distortions in natural language texts. Both the learning and detection technologies are based on structural pattern (N-gram) matching with a "priority on order" principle. We investigated and released two types of detection models: a plain binary classifier and a model based on a multi-class representation. After optimizing the models' hyperparameters, we achieved an accuracy of 0.92 and an F1 score of 0.95 in a cross-validation experiment. Additionally, our approach achieves over 1000 times higher performance at a lower computational cost than LLM-based alternatives.
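To illustrate the detection idea named in the abstract, the following is a minimal sketch of what order-prioritized N-gram pattern matching could look like. The pattern list, labels, tokenizer, and `detect` function are hypothetical assumptions for exposition only, not the paper's released implementation.

```python
# Hypothetical sketch of N-gram pattern matching with a "priority on order"
# principle: patterns are scanned in a fixed priority order, and each pattern
# is matched as a contiguous token span. Patterns and labels are illustrative
# assumptions, not the authors' released model.

from typing import List, Tuple

# Ordered pattern list; earlier (more specific) patterns take priority.
PATTERNS: List[Tuple[Tuple[str, ...], str]] = [
    (("i", "always", "fail"), "overgeneralization"),
    (("i", "must"), "should-statement"),
    (("never",), "overgeneralization"),
]

def tokenize(text: str) -> List[str]:
    return text.lower().split()

def detect(text: str) -> List[str]:
    """Return labels of distortion patterns found, scanning patterns in
    priority order and matching each N-gram against token spans."""
    tokens = tokenize(text)
    labels = []
    for ngram, label in PATTERNS:  # earlier patterns win: priority on order
        n = len(ngram)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == ngram:
                labels.append(label)
                break  # first occurrence of this pattern is enough
    return labels

print(detect("I always fail and I must be perfect"))
# -> ['overgeneralization', 'should-statement']
```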
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: healthcare applications, clinical NLP, NLP for social good, stance detection, feature attribution, topic modeling, model editing
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches to low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 388