Mechanistic Interpretability of Semantic Abstraction in Biomedical Text

Published: 23 Sept 2025, Last Modified: 17 Feb 2026, CogInterp @ NeurIPS 2025 Reject, CC BY 4.0
Keywords: Biomedical NLP, Mechanistic interpretability, Semantic abstraction, Register-invariant representations, BioBERT, SciBERT, Clinical-T5, BioGPT, Activation patching, Transformer analysis, Representational similarity, Trajectory visualization, Causal probing, Plain-language biomedical text, Clinical communication
Abstract: We investigate whether biomedical language models form register-invariant semantic representations of sentences, a cognitive ability that supports consistent and reliable clinical communication across language styles. Using aligned sentence pairs (technical and plain-language abstracts with the same meaning), we analyze how BioBERT, SciBERT, Clinical-T5, and BioGPT respond to register variation through similarity measures, trajectory visualization, and activation patching. The results show that the models converge to shared semantic states in mid-to-late layers, revealing the internal processes by which they preserve meaning across stylistic variation.
Submission Number: 108
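
The layer-wise similarity analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of cosine similarity, and the toy vectors are all assumptions made for illustration. In practice the per-layer vectors would be pooled hidden states that a model produces for a technical sentence and for its plain-language counterpart.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layerwise_similarity(technical_layers, plain_layers):
    """Compare two sentences layer by layer.

    Each argument is a list with one pooled sentence vector per layer
    (hypothetical input format). Returns one similarity score per layer.
    """
    if len(technical_layers) != len(plain_layers):
        raise ValueError("both inputs need the same number of layers")
    return [
        cosine_similarity(t, p)
        for t, p in zip(technical_layers, plain_layers)
    ]


# Toy vectors standing in for real hidden states (assumption, not real data):
# the two registers start out orthogonal and coincide at the last layer.
technical = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 2.0, 1.0]]
plain = [[0.0, 1.0, 0.0], [1.0, 0.5, 0.0], [1.0, 2.0, 1.0]]

sims = layerwise_similarity(technical, plain)
for layer, score in enumerate(sims):
    print(f"layer {layer}: cosine similarity = {score:.3f}")
```

A similarity curve that rises with depth, as in the toy data, is the kind of pattern that would be consistent with convergence to shared semantic states in mid-to-late layers. Cosine similarity is only one possible measure; the abstract does not say which similarity measures were used.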