Beyond Labels: Explanatory Collapse due to Instruction Tuning in Protein LLMs

Published: 06 Oct 2025, Last Modified: 06 Oct 2025
Venue: NeurIPS 2025 2nd Workshop FM4LS (Poster)
License: CC BY 4.0
Keywords: Instruction Tuning, LLM, Explanatory Collapse, Protein Understanding
Abstract: Instruction tuning on domain-specific datasets improves categorical accuracy but consistently degrades explanatory behavior. In protein large language models, we observe that tuned models ignore explanation requests, fall back on generic templates, and produce degenerate structured outputs with trivial repetitions or hallucinated lists. This collapse of expressive diversity renders models terse and uninformative, limiting their scientific utility. Our findings highlight a trade-off in current protein instruction-tuning practices: accuracy is gained at the cost of interpretive value, underscoring the need for strategies that preserve explanatory depth.
Submission Number: 54