Social Determinants of Health Prediction for ICD-9 Code with Reasoning Models

Published: 27 Nov 2025, Last Modified: 10 Dec 2025, Venue: ML4H 2025 Poster, License: CC BY 4.0
Keywords: Social determinants of health, ICD-9 codes, clinical notes, large language models, MIMIC-III
TL;DR: We evaluate how well reasoning language models predict ICD-9 V-codes from clinical documentation.
Track: Findings
Abstract: Social Determinants of Health (SDoH) correlate with patient outcomes but are rarely captured in structured data. Recent work has focused on automatically extracting these markers from clinical text to supplement diagnostic systems with knowledge of patients’ social circumstances. Large language models demonstrate strong performance in identifying SDoH labels from sentences. However, prediction over large admissions or longitudinal notes is challenging because of long-distance dependencies. In this paper, we explore admission-level multi-label SDoH ICD-9 code classification on the MIMIC-III dataset using reasoning models and traditional large language models. We exploit existing ICD-9 codes for prediction on admissions, achieving an 89% F1. Our contributions include our findings, missing SDoH codes in 139 admissions, and code to reproduce the results.
General Area: Models and Methods
Specific Subject Areas: Public & Social Health
Supplementary Material: zip
Data And Code Availability: Yes
Ethics Board Approval: No
Entered Conflicts: I confirm the above
Anonymity: I confirm the above
Submission Number: 226