Care Is Not a Style Transfer Task: Evaluating Culturally Grounded Clinical AI
Keywords: Value Sensitive Design, AI in Healthcare, Responsible AI
Abstract: Clinical conversational AI systems are commonly evaluated on accuracy, safety, helpfulness, language quality, and trustworthiness. These dimensions are necessary but incomplete when the system participates in clinical care, where appropriate communication depends on local language, practical access constraints, emotional interpretation, and role boundaries. This paper presents a case study of expert evaluations of a Spanish-language tuberculosis support chatbot. Although the system received high ratings on conventional dimensions, qualitative evaluator comments revealed recurring gaps: the system advised before exploring, over-interpreted distress, used inaccessible language, offered impractical guidance, and risked blurring the boundary between automated support and human care. Drawing on Value Sensitive Design, I argue that culturally grounded clinical AI requires evaluation as a sociotechnical system that must scaffold care while preserving human judgment, accountability, and relational meaning. I propose five evaluation dimensions (interpretive humility, exploratory empathy, practical situatedness, local communicative legitimacy, and boundary-preserving care) that conventional metrics only partially capture.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 76