Culturally-Aware AI for Personalized Pregnancy Nutrition: Evaluating Context Augmentation Strategies in Diverse Indian Settings
Keywords: culturally-aware AI; personalized nutrition; pregnancy; maternal health; India; large language models (LLMs); context augmentation; retrieval-augmented generation (RAG); structured datasets (IFCT 2017, ICMR RDA); web retrieval; human evaluation; mean opinion score (MOS); medical safety; cultural relevance; completeness; low-resource settings; regional cuisines (Kerala, Andhra Pradesh, Tamil Nadu, Karnataka, West Bengal); anemia; gestational diabetes; immunosuppression; safety-critical systems.
TL;DR: Current LLMs fail to deliver safe, culturally relevant pregnancy meal plans in India—context augmentation helps but only modestly, and human oversight remains essential.
Abstract: Personalized pregnancy nutrition in India requires balancing medical safety, cultural fit, and day-to-day feasibility. We evaluate three LLM context-augmentation strategies, (E1) prompt-only, (E2) structured dataset integration, and (E3) dataset plus targeted web retrieval, across 20 profiles spanning five Indian states and multiple clinical contexts (e.g., anemia, bed rest, post-transplant). Human evaluation of 100 generated meal plans revealed only modest improvements from context augmentation. Baseline LLMs achieved mediocre performance (medical safety 3.46/5, cultural relevance 3.57/5, overall quality 3.59/5). Dataset integration (E2) showed minimal gains in medical safety (+4%) but reduced overall quality (-3.6%). The web-augmented approach (E3) achieved the best results, with a +6.9% improvement in medical safety and +8% in overall quality, though absolute scores remained moderate (3.70/5 and 3.87/5, respectively). Critical failure rates persisted across all configurations (E1: 31%, E2: 38%, E3: 21%), with issues including calorie miscalculations, contraindicated foods for medical conditions, and culturally inappropriate suggestions. High variance across profiles (σ=0.98–1.39) indicates inconsistent performance. We contribute (i) empirical evidence that current LLMs require substantial improvement before healthcare deployment, (ii) demonstration that context augmentation provides limited benefits without addressing fundamental model limitations, and (iii) identification of persistent safety failures requiring human oversight. Our findings emphasize that autonomous deployment remains premature for this critical healthcare domain.
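The relative gains quoted in the abstract follow directly from the reported mean opinion scores (MOS) on the 5-point scale; a minimal sketch of that arithmetic (the helper `pct_change` is ours, not from the paper):

```python
# Illustrative only: recompute the abstract's relative improvements
# from the reported mean opinion scores (MOS, 5-point scale).

def pct_change(baseline: float, treatment: float) -> float:
    """Relative change of a treatment MOS over the E1 baseline, in percent."""
    return (treatment - baseline) / baseline * 100

# MOS values taken directly from the abstract.
e1_safety, e3_safety = 3.46, 3.70    # medical safety, E1 vs. E3
e1_overall, e3_overall = 3.59, 3.87  # overall quality, E1 vs. E3

print(f"E3 medical safety:  {pct_change(e1_safety, e3_safety):+.1f}%")    # ~ +6.9%
print(f"E3 overall quality: {pct_change(e1_overall, e3_overall):+.1f}%")  # ~ +7.8%, reported as +8%
```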
Submission Number: 223