{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"Biased Oracle: Empathy &amp; Understandability in LLM-Generated Medical Explanations","text":"<p>Shunchang Liu, Guillaume Drui, Jianzhou Yao, Rikard Pettersson</p> <p>Human-Centered AI for Social Good: Peace, Health, Climate, ETH Z\u00fcrich</p> <p>Code \u00b7 License \u00b7 Cite</p>"},{"location":"#abstract","title":"Abstract","text":"<p>Large language models can tailor diagnostic explanations and tone to patient context, but do those explanations remain understandable and empathetic across demographics? Biased Oracle evaluates GPT-4o and Claude-3.7 on 156 scenarios spanning demographics, education, conditions, and age. We combine readability metrics and LLM-as-a-Judge scoring with human validation to expose systematic bias patterns.</p>"},{"location":"#key-findings","title":"Key Findings","text":"<ul> <li>Diagnosis dominates empathy (Alzheimer\u2019s highest; heart disease lowest).</li> <li>Education bias (medical degree \u2192 lower affective empathy).</li> <li>Age effects are rater-dependent (a U-shape is visible with some evaluators).</li> <li>LLM-as-a-Judge disagreement (poor inter-rater reliability between models).</li> </ul>"},{"location":"#quick-start","title":"Quick Start","text":"<pre><code>pip install -r requirements.txt\n# then follow Code/Empathy_Score/Prompt_and_Response_*.ipynb\n</code></pre>"}]}