ReadEasy: Bridging Reading Accessibility Gaps using Responsible Multimodal Simplification with Generative AI

Published: 24 Sept 2025, Last Modified: 07 Nov 2025 · NeurIPS 2025 Workshop GenProC · CC BY 4.0
Track: Regular paper
Keywords: Text Simplification, Multimodal Learning, Retrieval-Augmented Generation, Graph-based Retrieval, Accessibility, Education Technology, Healthcare Communication, Human-in-the-Loop Systems, Responsible AI, Large Language Models
TL;DR: We present a multimodal, retrieval-augmented system that simplifies text and images with graph-based retrieval and human feedback, improving accessibility for education, healthcare, and technical domains.
Abstract: Complex, multimodal content remains a barrier to accessibility in education, healthcare, and technical domains. We present readeasy.org, a multimodal, retrieval-augmented system that jointly simplifies text and images while preserving context. The pipeline integrates Age-of-Acquisition (AoA) guidance and word-sense disambiguation with a $\textbf{graph-based retrieval-augmented generation (RAG)}$ module that fetches domain-specific definitions from curated knowledge bases; an image captioner produces level-aware captions for diagrams and schematics. A real-time feedback loop allows users to refine outputs, adapt terminology, and steer retrieval. Across $\textbf{14{,}000}$ items spanning educational, medical, and technical sources, the system improves readability over a strong Large Language Model (LLM) baseline (GPT-4): $\textbf{+22.21\% SARI}$ and $\textbf{+14.11\% Flesch Reading Ease}$. These gains prioritize accessibility over exact form preservation, as reflected by BLEU and Cosine Similarity scores. Graph-based RAG increases domain-term retrieval precision by $\textbf{11\%}$. In teacher-facilitated classroom use with $\textbf{200 K--12 students}$ (grouped by ages 5, 7, 9, and 11), and in additional evaluations with medical and technical professionals, users reported that the system’s outputs were easier to understand and more useful for non-experts. Incorporating user feedback yielded a further $\textbf{8\%}$ improvement in content relevance and a $\textbf{15\%}$ increase in user satisfaction. By coupling multimodal processing with knowledge-grounded retrieval and human-in-the-loop adaptation, this work advances practical accessibility for high-impact domains while aligning with responsible deployment principles.
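To make the graph-based RAG step in the abstract concrete, the sketch below shows one plausible way a curated knowledge base could be queried as a graph so that a domain term's definition, plus definitions of neighboring related terms, is retrieved and folded into a level-aware simplification prompt. This is an illustrative sketch only: the toy graph contents, function names, and prompt wording are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): graph-based retrieval of
# domain-term definitions to ground a simplification prompt.
import networkx as nx

# Toy curated knowledge base: nodes carry definitions, edges link related terms.
kb = nx.Graph()
kb.add_node("photosynthesis", definition="How plants make food from sunlight.")
kb.add_node("chlorophyll", definition="The green part of a plant that catches sunlight.")
kb.add_node("glucose", definition="A simple sugar plants make and use for energy.")
kb.add_edge("photosynthesis", "chlorophyll")
kb.add_edge("photosynthesis", "glucose")

def retrieve_definitions(term: str, graph: nx.Graph, hops: int = 1) -> dict:
    """Return the definition of `term` plus those of terms within `hops` edges."""
    if term not in graph:
        return {}
    nearby = nx.single_source_shortest_path_length(graph, term, cutoff=hops)
    return {t: graph.nodes[t]["definition"] for t in nearby}

def build_prompt(sentence: str, reading_age: int, graph: nx.Graph) -> str:
    """Compose a reading-level-aware simplification prompt grounded in retrieved definitions."""
    grounding = []
    for word in sentence.lower().replace(".", "").split():
        grounding += [f"- {t}: {d}" for t, d in retrieve_definitions(word, graph).items()]
    context = "\n".join(dict.fromkeys(grounding))  # de-duplicate, preserve order
    return (
        f"Rewrite for a {reading_age}-year-old reader, keeping the meaning:\n"
        f"Sentence: {sentence}\n"
        f"Definitions you may use:\n{context}"
    )

print(build_prompt("Photosynthesis requires chlorophyll.", 9, kb))
```

In the actual system, the retrieved definitions would come from the curated domain knowledge bases and the prompt would also carry AoA and word-sense-disambiguation signals; the sketch only shows the graph-neighborhood lookup and prompt grounding pattern.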
Submission Number: 58