Abstract: Textbooks are essential learning tools for schoolchildren, and making them accessible is crucial for the inclusion of students with disabilities. However, manually adapting textbooks is time-consuming and cannot keep pace with growing demand. Our long-term project aims to automate textbook adaptation; this paper focuses on extraction and structuring, the first crucial step in the adaptation pipeline. Using deep learning and computer vision, we extract and structure multimodal content into a well-organized representation of textbook elements. This approach improves automation efficiency, supports equitable education, and facilitates the development of intelligent tools for inclusive learning. Our system achieved 98% accuracy in detecting layout formats and 93% accuracy in identifying exercise boxes; detection performance for the elements within those boxes depended directly on the amount of training data available for each element type.
External IDs:dblp:conf/icalt/LashebPBLBH25