StorySpark: Expert-Annotated QA Pairs with Real-World Knowledge for Children Storytelling

ACL ARR 2024 June Submission4567 Authors

16 Jun 2024 (modified: 02 Aug 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: Interactive storytelling between parents and children is a common real-world activity in which parents expect to teach children both language skills and real-world knowledge beyond the story narrative. Although a growing number of AI-assisted storytelling systems have been developed and used in children's story-based interaction and learning scenarios, existing systems often fall short of generating conversation infused with real-world knowledge, and so fail to meet parents' practical expectations of interactive storytelling. The foremost reason is that the question-answering (QA) datasets these systems build on focus mainly on knowledge answerable within the story content. To bridge this gap, we designed an annotation framework, empowered by a real-world knowledge graph, that facilitates experts' annotations while collecting their mental procedures. We then used this framework to build StorySpark, a dataset of 5,868 expert-annotated QA pairs with real-world knowledge beyond the story context. A comprehensive benchmarking experiment, including both automated and human expert evaluation across various QA pair generation (QAG) settings, demonstrates the usability of StorySpark for the story-based knowledgeable QAG task. Notably, a traditional compact model fine-tuned on StorySpark can reliably outperform robust LLMs, which further highlights the complexity of such real-world tasks.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: NLP datasets, educational applications, benchmarking, question generation, interactive storytelling
Contribution Types: Data resources
Languages Studied: English
Submission Number: 4567