Abstract: Interactive storytelling between parents and children is a common real-world activity through which parents expect to teach children both language skills and real-world knowledge beyond the story narratives. While a growing number of AI-assisted storytelling systems have been developed for children's story-based interaction and learning, existing systems often fall short of generating conversation infused with real-world knowledge, and thus fail to meet parents' practical expectations of interactive storytelling. The foremost reason is that the question-answering (QA) datasets these systems build on focus mainly on knowledge answerable within the story content.
To bridge this gap, we designed an annotation framework empowered by a real-world knowledge graph to facilitate expert annotation while capturing the experts' mental procedures. We then leveraged this framework to build StorySpark, a dataset of 5,868 expert-annotated QA pairs grounded in real-world knowledge beyond the story context.
A comprehensive benchmarking experiment, including both automated and human expert evaluation across various QA-pair generation (QAG) settings, demonstrates the usability of StorySpark for the story-based knowledgeable QAG task. Notably, a traditional compact model fine-tuned on StorySpark reliably outperforms strong LLMs, further highlighting the complexity of this real-world task.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: NLP datasets, educational applications, benchmarking, question generation, interactive storytelling
Contribution Types: Data resources
Languages Studied: English
Submission Number: 4567