Abstract: Interactive storytelling between parents and children is a common real-world activity, through which parents aim to teach children both language skills and real-world knowledge beyond the story narratives.
While a growing number of AI-assisted storytelling systems have been developed and used in children's story-based interaction and learning scenarios, existing systems often fall short of generating conversation infused with real-world knowledge, and thus fail to meet parents' practical expectations of interactive storytelling. A foremost reason is that the question-answering (QA) datasets these systems build on focus mainly on knowledge answerable within the story content.
To bridge this gap, we designed an annotation framework powered by a real-world knowledge graph that facilitates expert annotation while capturing the annotators' mental procedures.
Further, we leveraged this annotation framework to build StorySpark, a dataset of 5,868 expert-annotated QA pairs containing real-world knowledge beyond the story context.
A comprehensive benchmarking experiment, including both automated and human expert evaluation across various QA pair generation (QAG) settings, demonstrates the utility of StorySpark for the story-based knowledgeable QAG task.
Notably, a traditional compact model fine-tuned on StorySpark can reliably outperform strong LLMs, further highlighting the complexity of such real-world tasks.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: NLP datasets, educational applications, benchmarking, question generation, interactive storytelling
Contribution Types: Data resources
Languages Studied: English
Submission Number: 4567