Abstract: Script knowledge (Schank and Abelson, 1975)
has long been recognized as crucial for language understanding because it helps fill in
unstated information in a narrative. However,
such knowledge is expensive to produce manually and difficult to induce from text due to
reporting bias (Gordon and Van Durme, 2013).
In this work, we are interested in the scientific
question of whether explicit script knowledge
is present and accessible through pre-trained
generative language models (LMs). To this
end, we introduce the task of generating full
event sequence descriptions (ESDs) given a scenario as a natural language prompt. Through
zero-shot probing, we find that generative LMs
produce poor ESDs with mostly omitted, irrelevant, repeated or misordered events. To address
this, we propose a pipeline-based script induction framework (SIF) that can generate good-quality
ESDs for unseen scenarios (e.g., bake
a cake). SIF is a two-stage framework. The first stage
fine-tunes an LM on a small set of ESD examples.
In the second stage, the ESD generated for an unseen scenario is post-processed
using RoBERTa-based models to filter irrelevant events, remove repetitions, and reorder
temporally misordered events. Through automatic and manual evaluations, we demonstrate
that SIF yields substantial improvements (1-3
BLEU points) over a fine-tuned LM. However,
manual analysis shows that considerable room
for improvement remains, offering a new research direction for inducing script knowledge.
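
The second-stage post-processing described above (filter irrelevant events, remove repetitions, reorder) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `postprocess_esd` and the three callables `is_relevant`, `is_repetition`, and `precedes` are hypothetical stand-ins for the RoBERTa-based models, and the reordering rule (rank each event by how many other events are predicted to come before it) is an assumption chosen for simplicity.

```python
from typing import Callable, List


def postprocess_esd(
    scenario: str,
    events: List[str],
    is_relevant: Callable[[str, str], bool],    # hypothetical relevance classifier
    is_repetition: Callable[[str, str], bool],  # hypothetical repetition detector
    precedes: Callable[[str, str], bool],       # hypothetical pairwise order classifier
) -> List[str]:
    """Filter, deduplicate, and reorder a generated event sequence."""
    # Step 1: drop events judged irrelevant to the scenario.
    kept = [e for e in events if is_relevant(scenario, e)]

    # Step 2: remove repetitions, keeping the first occurrence of each event.
    unique: List[str] = []
    for e in kept:
        if not any(is_repetition(e, u) for u in unique):
            unique.append(e)

    # Step 3: reorder. Each event is ranked by the number of other events
    # predicted to precede it; sorting by that count yields the final order.
    def rank(i: int) -> int:
        return sum(
            1 for j, other in enumerate(unique)
            if j != i and precedes(other, unique[i])
        )

    order = sorted(range(len(unique)), key=rank)
    return [unique[i] for i in order]


# Toy stand-ins for the learned models (assumptions, for illustration only).
_REFERENCE = ["get the ingredients", "mix the ingredients", "put the cake in the oven"]
_INDEX = {event: position for position, event in enumerate(_REFERENCE)}


def toy_is_relevant(scenario: str, event: str) -> bool:
    return event.lower() in _INDEX


def toy_is_repetition(a: str, b: str) -> bool:
    return a.lower() == b.lower()


def toy_precedes(a: str, b: str) -> bool:
    return _INDEX[a.lower()] < _INDEX[b.lower()]


generated = [
    "mix the ingredients",
    "walk the dog",              # irrelevant
    "put the cake in the oven",
    "Mix the ingredients",       # repetition
    "get the ingredients",       # misordered
]
cleaned = postprocess_esd(
    "bake a cake", generated, toy_is_relevant, toy_is_repetition, toy_precedes
)
```

In a real system each callable would wrap a fine-tuned classifier; keeping them as plain function arguments lets the three steps be tested independently of any model.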