Paper Link: https://openreview.net/forum?id=72ufkeMgpvM
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: We propose a new framework, SIDeQ, that enables non-experts to indirectly annotate the meanings of natural language utterances by answering Simple Informative Denotation Questions. We take Text-to-SQL as a case study. Given a natural-language database query, SIDeQ generates a prior over SQL candidates by running a seed semantic parser (e.g., Codex), but it does not show these candidates to the annotators. Instead, it asks them to evaluate the natural-language query on various concrete databases and upweights the candidates that are consistent with their responses. For efficient interactions, we synthesize these databases to maximize the expected information gain of knowing the correct evaluations, while keeping the question simple by reducing the database size. We build an interface based on SIDeQ and recruit non-experts to annotate a random subset of 240 utterances from the SPIDER development set. Our system with non-experts achieves the same annotation accuracy as the original SPIDER expert annotators (75%) and significantly outperforms the top-1 accuracy of Codex (59%). Finally, we analyze common mistakes by database experts without SIDeQ and those by non-experts unfamiliar with databases.
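The selection-and-update loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: candidates are modeled as plain Python functions from a toy database (a tuple of ages) to a denotation, the annotator's answer is treated as noise-free so inconsistent candidates are dropped outright rather than merely down-weighted, and the preference for small databases is not modeled. All names (`expected_information_gain`, `pick_database`, `update`) are hypothetical. Under the noise-free assumption, the expected information gain of showing a database equals the entropy of the denotations the candidates produce on it, weighted by the prior.

```python
import math
from collections import defaultdict


def answer_distribution(prior, candidates, db):
    """Probability of each possible denotation on `db` under the candidate prior."""
    dist = defaultdict(float)
    for p, cand in zip(prior, candidates):
        dist[cand(db)] += p
    return dist


def expected_information_gain(prior, candidates, db):
    """Entropy (in bits) of the answer distribution.

    Assumption: the annotator's answer is deterministic given the correct
    candidate, so the expected information gain equals this entropy.
    """
    dist = answer_distribution(prior, candidates, db)
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def pick_database(prior, candidates, databases):
    """Choose the database whose answer is expected to be most informative."""
    return max(
        databases,
        key=lambda db: expected_information_gain(prior, candidates, db),
    )


def update(prior, candidates, db, observed):
    """Keep only candidates consistent with the observed answer, then renormalize."""
    weights = [p if cand(db) == observed else 0.0
               for p, cand in zip(prior, candidates)]
    total = sum(weights)
    if total == 0:
        # No candidate matches the answer; leave the prior unchanged.
        return list(prior)
    return [w / total for w in weights]


# Hypothetical candidate readings of "How many people are older than 30?"
candidates = [
    lambda db: sum(1 for age in db if age > 30),   # strictly older than 30
    lambda db: sum(1 for age in db if age >= 30),  # 30 or older
    lambda db: len(db),                            # ignores the filter
]
prior = [0.5, 0.3, 0.2]

# Toy databases: each is a tuple of ages.
databases = [(40, 50), (30, 40), (20, 30, 40)]

best_db = pick_database(prior, candidates, databases)
# Suppose the annotator says exactly one person in `best_db` is older than 30.
posterior = update(prior, candidates, best_db, observed=1)
```

In this toy setup, the database `(40, 50)` is uninformative because all three candidates return the same count, whereas `(20, 30, 40)` separates all three, so a single answer is enough to single out one candidate.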