Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA

Anonymous

08 Mar 2022 (modified: 05 May 2023) · NAACL 2022 Conference Blind Submission · Readers: Everyone
Paper Link: https://openreview.net/forum?id=MgtFuVblLw4
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: Retrieving evidence from tabular and textual resources is essential for open-domain question answering (OpenQA), since the two resources together provide more comprehensive information than either one alone. However, training an effective dense table-text retriever is difficult due to the table-text discrepancy and the limited amount of training data. In this paper, we introduce an optimized OpenQA Table-TExt Retriever (OTTER) to jointly retrieve tabular and textual evidence. To address these challenges, we exploit cross-modal connections and propose three novel solutions: modality-enhanced representation, mixed-modality negative sampling, and mixed-modality synthetic pre-training. Experimental results demonstrate that OTTER substantially improves table-text retrieval performance on the OTT-QA dataset. We further conduct comprehensive analyses to examine the effectiveness of the three mechanisms in OTTER. Moreover, equipped with OTTER, our OpenQA system achieves the state-of-the-art result on the downstream QA task, with a 10.1% absolute improvement in exact match over the previous best system. \footnote{All the code and data will be released upon acceptance.}
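
To make the retrieval setup described in the abstract concrete, below is a minimal Python sketch (not the authors' implementation) of a dual-encoder dense retriever trained with in-batch negatives that mix both modalities: linearized table rows and text passages. The encoder name, the encode and linearize_table helpers, and the toy batch are illustrative assumptions, not details taken from the paper.

# Minimal sketch of a dual-encoder retriever with mixed-modality
# in-batch negatives. All names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style encoder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
q_encoder = AutoModel.from_pretrained(MODEL_NAME)  # question encoder
c_encoder = AutoModel.from_pretrained(MODEL_NAME)  # table/text block encoder

def encode(encoder, texts):
    """Return [CLS] embeddings for a batch of strings."""
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # [CLS] vector

def linearize_table(header, row):
    """Flatten one table row into text (a simple, common linearization)."""
    return " ; ".join(f"{h} is {v}" for h, v in zip(header, row))

# Toy batch: each question is paired with one positive block; the other
# blocks in the batch (a mix of table and text) serve as negatives.
questions = [
    "Which country hosted the 2016 Summer Olympics?",
    "Who wrote the novel Dracula?",
]
blocks = [
    linearize_table(["Year", "Host city", "Country"],
                    ["2016", "Rio de Janeiro", "Brazil"]),
    "Dracula is an 1897 Gothic horror novel by Bram Stoker.",
]

q_emb = encode(q_encoder, questions)   # (B, H)
c_emb = encode(c_encoder, blocks)      # (B, H)
scores = q_emb @ c_emb.T               # in-batch similarity matrix
labels = torch.arange(len(questions))  # diagonal entries are positives
loss = F.cross_entropy(scores, labels) # contrastive (InfoNCE) objective
loss.backward()
print(f"in-batch contrastive loss: {loss.item():.4f}")

In this sketch, mixing a table block and a text passage in the same batch means each question is contrasted against negatives from both modalities; the paper's mixed-modality negative sampling and modality-enhanced representation go beyond this simple baseline.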