BIRCO: A Benchmark of Information Retrieval Tasks with Complex Objectives

Anonymous

16 Dec 2023 (modified: 27 Feb 2024) · ACL ARR 2023 December Blind Submission · Readers: Everyone
TL;DR: A New Information Retrieval Benchmark for Tasks with Complex Objectives, and a Unified LLM-based Framework for IR Tasks.
Abstract: We present the Benchmark of Information Retrieval (IR) tasks with Complex Objectives (BIRCO) to evaluate the ability of IR models to follow multi-faceted task objectives. We study the performance of various embedding-based, distilled, and fine-tuned IR models on BIRCO, and find them lacking. We provide a unified framework for investigating the performance of large language models (LLMs) on these tasks. The proposed framework consists of three modular components: task-objective awareness, chain-of-thought reasoning, and task decomposition. We investigate the effects of these factors on LLM performance, and identify a simple baseline model that matches or outperforms existing approaches and more complex alternatives. No approach achieves satisfactory performance on all benchmark tasks, suggesting that stronger models and new retrieval protocols are necessary to address complex user needs. https://github.com/BIRCO-benchmark/BIRCO.git
Paper Type: short
Research Area: Information Retrieval and Text Mining
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English