Every Answer Counts: Efficient Entity-Centric QA by Bayesian-Guided Subquery Sampling

Binyamin Perets; Zohar Shnaider; Dvir Aran; Shie Mannor

Published: 23 Sept 2025, Last Modified: 22 Nov 2025, LAW, CC BY 4.0

Keywords: Entity-centric question answering (ECQA), LLM-based scientific discovery, multi-armed bandit, pathway enrichment analysis

TL;DR: ARISE uses a multi-armed bandit approach for efficient LLM-based entity-centric question answering.

Abstract: Entity-centric question answering (ECQA) is the problem of selecting which entities from a large, predefined set are most relevant to given observations. This represents a fundamental challenge for LLM-based scientific discovery, given that reliable answers from long, heterogeneous inputs remain largely unattainable. Current approaches rely on consensus ranking from multiple subqueries or extensive iterative validation, but these methods incur token costs that scale poorly with input complexity, leading to "token explosion." To guide this process more efficiently, we introduce ARISE (Adaptive Residual Information Sampling Engine), a framework that grounds the selection of subqueries in a formal probabilistic model. We explicitly build a Bayesian generative model for the exploration problem, reframing ECQA as a multi-armed bandit problem with side observations. Our key insight is that each query targeting a specific entity provides noisy side observations about all related entities, which can be used not only to update beliefs about those entities in a statistically grounded way, but also to improve the querying policy. ARISE employs DUETS Bandit (DUal Experts for Turbid side-Observations with Stochastic feedback graph), a novel online learning algorithm with dual advisors: a GraphExpert that leverages entity co-occurrence priors, and a NoiseExpert that strategically selects queries to maximize expected observation quality. Confirmation Atoms, a set of well-established validation processes, validate outputs and update internal beliefs. The outputs are fed into a "statistical engine" that enables statistically rigorous hypothesis testing with formal p-values. For evaluation, we use the hallmark challenge of pathway enrichment analysis on 180+ annotated gene expression datasets.
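
The idea of a bandit with side observations can be illustrated with a small sketch. This is not the DUETS Bandit algorithm described in the abstract, whose details are not given here; it is plain Thompson sampling with Beta posteriors, under the assumption that querying one entity also yields a down-weighted Bernoulli observation for each of its neighbors. The function name, the `side_weight` parameter, and the neighbor map are all hypothetical and chosen only for illustration.

```python
import random


def run_side_observation_bandit(relevance, neighbors, rounds=200,
                                side_weight=0.5, seed=0):
    """Thompson sampling over entities with Beta posteriors (illustrative only).

    relevance: list of true relevance probabilities, one per entity
               (unknown to the learner, used here only to simulate answers).
    neighbors: dict mapping an entity index to the indices of related entities.
    side_weight: assumed discount applied to side observations, reflecting
                 that they are noisier than a direct query.
    """
    rng = random.Random(seed)
    n = len(relevance)
    alpha = [1.0] * n  # pseudo-counts for "relevant"
    beta = [1.0] * n   # pseudo-counts for "not relevant"
    pulls = [0] * n

    for _ in range(rounds):
        # Draw one plausible relevance per entity and query the highest draw.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        chosen = max(range(n), key=samples.__getitem__)
        pulls[chosen] += 1

        # The queried entity gets a full-weight update; each neighbor gets
        # a down-weighted update from its own simulated side observation.
        targets = [(chosen, 1.0)] + [(k, side_weight) for k in neighbors[chosen]]
        for j, weight in targets:
            outcome = 1 if rng.random() < relevance[j] else 0
            alpha[j] += weight * outcome
            beta[j] += weight * (1 - outcome)

    posterior_mean = [alpha[i] / (alpha[i] + beta[i]) for i in range(n)]
    return posterior_mean, pulls
```

Because every query updates the neighbors as well, the posterior for an entity can sharpen without that entity ever being queried directly, which is the source of the token savings the abstract points to.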

Submission Type: Research Paper (4-9 Pages)

Submission Number: 130
