Bayesian information theoretic model-averaging stochastic item selection for computer adaptive testing

TMLR Paper4867 Authors

15 May 2025 (modified: 24 Jul 2025) | Under review for TMLR | CC BY 4.0
Abstract: Computer Adaptive Testing (CAT) aims to accurately estimate an individual's ability using only a subset of an Item Response Theory (IRT) instrument. A secondary goal is to ensure diverse item exposure across testing sessions, preventing any single item from being over- or underutilized. In CAT, items are selected sequentially based on a running estimate of a respondent's ability. Prior methods almost universally view item selection through an optimization lens, which motivates greedy item selection procedures. While efficient, these methods tend to produce poor item exposure. Existing stochastic methods for item selection are ad hoc: their item sampling weights lack theoretical justification. In this manuscript, we formulate CAT as a Bayesian model averaging problem. At each step, we sample the next item such that the Frequentist item sampling statistics correspond to Bayesian model averaging in the space of next-item ability estimates. This view of the CAT item selection problem also defines a natural criterion, the ability discrepancy: the KL divergence between the unknown next-item ability estimate and the unknown true full item bank ability estimate. We tested our new method on the eight independent IRT models that comprise the Work Disability Functional Assessment Battery, comparing it to prior art. We found that our stochastic methodology had superior item exposure without compromising test accuracy or efficiency.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: July 24: I appended the latexdiff to the supplemental material, starting on page 6. July 7 update 2: Removed the redundant subscripts for $\pi$ and $\tilde\pi$, and made the KL divergence notation more consistent. July 7 update: I added more citations for IRT background. This revision is an extensive rewrite that restructures the manuscript to better match the usual format of ML conference papers. The Introduction in particular has been rewritten for clarity and is now less technical. The manuscript now places more consistent emphasis on the focal novelty of the method: stochastic selection where the Frequentist item statistics yield Bayesian model averaging in ability estimate space.
Assigned Action Editor: ~Shinichi_Nakajima2
Submission Number: 4867