Keywords: large language models, forecasting, probability judgment, calibration, optimism bias, directional bias, alignment, prediction markets
TL;DR: OptimismBench measures directional probability bias in LLMs via inverted pairs. Of 17 models, 14 are significantly optimistic and 3 pessimistic, and a controlled base-vs-chat probe indicates that alignment training causally shifts this tilt.
Abstract: Large language models are increasingly used as decision aids whose probability judgments shape downstream choices. Whether those judgments carry a systematic directional tilt has been hard to detect: standard calibration metrics aggregate unsigned errors, and naturalistic uncertainty offers no ground-truth probability. When an LLM rates a startup’s success at 70% but its failure at 15%, the missing 15 points expose a distortion no aggregate score flags. We introduce OptimismBench, which detects directional bias using inverted pairs: for each scenario we elicit both P(success) and P(failure) and measure whether positive and negative framings are treated symmetrically. Across 17 models from 8 providers, fourteen exhibit significant optimism and three exhibit pessimism. The pattern is stable under prompt and temperature ablations. An eleven-model, six-language probe shows that inter-model variance is 3.4× inter-language variance and that bias magnitude correlates with cross-lingual stability (r = 0.61). Smaller and base-stage models are more optimistic, and a four-pair controlled base-versus-chat probe provides causal evidence that alignment training attenuates optimism. When alignment makes a model more helpful, it also tilts its probabilities; downstream pipelines inherit the tilt by default.
Submission Number: 121
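The inverted-pair idea in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible score, the additivity gap 1 - (P(success) + P(failure)), which is zero when the two framings are treated symmetrically. The function names `additivity_gap` and `mean_gap` are hypothetical, and the benchmark's actual scoring rule, including which sign it labels as optimism, is not given here and may differ.

```python
from statistics import mean
from typing import Iterable, Tuple


def additivity_gap(p_success: float, p_failure: float) -> float:
    """Return 1 - (P(success) + P(failure)) for one inverted pair.

    A coherent forecaster gives complementary answers, so the gap is 0.
    A nonzero gap means the two framings were not treated symmetrically.
    This is an illustrative score, not necessarily the benchmark's metric.
    """
    for p in (p_success, p_failure):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return 1.0 - (p_success + p_failure)


def mean_gap(pairs: Iterable[Tuple[float, float]]) -> float:
    """Average the additivity gap over (P(success), P(failure)) pairs."""
    return mean(additivity_gap(s, f) for s, f in pairs)


if __name__ == "__main__":
    # The abstract's example: success rated 70%, failure rated 15%.
    print(round(additivity_gap(0.70, 0.15), 2))
```

For the startup example in the abstract, the gap is the "missing 15 points". Averaging the gap over many scenarios gives one per-model summary, though a signed, per-framing analysis would be needed to separate optimism from pessimism.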