FROST: Factual Reasoning via Optimized Stochastic Trajectories in Large Language Models during Inference
Keywords: hallucination, exploration, dual-process reasoning, chain-of-thought, high-entropy generation, small model ensembles, information theory
Abstract: Large language models face a trade-off between factual consistency and reasoning
diversity: deterministic decoding prioritizes reliability but may miss alternative
solution paths, while high-temperature sampling increases exploration at the cost
of accuracy. We present FROST (Factual Reasoning via Optimized Stochastic
Trajectories), an inference-time framework that balances exploration and
exploitation without additional training or context augmentation. FROST combines
deterministic inference from a large model with targeted stochastic sampling from
a smaller model, selecting outputs via multi-criteria validation over coherence,
factual grounding, and semantic novelty. Across HotpotQA, CommonsenseQA, and
MMLU, FROST achieves improvements of 2--5 percentage points over standard chain-of-thought
(CoT) prompting and reduces unsupported outputs by 40\% relative to that baseline. Compared
to Self-Consistency ensembles, FROST delivers comparable accuracy at 28\% lower
inference cost through strategic delegation to smaller models. On an adversarial
subset with unanswerable queries, FROST abstains on 34\% of cases versus 8\% for
standard chain-of-thought, reducing false positives by 45\%. Task-stratified
evaluation shows that exploration benefits scale with problem ambiguity.
Generalization to mathematical reasoning, code generation, and multimodal domains
remains future work.
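
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the criteria are scored or combined, so the weighted sum, the abstention threshold, and every name below (Candidate, select_output, large_greedy, small_sampler, abstain_below) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    text: str
    source: str  # "large-greedy" or "small-sampled"


# Hypothetical scorer signature: (question, candidate, baseline) -> score in [0, 1].
# The baseline is the large model's deterministic answer, so a novelty scorer
# can measure how far a sampled candidate departs from it.
Scorer = Callable[[str, str, str], float]


def select_output(
    question: str,
    large_greedy: Callable[[str], str],   # assumed: deterministic large-model call
    small_sampler: Callable[[str], str],  # assumed: stochastic small-model call
    scorers: Sequence[Scorer],            # e.g. coherence, grounding, novelty
    weights: Sequence[float],
    n_samples: int = 4,
    abstain_below: float = 0.5,           # assumed threshold, not from the paper
) -> Optional[Candidate]:
    """Return the best-scoring candidate, or None to abstain."""
    baseline = large_greedy(question)
    candidates = [Candidate(baseline, "large-greedy")]
    candidates += [
        Candidate(small_sampler(question), "small-sampled")
        for _ in range(n_samples)
    ]

    def total(c: Candidate) -> float:
        # Weighted sum is one possible way to combine the criteria.
        return sum(w * s(question, c.text, baseline)
                   for w, s in zip(weights, scorers))

    best = max(candidates, key=total)
    return best if total(best) >= abstain_below else None
```

In this sketch the large model is called once and the smaller model supplies all of the stochastic candidates, which is one way to read the "strategic delegation" the abstract credits for the lower inference cost. Returning None stands in for abstaining on queries where no candidate validates well.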
Submission Type: Emerging
Submission Number: 251