Learning to Predict Future-Aligned Research Proposals with Language Models
Keywords: AI Scientist, LLM for Research, Hypothesis Generation
Abstract: Large language models (LLMs) are increasingly used as research assistants, but evaluating the quality of LLM-generated research proposals remains difficult: novelty, soundness, and feasibility are hard to measure automatically and typically require costly human judgment, making it unclear how to define a scalable learning objective.
We propose a verifiable alternative by reframing proposal generation as a time-sliced scientific forecasting problem.
Given a research question and inspiring papers available before a cutoff time $t_C$, the model generates a structured proposal and is evaluated by whether it anticipates research directions that appear in papers published after $t_C$.
We operationalize this objective with the Future Alignment Score (FAS), computed via retrieval and LLM-based semantic scoring against a held-out future corpus.
To train models under this objective, we construct a time-consistent dataset of 17,771 papers by converting published papers and their pre-cutoff citations into proposal targets, and synthesize reasoning traces that explicitly perform gap analysis and inspiration borrowing; we further introduce a stepwise variant that decomposes generation into problem identification, method design, and experimental planning.
Across Llama-3.1 and Qwen2.5 models, tuning on this objective improves future alignment over unaligned baselines (up to +10.6\% overall FAS), and domain-expert human evaluation corroborates the improved proposal quality.
Finally, we demonstrate practical impact by implementing two model-generated proposals with a code agent, obtaining a 4.17\% accuracy gain on MATH from a new prompting strategy and consistent improvements from a novel model-merging method.
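To make the evaluation objective concrete, the following is a minimal sketch of how a retrieve-then-judge Future Alignment Score could be computed. The helper callables (`embed_fn`, `llm_align_score`), the top-k retrieval, and the mean aggregation are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of a Future Alignment Score (FAS) computation, assuming a
# retrieve-then-judge design: retrieve papers published after the cutoff t_C,
# then ask an LLM judge how well the proposal anticipates each of them.
# embed_fn, llm_align_score, and top_k are illustrative, not the paper's code.
from dataclasses import dataclass
from typing import Callable, List
import numpy as np


@dataclass
class Paper:
    title: str
    abstract: str
    date: str  # ISO date, e.g. "2024-06-01"


def future_alignment_score(
    proposal: str,
    corpus: List[Paper],
    cutoff: str,                                    # t_C as an ISO date string
    embed_fn: Callable[[List[str]], np.ndarray],    # texts -> unit-norm vectors
    llm_align_score: Callable[[str, str], float],   # (proposal, paper) -> [0, 1]
    top_k: int = 5,
) -> float:
    """Score a proposal by its alignment with papers published after the cutoff."""
    # 1) Keep only the held-out future slice of the corpus.
    future = [p for p in corpus if p.date > cutoff]
    if not future:
        return 0.0
    # 2) Retrieve the k future papers most similar to the proposal.
    prop_vec = embed_fn([proposal])[0]
    paper_vecs = embed_fn([f"{p.title}\n{p.abstract}" for p in future])
    sims = paper_vecs @ prop_vec
    top_idx = np.argsort(-sims)[:top_k]
    # 3) LLM-based semantic scoring of each retrieved paper, aggregated here by
    #    a simple mean over the top-k (the paper may aggregate differently).
    scores = [
        llm_align_score(proposal, f"{future[i].title}\n{future[i].abstract}")
        for i in top_idx
    ]
    return float(np.mean(scores))
```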
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 167