An LLM in Two Discovery Experiments for Extreme Astrophysics: Promising Tool and Co-author, Not Fully Independent Yet
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: astrophysics, large language models, scientific discovery, binary stars, high-energy astrophysics, machine learning
TL;DR: The same LLM acts as an autonomous tool when an astronomical question has a clean physical signal and as a hint-driven co-author when it does not; we use a paired AM CVn discovery experiment to show what distinguishes the two.
Abstract: We report two parallel discovery experiments in which a general-purpose large language model (LLM) was placed in different roles in a search for permanent high-state AM\,CVn binaries, which are ultracompact white-dwarf systems and key gravitational-wave verification sources for LISA. In Experiment A (Needle in a Box of Sticks), the model was given $29$ reduced optical spectra and, with no domain hints, autonomously authored a Python pipeline that ranked the lone AM\,CVn first by composite line-strength score ($+3.6$, vs.\ negative scores for the hydrogen-rich cataclysmic variables). In Experiment B (Needle in a Haystack), the model was given a $1{,}487{,}933$-source eROSITA$\times$Gaia catalog and, through eight rounds of one-sentence human hints encoding AM\,CVn domain priors, narrowed the catalog to $30$ ranked candidates, with a known high-state system recovered at rank $3$. We argue that the same model occupies different positions on the "tool, co-author, founder" spectrum depending on whether the inference target is well-specified by physical constraints (Experiment A) or under-determined and prior-dependent (Experiment B), and we propose that the atomicity and legibility of the human hints in Experiment B provide a useful operational definition of co-authorship. These experiments have led to a novel discovery, demonstrating that current LLMs are promising tools and co-authors for discovery in high-energy astrophysics.
Submission Number: 293