A Pilot Benchmark for NL-to-FOL Translation in Planetary Exploration

Published: 29 May 2026, Last Modified: 29 May 2026 | ICRA 2026 Workshop on Perceptual Challenges for Planetary Exploration | Readers: Everyone | License: CC BY 4.0
Keywords: Natural Language, First-Order Logic, Planetary Exploration, Translation, LLM, Autonomous Agents
TL;DR: A small pilot benchmark for translating planetary mission text into FOL; current LLMs struggle with long, structured reasoning on this task.
Abstract: Future planetary exploration envisions autonomous robotic agents operating under severe communication constraints, without global positioning, and with minimal human intervention. In such environments, agents must not only perceive and act, but also reason over mission objectives, operational constraints, and evolving environmental conditions. While prior work has largely focused on perception and control, the translation of high-level mission knowledge into structured, machine-interpretable representations remains underexplored. We introduce a pilot benchmark for translating natural language (NL) into First-Order Logic (FOL) within the domain of planetary exploration. The dataset is constructed from real mission documentation sourced from NASA’s Planetary Data System (PDS), spanning missions from 2003 to 2013. These documents describe mission phases such as launch, boost, coast, cruise, and orbital operations in rich natural language. We manually annotate these documents with corresponding FOL representations that capture temporal structure, agent roles, and operational dependencies. In addition, we provide structured predicate vocabularies and typed constants to enable controlled experimentation with varying levels of prior knowledge. This pilot benchmark provides a foundation for research at the intersection of language understanding and formal reasoning, grounded in real-world, safety-critical mission data. The dataset is provided for anonymous review at: https://anonymous.4open.science/r/PMR-BF96/mission.json.
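Illustrative sketch: the abstract mentions predicate vocabularies and typed constants that constrain the FOL annotations. The Python snippet below shows one way such a vocabulary could be used to check a candidate translation. It is a minimal sketch under stated assumptions, not the benchmark's actual format: the schema of mission.json is not described here, and every predicate name, constant, and the example sentence/formula pair are hypothetical.

```python
import re

# Hypothetical vocabulary: predicate name -> expected argument types.
# These names are invented for illustration, not taken from the dataset.
PREDICATES = {
    "Phase": ("phase",),
    "Performs": ("agent", "phase"),
    "Before": ("phase", "phase"),
}

# Hypothetical typed constants (constant name -> type).
CONSTANTS = {
    "spacecraft": "agent",
    "launch": "phase",
    "cruise": "phase",
}

# Matches flat atoms such as Before(launch, cruise); nested terms are not handled.
ATOM = re.compile(r"([A-Z]\w*)\(([^()]*)\)")


def check_formula(formula):
    """Return a list of vocabulary and type errors for the atoms in a formula string."""
    errors = []
    for name, arg_text in ATOM.findall(formula):
        args = [a.strip() for a in arg_text.split(",") if a.strip()]
        if name not in PREDICATES:
            errors.append(f"unknown predicate: {name}")
            continue
        expected = PREDICATES[name]
        if len(args) != len(expected):
            errors.append(f"{name}: expected {len(expected)} args, got {len(args)}")
            continue
        for arg, typ in zip(args, expected):
            # Arguments that are not declared constants are treated as variables and skipped.
            if arg in CONSTANTS and CONSTANTS[arg] != typ:
                errors.append(f"{name}: {arg} is {CONSTANTS[arg]}, expected {typ}")
    return errors


# Invented NL/FOL pair in the spirit of the mission-phase descriptions.
nl = "The spacecraft performs the launch phase before the cruise phase."
fol = "Performs(spacecraft, launch) & Performs(spacecraft, cruise) & Before(launch, cruise)"

print(check_formula(fol))                           # well-typed under this vocabulary
print(check_formula("Before(spacecraft, cruise)"))  # type mismatch on the first argument
```

A check of this kind only validates surface conformance to a vocabulary; it says nothing about whether the formula is a faithful translation of the sentence.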
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 10