LLaVA-LE: Large Language-and-Vision Assistant for Lunar Exploration

Published: 26 Apr 2026, Last Modified: 26 Apr 2026, Venue: AI4Space, Readers: Everyone, License: CC BY 4.0
Keywords: Multimodal, LLaVA, Space Exploration
TL;DR: A conversational AI assistant for lunar surface characterization
Abstract: Recent advances in multimodal vision–language models (VLMs) have enabled joint reasoning over visual and textual information, yet their application to planetary science remains largely unexplored. A key obstacle is the absence of large-scale datasets pairing real planetary imagery with detailed scientific descriptions. In this work, we introduce **LLaVA-LE** (Large Language-and-Vision Assistant for Lunar Exploration), a vision–language model specialized for lunar surface and subsurface characterization. To enable this capability, we curate a new large-scale multimodal lunar dataset, **LUCID** (**LU**nar **C**aption **I**mage **D**ataset), consisting of **96k** high-resolution panchromatic images paired with detailed captions describing lunar terrain characteristics, and **81k** question-answer (QA) pairs derived from $\sim$20k of those images. Leveraging this dataset, we fine-tune LLaVA using a two-stage training curriculum: (1) concept alignment for domain-specific terrain description, and (2) instruction-tuned visual question answering. We further design evaluation benchmarks spanning multiple levels of reasoning complexity relevant to lunar terrain analysis. When evaluated with GPT and Gemini as judges, LLaVA-LE achieves a **3.3x** overall performance gain over Base LLaVA and **2.1x** over our Stage 1 model, with a reasoning score of **1.070**, _exceeding the judge's own reference score_. These results highlight the effectiveness of domain-specific multimodal data and instruction tuning for advancing VLMs in planetary exploration.
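A score above 1.0 is easiest to read as a ratio against the judge's reference. The sketch below is one plausible reading, not the paper's confirmed protocol: it assumes the judge rates both the model's answers and its own reference answers on the same numeric scale, and that the reported score is the ratio of the two means. The function name `relative_score` and the ratings are hypothetical and used only for illustration.

```python
from statistics import mean


def relative_score(candidate_scores, reference_scores):
    """Ratio of the mean judge rating for the model's answers to the
    mean judge rating for the reference answers (assumed definition)."""
    if len(candidate_scores) != len(reference_scores):
        raise ValueError("expected one reference rating per candidate rating")
    ref_mean = mean(reference_scores)
    if ref_mean == 0:
        raise ValueError("reference ratings must not average to zero")
    return mean(candidate_scores) / ref_mean


# Hypothetical judge ratings on a 1-10 scale, not taken from the paper.
candidate = [9, 8, 9, 10]   # ratings given to the model's answers
reference = [8, 8, 9, 9]    # ratings given to the judge's own reference answers

score = relative_score(candidate, reference)
exceeds_reference = score > 1.0  # True when the model is rated above the reference
```

Under this reading, a value such as 1.070 would mean the model's answers were rated about 7% higher on average than the reference answers; a multiplicative gain such as 3.3x would be the ratio of two such scores.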
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 10