Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning

Published: 05 Mar 2026, Last Modified: 30 Apr 2026
Venue: ICLR 2026 Workshop LLM Reasoning
License: CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: Large Language Model, Retrieval-Augmented Generation, Unmanned Aerial Vehicle, UAV-Math-Bench
TL;DR: In this work, we explore how RAG enhances the mathematical reasoning of LLMs for UAV applications. We use state-of-the-art models and augment them with a vector-based retrieval mechanism that supplies external domain information during problem-solving.
Abstract: Autonomous UAV operation necessitates reliable mathematical reasoning for tasks such as trajectory planning and power management. While traditional flight control relies on hardcoded equations, recent Large Language Models (LLMs) offer potential for more flexible problem-solving, but struggle to reliably select and apply correct mathematical formulations and to execute precise multi-step arithmetic. We propose RAG-UAV, a retrieval-augmented generation framework designed to improve the mathematical reasoning of several LLMs (including GPT o1/Turbo, Llama-3.2/3.3, Mistral, and DeepSeek R1) in UAV-specific contexts by providing access to relevant domain literature. To conduct an initial assessment, we introduce UAV-Math-Bench, a 20-question pilot set of UAV-centric mathematical problems spanning four difficulty levels. Our experiments suggest that incorporating retrieval substantially increases exact-answer accuracy (up to 75% with o1), reduces instances of incorrect formulation selection (from 25% without RAG to 5% with RAG in a limited setting), and decreases numerical errors, reducing Mean Squared Error (MSE) by orders of magnitude for the best-performing models. This pilot study indicates that RAG can enable general-purpose LLMs to function as more reliable tools for engineering analysis, although direct real-time flight control requires further investigation and validation at larger scale. All data are available.
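To make the retrieval mechanism named in the TL;DR and abstract concrete, here is a minimal sketch of vector-based retrieval augmenting an LLM prompt, assuming cosine-similarity search over embedded snippets of UAV domain literature. The corpus, the hash-based embed stand-in, and the prompt template are hypothetical illustrations for exposition, not the RAG-UAV implementation.

```python
# Hedged sketch of the retrieval-augmentation step: embed a corpus and a
# question, fetch the most similar snippets, and prepend them to the prompt.
# CORPUS, embed, and the prompt template are hypothetical, not the paper's code.
import hashlib
import numpy as np

# Hypothetical snippets of UAV domain literature.
CORPUS = [
    "Ideal hover power for a rotor: P = T^(3/2) / sqrt(2 * rho * A).",
    "Best-range airspeed minimizes energy consumed per unit distance.",
    "Battery endurance: t = capacity (Wh) / average electrical power (W).",
]

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Deterministic stand-in embedding; a real system would use a learned encoder."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)  # unit-normalize so dot product = cosine similarity

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus snippets most similar (cosine) to the query."""
    q = embed(query)
    scores = np.array([q @ embed(doc) for doc in CORPUS])
    return [CORPUS[i] for i in np.argsort(scores)[::-1][:k]]

question = "Estimate the hover power of a 2 kg quadrotor at sea level."
context = "\n".join(retrieve(question))
# The retrieved snippets are supplied as reference material before the question,
# so the model can select the correct formulation rather than recall it unaided.
prompt = f"Reference material:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Because the embeddings are unit-normalized, the dot product is exactly cosine similarity; in practice the hash-based stand-in would be replaced by a learned text encoder and the list scan by an approximate nearest-neighbor index.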
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 6