MARWA: Multi-agent retrieval-augmented framework for reliable bioinformatics workflow automation

ICLR 2026 Conference Submission 11000 Authors

18 Sept 2025 (modified: 18 Nov 2025), ICLR 2026 Conference Submission, Readers: Everyone, License: CC BY 4.0
Keywords: Bioinformatics; Workflow Automation; Multi-Agent Systems; Retrieval-Augmented Generation; Large Language Models
TL;DR: We present MARWA, a multi-agent retrieval-augmented framework that improves reliability of bioinformatics workflow automation.
Abstract: The rapid growth of multi-omics data has driven the expansion of bioinformatics analysis tools. Common bioinformatics tasks often rely on workflows, which link multiple tools into structured pipelines for reproducibility and scalability. Building workflows manually is slow and error-prone, which motivates efforts toward automation. However, bioinformatics workflow automation remains difficult because it requires clarifying vague analytical objectives, coordinating heterogeneous tools, and generating intricate tool commands. Large language models (LLMs) have the potential to aid bioinformatics workflow recommendation through advanced semantic understanding and logical reasoning, but current agent frameworks often rely on one-shot generation, weak tool retrieval, and limited evaluation schemes, resulting in fragile workflow automation. We propose MARWA, a Multi-Agent Retrieval-augmented framework for reliable bioinformatics Workflow Automation. The framework emphasizes a step-by-step generation process with error handling at each stage to ensure robustness. To strengthen tool command accuracy, we introduce a retrieval-augmented approach that incorporates multi-perspective LLM-augmented tool descriptions and employs contrastive learning. We further design a two-stage evaluation framework that combines expert-verified execution on 40 curated tasks with large-scale benchmarking on 2,270 tasks using LLM-based evaluation. Our experiments demonstrate that MARWA consistently outperforms baselines in pass rate, workflow quality, and scalability. Our work provides a foundation for trustworthy bioinformatics workflow automation. Project Page: https://anonymous.4open.science/r/MARWA-7D30.
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 11000