Keywords: Collaboration of Experts, DevOps, AIOps, Ensemble Learning, Retrieval-augmented Generation
Abstract: While Large Language Models (LLMs) have advanced the paradigm of AIOps, a single monolithic model struggles to cover the comprehensive DevOps lifecycle—ranging from low-level fault analysis to high-level release planning—due to domain knowledge constraints. Although ensemble learning offers a potential solution, existing approaches often lack the scalability to adapt to dynamic task shifts. To address these challenges, we propose CoE-Ops, a Collaboration-of-Experts framework designed for complex AIOps Question-Answering (QA). CoE-Ops incorporates a training-free, general-purpose LLM as a task classifier, augmented by Retrieval-Augmented Generation (RAG) to precisely route queries across heterogeneous expert models without fine-tuning. This mechanism enables robust handling of both concrete (e.g., anomaly detection) and abstract (e.g., operation) tasks. Extensive evaluations on the DevOps-Eval benchmark demonstrate that CoE-Ops significantly outperforms state-of-the-art baselines: it achieves a 72% improvement in routing accuracy for high-level tasks compared to existing CoE methods, delivers an 8% accuracy gain over the best standalone experts, and surpasses large-scale Mixture-of-Experts (MoE) models by up to 14% in overall accuracy with fewer parameters.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: Collaboration of Experts, DevOps, AIOps, Ensemble Learning, Retrieval-augmented Generation
Contribution Types: NLP engineering experiment
Languages Studied: English, Chinese
Submission Number: 2882