Trust in a Multi-Agent System: Using Natural Language Rules and a Policing Agent to Encourage Trustworthy Behavior

Published: 31 Mar 2026, Last Modified: 31 Mar 2026 · ARMS 2026 Oral · CC BY 4.0
Keywords: trust, multi-agent system, rules, consequences, LLM, agentic, AI, simulation, penalty, policing
Abstract: This paper explores how to promote trustworthy behavior between LLM-based agents within a multi-agent system, where each agent uses an LLM to decide how to act in a given scenario. We propose two novel methods to encourage the desired behavior. The first defines rules and consequences in natural language governing how agents should conduct themselves in the environment. The second introduces a policing agent that penalizes agents for violations or harms inflicted upon another agent. Using the ARGoS simulator [20] and LLM models available in LM Studio [23], experiments were performed to evaluate the effectiveness of these methods in a variety of scenarios.
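
To make the two methods concrete, below is a minimal sketch (not the authors' code) of how they could be wired together: natural-language rules and consequences placed in each agent's system prompt, and a policing agent that reviews interactions and issues penalties. It assumes LM Studio's local OpenAI-compatible server at its default address; the rule text, penalty values, and model name are illustrative placeholders.

```python
# Sketch of (1) natural-language rules/consequences in the agent prompt and
# (2) a policing agent that penalizes violations. Assumes LM Studio is running
# its OpenAI-compatible server locally; model name is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
MODEL = "local-model"  # placeholder: whatever model is loaded in LM Studio

# Method 1: rules and consequences stated in plain natural language.
RULES = (
    "Rules: (1) Do not take resources already claimed by another agent. "
    "(2) Do not block another agent's path. "
    "Consequence: each violation costs 10 reward points."
)

def agent_decide(observation: str) -> str:
    """An agent asks the LLM how to act, with the rules in its system prompt."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": f"You are an agent in a shared environment. {RULES}"},
            {"role": "user",
             "content": f"Observation: {observation}\nChoose one action."},
        ],
    )
    return response.choices[0].message.content

# Method 2: a policing agent that reviews an interaction report and penalizes.
def police_review(interaction_report: str) -> int:
    """Return a penalty (0 if no violation) as judged by the policing agent."""
    verdict = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": f"You are a policing agent. {RULES} "
                        "Answer only VIOLATION or OK."},
            {"role": "user", "content": interaction_report},
        ],
    ).choices[0].message.content
    return 10 if "VIOLATION" in verdict.upper() else 0
```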
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 13