Abstract: This paper introduces the Dynamic and Cooperative Multi-Agent Task Allocation (DC-MATA) problem, in which individually rational agents within a cooperative organization allocate tasks that change dynamically over time. DC-MATA aims to improve the Nash equilibrium over time through learning in this setting. Task utilities evolve dynamically, and learning, conducted in rounds, optimizes the agents' task-selection order to improve system performance. Our proposed DC-MATA solution approach assigns agents to the tasks with the highest utility over time and tends towards a Nash equilibrium that aligns with the agents' self-interest while narrowing the gap with the system optimum. We propose a priority-sensitive reward function and four action-sampling algorithms (ε-greedy, ε-decay, Adapted Simulated Annealing, and Prior Sequence-Aware Sampling, PSAS) within a Markov decision process (MDP) framework. Simulation experiments on our newly proposed benchmark instances, released on GitHub, confirm robust performance and efficient task allocation in the DC-MATA scenario.
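For orientation only, the following is a minimal sketch of generic ε-greedy action sampling over task indices, not the paper's implementation; the tabular utility estimate `Q`, the function name, and the fixed selection order are illustrative assumptions.

```python
import random

def epsilon_greedy_task(Q, agent, available_tasks, epsilon=0.1):
    """Pick a task for `agent`: explore uniformly with probability epsilon,
    otherwise exploit the task with the highest estimated utility."""
    if random.random() < epsilon:
        return random.choice(available_tasks)               # explore
    return max(available_tasks, key=lambda t: Q[agent][t])  # exploit

# Illustrative allocation round under an assumed fixed selection order.
Q = {0: {0: 0.4, 1: 0.9}, 1: {0: 0.7, 1: 0.2}}  # hypothetical utility estimates
remaining = [0, 1]
for agent in (0, 1):                             # assumed selection order
    task = epsilon_greedy_task(Q, agent, remaining, epsilon=0.2)
    remaining.remove(task)
    print(f"agent {agent} -> task {task}")
```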