Abstract: We propose ScheduleNet, a scalable scheduler that minimizes task completion time by coordinating multiple agents. We formulate the min-max Multiple Traveling Salesmen Problem (mTSP) as a Markov decision process with an episodic reward and derive a scalable decision-making policy using Reinforcement Learning (RL). The decision-making procedure of ScheduleNet consists of (1) representing the state of a problem as an agent-task graph, (2) extracting node embeddings for agents and tasks with type-aware graph attention, and (3) computing the task assignment probability from the computed node embeddings. We show that ScheduleNet can outperform heuristic approaches and existing deep RL approaches, with its advantage most evident on large and practical problems. We also confirm that ScheduleNet can effectively solve practical mTSP variants, including limited-observation and online mTSP.
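To make the min-max objective concrete, the following is a minimal sketch of how the cost of a candidate mTSP solution could be evaluated. It assumes 2D Euclidean cities and a single shared depot, and the names `tour_length` and `makespan` are illustrative rather than taken from ScheduleNet. The min-max mTSP minimizes the length of the longest tour over all salesmen, which the sketch computes for a fixed assignment.

```python
import math


def tour_length(depot, cities):
    """Length of a closed tour: depot -> cities in order -> depot."""
    path = [depot, *cities, depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def makespan(depot, tours):
    """Min-max mTSP objective: the length of the longest salesman tour."""
    return max(tour_length(depot, t) for t in tours)


# Hypothetical instance: one depot, two salesmen, three cities.
depot = (0.0, 0.0)
tours = [
    [(3.0, 0.0), (3.0, 4.0)],  # salesman 0 visits two cities
    [(0.0, -1.0)],             # salesman 1 visits one city
]

print(makespan(depot, tours))  # longest tour is salesman 0's: 3 + 4 + 5 = 12.0
```

A scheduler for this problem searches over assignments and visiting orders to make this value as small as possible; the sketch only scores a given solution and does not implement any learning or search.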