Improved Cooperation by Exploiting a Common Signal

Published: 03 Feb 2021, Last Modified: 29 Sept 2024, OpenReview Archive Direct Upload, Everyone, CC BY 4.0
Abstract: Can artificial agents benefit from human conventions? Human societies manage to successfully self-organize and resolve the tragedy of the commons in common-pool resources, in spite of the bleak prediction of non-cooperative game theory. On top of that, real-world problems are inherently large-scale and of low observability. One key concept that facilitates human coordination in such settings is the use of conventions. Inspired by human behavior, we investigate the learning dynamics and emergence of temporal conventions, focusing on common-pool resources. Extra emphasis was given in designing a realistic evaluation setting: (a) environment dynamics are modeled on real-world fisheries, (b) we assume decentralized learning, where agents can observe only their own history, and (c) we run large-scale simulations (up to 64 agents). Uncoupled policies and low observability make cooperation hard to achieve; as the number of agents grows, the probability of taking a correct gradient direction decreases exponentially. By introducing an arbitrary common signal (e.g., date, time, or any periodic set of numbers) as a means to couple the learning process, we show that temporal conventions can emerge and agents reach sustainable harvesting strategies. The introduction of the signal consistently improves the social welfare (by 258% on average, up to 3306%), the range of environmental parameters where sustainability can be achieved (by 46% on average, up to 300%), and the convergence speed in low abundance settings (by 13% on average, up to 53%).