Making progress in Trustworthy AI using DeepMind’s AI Safety Gridworlds

Ahmed Ghoor; Jonathan P. Shock

Published: 19 Jun 2025, Last Modified: 12 Jul 2025

Venue: 4th Muslims in ML Workshop co-located with ICML 2025 (Poster)

License: CC BY 4.0

Submission Track: Track 1: Machine Learning Research by Muslim Authors

Keywords: Reinforcement Learning, AI Safety

Abstract: DeepMind's AI Safety Gridworlds are a suite of environments intended to facilitate research on and development of safe artificial intelligence. They encapsulate simplified yet meaningful representations of safety challenges that real-world AI systems might encounter. This paper reviews DeepMind's accompanying paper and surveys several solutions that have been proposed for the environments.

Submission Number: 16
