Abstract: AI systems for software development are rapidly gaining prominence, yet significant challenges remain in ensuring their safety. To address this, Amazon launched
the Trusted AI track of the Amazon Nova AI Challenge, a global competition
among 10 university teams to drive advances in secure AI. In the challenge, five
teams focus on developing automated red-teaming bots, while the other five create
safe AI assistants. This challenge gives teams a unique platform to evaluate automated red-teaming and safety alignment methods through head-to-head adversarial tournaments in which red teams hold multi-turn conversations with the competing AI coding assistants to probe their defenses. The challenge also provides teams with a feed of high-quality annotated data to fuel
iterative improvement. Throughout the challenge, teams developed state-of-the-art
techniques, introducing novel approaches in reasoning-based safety alignment,
robust model guardrails, multi-turn jailbreaking, and efficient probing of large
language models (LLMs). To support these efforts, the Amazon Nova AI Challenge
team made substantial scientific and engineering investments, including building a
custom baseline coding specialist model for the challenge from scratch, developing
a tournament orchestration service, and creating an evaluation harness. This paper
outlines the advancements made by university teams and the Amazon Nova AI
Challenge team in addressing the safety challenges of AI for software development,
highlighting this collaborative effort to raise the bar for AI safety.