William Saunders

Researcher, Alignment Science, Anthropic

Researcher, Model Evaluation and Threat Research

  • Joined January 2023

Names

William Saunders

Emails

****@williamsaunders.net (Confirmed), ****@anthropic.com (Confirmed)

Career & Education History

Researcher
Alignment Science, Anthropic (anthropic.com)
2025-Present
 
Researcher
Model Evaluation and Threat Research (metr.org)
2024-2025
 
Researcher
OpenAI (openai.com)
2021-2024
 

Advisors, Relations & Conflicts

Coworker
2021-2024
 
Coauthor
2017-2021
 
Coworker
2017-2021
 

Expertise

interpretability
2022-Present
 
language models
2020-2024
 
scalable oversight, debate, critiques
2020-2022