William Saunders

Researcher, Alignment Science, Anthropic
Joined January 2023

Names

William Saunders

Emails

****@williamsaunders.net (Confirmed)
****@anthropic.com (Confirmed)

Personal Links

Career & Education History

Researcher
Alignment Science, Anthropic (anthropic.com)
2025 – Present
Researcher
Model Evaluation and Threat Research (metr.org)
2024 – 2025
Researcher
OpenAI (openai.com)
2021 – 2024

Advisors, Relations & Conflicts

Coworker
2021 – 2024
Coauthor
2017 – 2021
Coworker
2017 – 2021

Expertise

interpretability
2022 – Present
language models
2020 – 2024
scalable oversight
debate
critiques
2020 – 2022

Publications

View all 13 publications

Co-Authors