Leonard Bereska

PhD student, Informatics Institute, University of Amsterdam

  • Joined December 2019

Names

Leonard Bereska

Emails

****@hotmail.de (Confirmed)
****@zi-mannheim.de (Confirmed)
****@uva.nl (Confirmed)
****@protonmail.com (Confirmed)

Career & Education History

PhD student
Informatics Institute, University of Amsterdam (uva.nl)
2021 – 2025
 
Researcher
Theoretical Neuroscience, Central Institute of Mental Health (zi-mannheim.de)
2019 – 2021
 
MS student
Physics, Heidelberg University (uni-heidelberg.de)
2016 – 2019
 
Researcher
Physics, Heidelberg University (uni-heidelberg.de)
2012 – 2016
 

Advisors, Relations & Conflicts

Coauthor
2024 – Present
 
PhD Advisor
2021 – Present
 
Coauthor
2019 – 2021
 
Coauthor
2019 – 2021
 

Expertise

ai security, adversarial robustness, interpretability, machine learning theory
2024 – Present
 
interpretability, developmental interpretability, singular learning theory, neural network dynamics, local learning coefficient
2024 – Present
 
ai safety, mechanistic interpretability, transformer models, large language models, transparency, interpretability, circuit analysis, causal interpretability
2023 – Present
 
ai safety, mechanistic interpretability, polysemanticity, superposition, neural network representations, feature disentanglement, sparse autoencoders
2023 – Present