OpenReview
.net
Alexander Hägele
PhD student, IC, EPFL - EPF Lausanne
Joined
May 2022
Names
Alexander Hägele (Preferred), Alexander Haegele, Alexander Hagele
Emails
****@inf.ethz.ch (Confirmed), ****@gmail.com (Confirmed), ****@ethz.ch (Confirmed), ****@t-online.de (Confirmed), ****@epfl.ch (Confirmed)
Personal Links
Homepage
Google Scholar
DBLP
Semantic Scholar
Career & Education History
PhD student, IC, EPFL - EPF Lausanne (epfl.ch), 2023 – Present
Intern, Anthropic (anthropic.com), 2025 – 2025
Intern, MLR, Apple (apple.com), 2023 – 2023
MS student, CS, ETHZ - ETH Zurich (ethz.ch), 2021 – 2023
Exchange Student, École Polytechnique (polytechnique.fr), 2022 – 2022
Undergrad student, ETHZ - ETH Zurich (ethz.ch), 2017 – 2021
Exchange Student, University of Toronto (toronto.edu), 2019 – 2019
Advisors, Relations & Conflicts
PhD Advisor: Martin Jaggi, 2023 – Present
Expertise
Machine Learning
Present
Publications
The surprising agreement between convex optimization theory and learning-rate scheduling for large model training
Adrien Taylor, Fabian Schaipp, Alexander Hägele, Umut Simsekli, Francis Bach
International Conference on Machine Learning (ICML)
Readers:
Everyone
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?
Alexander Hägele, Aryo Pradipta Gema, Henry Sleight, Ethan Perez, Jascha Sohl-Dickstein
ICLR 2026 Poster
Readers:
Everyone
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments
Alejandro Hernández-Cano, Alexander Hägele, Allen Hao Huang, Angelika Romanou, Antoni-Joan Solergibert i Llaquet, Barna Pásztor, Bettina Messmer, Dhia Garbaya, Eduard Frank Durech, Ido Hakimi, Juan García Giraldo, Mete Ismayilzada, Negar Foroutan, Skander Moalla, Tiancheng Chen, Vinko Sabolcec, Yixuan Xu, Michael Aerni, Badr AlKhamissi, Ines Altemir Marinas, et al. (81 additional authors not shown)
CoRR 2025
Readers:
Everyone
Inverse Scaling in Test-Time Compute
Aryo Pradipta Gema, Alexander Hägele, Runjin Chen, Andy Arditi, Jacob Goldman-Wetzler, Kit Fraser-Taliente, Henry Sleight, Linda Petrini, Julian Michael, Beatrice Alex, Pasquale Minervini, Yanda Chen, Joe Benton, Ethan Perez
Accepted by TMLR
Readers:
Everyone
Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler
Aleksandr Dremov, Alexander Hägele, Atli Kosson, Martin Jaggi
Accepted by TMLR
Readers:
Everyone
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training
Fabian Schaipp, Alexander Hägele, Adrien Taylor, Umut Simsekli, Francis Bach
ICML 2025 poster
Readers:
Everyone
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben allal, Leandro Von Werra, Martin Jaggi
ES-FoMo-II 2024 Poster
Readers:
Everyone
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
Alexander Hägele, Elie Bakouch, Atli Kosson, Loubna Ben allal, Leandro Von Werra, Martin Jaggi
NeurIPS 2024 spotlight
Readers:
Everyone
BaCaDI: Bayesian Causal Discovery with Unknown Interventions
Alexander Hägele, Jonas Rothfuss, Lars Lorch, Vignesh Ram Somnath, Bernhard Schölkopf, Andreas Krause
Published: 01 Jan 2023, Last Modified: 10 Nov 2023
AISTATS 2023
Readers:
Everyone
BaCaDI: Bayesian Causal Discovery with Unknown Interventions
Alexander Hägele, Jonas Rothfuss, Lars Lorch, Vignesh Ram Somnath, Bernhard Schölkopf, Andreas Krause
Published: 09 Jul 2022, Last Modified: 12 Oct 2025
CRL@UAI 2022 Poster
Readers:
Everyone
Co-Authors
Adrien Taylor
Alejandro Hernández-Cano
Aleksandr Dremov
Alexander Hoyle
Alexander Sternfeld
Allen Hao Huang
Ana Klimovic
Anastasiia Kucherenko
Andreas Krause
Andreas Marfurt
Andrei Kucharavy
Andrei Panferov
Andrei Semenov
Andy Arditi
Angelika Romanou
Anna Sotnikova
Antoine Bosselut
Antoni-Joan Solergibert i Llaquet
Arnout Devos
Aryo Pradipta Gema
Atli Kosson
Auguste Poiroux
Ayush Kumar Tarun
Badr AlKhamissi
Barna Pásztor