ICML 2025 Workshop HiLD Submissions
Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling
Teodora Srećković, Jonas Geiping, Antonio Orvieto
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Input differentiation via negative computation
Linghao Kong, Angelina Ning, Nir N Shavit
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Learning how to step in gradient-based optimization: beyond convexity and smoothness
Dravyansh Sharma
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
On Generalization of Spectral Gradient Descent: A Case Study on Imbalanced Data
Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
The Interplay Between Implicit Bias and Adversarial Robustness in Linear Convolutional Neural Networks
Aurélien Boland, Hannah Pinson
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Understanding Generalization in Diffusion Models via Probability Flow Distance
Huijie Zhang, Zijian Huang, Siyi Chen, Jinfan Zhou, Zekai Zhang, Peng Wang, Qing Qu
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Towards the Optimal Control Perspective of ResNet Training
Jens Püttschneider, Simon Heilig, Asja Fischer, Timm Faulwasser
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
The Nuclear Route: Sharp Asymptotics of ERM in Overparameterized Quadratic Networks
Vittorio Erba, Emanuele Troiani, Lenka Zdeborova, Florent Krzakala
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning
Junsoo Oh, Jerry Song, Chulhee Yun
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Emergent Linear Separability of Unseen Data Points in High-dimensional Last-Layer Feature Space
Taehun Cha, Donghun Lee
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Grokking and Generalization Collapse: Insights from HTSR theory
Hari Kishan Prakash, charles h martin
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Different simultaneous mechanisms for in-context recall have distinct learning dynamics
Sultan Daniels, Dylan Davis, Dhruv Gautam, Wentinn Liao, Gireeja Ranade, Anant Sahai
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
On the Performance of Differentially Private Optimization with Heavy-Tail Class Imbalance
Qiaoyue Tang, Alain Zhiyanov, Mathias Lécuyer
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Exploration Behavior of Untrained Policies
Jacob Adamczyk
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
Minhak Song, Beomhan Baek, Kwangjun Ahn, Chulhee Yun
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Understanding Normalization Layers for Sparse Training
Mohammed Adnan, Ekansh Sharma, Rahul Krishnan, Yani Ioannou
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Revisiting the Goldilocks Zone in Inhomogeneous Networks
Zacharie Garnier Cuchet, Sarath Chandar, Ekaterina Lobacheva
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Tracing the representation geometry of language models from pretraining to post-training
Melody Zixuan Li, Kumar Krishna Agrawal, Arna Ghosh, Komal Kumar Teru, Guillaume Lajoie, Blake Aaron Richards
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
A simple connection from loss flatness to compressed neural representations
Shirui Chen, Stefano Recanatesi, Eric Todd SheaBrown
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Data-Free Transformer Quantization Using Parameter-Space Symmetry
Lucas Laird, Bo Zhao, Rose Yu, Robin Walters
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Pulkit Gopalani, Wei Hu
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD
Tongcheng Zhang, Zhanpeng Zhou, Mingze Wang, Andi Han, Wei Huang, Taiji Suzuki, Junchi Yan
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Risk Phase Transitions in Spiked Regression: Alignment Driven Benign and Catastrophic Overfitting
Jiping Li, Rishi Sonthalia
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Oral
Readers: Everyone
In Search of Adam’s Secret Sauce
Antonio Orvieto, Robert M. Gower
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone
Origins of Creativity in Attention Based Diffusion Models
Emma Lucia Byrnes Finn, T. Anderson Keller, Manos Theodosis, Demba E. Ba
Published: 09 Jun 2025, Last Modified: 09 Jun 2025
HiLD at ICML 2025 Poster
Readers: Everyone