NeurIPS 2024 Workshop M3L Submissions
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo, Druv Pai, Yu Bai, Jiantao Jiao, Michael Jordan, Song Mei
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Oral
Readers: Everyone
Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model
Siyu Chen, Beining Wu, Miao Lu, Zhuoran Yang, Tianhao Wang
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Oral
Readers: Everyone
Depth Extrapolation of Decoders Trained on Nested Structures
Emile R Richard
Published: 11 Oct 2024, Last Modified: 06 Dec 2024
M3L Poster
Readers: Everyone
Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
Juno Kim, Dimitri Meunier, Arthur Gretton, Taiji Suzuki, Zhu Li
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
In-Context Learning by Linear Attention: Exact Asymptotics and Experiments
Yue Lu, Mary Letey, Jacob A Zavatone-Veth, Anindita Maiti, Cengiz Pehlevan
Published: 11 Oct 2024, Last Modified: 14 Dec 2024
M3L Poster
Readers: Everyone
Understanding Diffusion-based Representation Learning via Low-Dimensional Modeling
Xiao Li, Zekai Zhang, Xiang Li, Siyi Chen, Zhihui Zhu, Peng Wang, Qing Qu
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Towards characterizing the value of edge embeddings in Graph Neural Networks
Dhruv Rohatgi, Tanya Marwah, Zachary Chase Lipton, Jianfeng Lu, Ankur Moitra, Andrej Risteski
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Oral
Readers: Everyone
How do students become teachers: A dynamical analysis for two-layer neural networks
Zhenyu Zhu, Fanghui Liu, Volkan Cevher
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Mixture of Parrots: Mixtures of experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Oral
Readers: Everyone
Convergence of Distributed Adaptive Optimization with Local Updates
Ziheng Cheng, Margalit Glasgow
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Robust Feature Learning for Multi-Index Models in High Dimensions
Alireza Mousavi-Hosseini, Adel Javanmard, Murat A Erdogdu
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
Emergence in non-neural models: grokking modular arithmetic via average gradient outer product
Neil Rohit Mallinar, Daniel Beaglehole, Libin Zhu, Adityanarayanan Radhakrishnan, Parthe Pandit, Mikhail Belkin
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization
Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Geometric Deep Learning with Quasiconformal Neural Networks: An Introduction
Nico Alvarado, Hans Lobel
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Yuda Song, Hanlin Zhang, Carson Eisenach, Sham M. Kakade, Dean Foster, Udaya Ghai
Published: 11 Oct 2024, Last Modified: 03 Dec 2024
M3L Poster
Readers: Everyone
A Theoretical Framework for Federated Domain Generalization with Gradient Alignment
Mahdiyar Molahasani, Milad Soltany, Farhad Pourpanah, Michael Greenspan, Ali Etemad
Published: 11 Oct 2024, Last Modified: 25 Nov 2024
M3L Poster
Readers: Everyone
Towards Principled Graph Transformers
Luis Müller, Daniel Kusuma, Blai Bonet, Christopher Morris
Published: 11 Oct 2024, Last Modified: 04 Dec 2024
M3L Poster
Readers: Everyone
A Theory of Initialisation's Impact on Specialisation
Devon Jarvis, Sebastian Lee, Clémentine Carla Juliette Dominé, Andrew M Saxe, Stefano Sarao Mannelli
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien Siems, Jörg K.H. Franke, Arber Zela, Frank Hutter, Massimiliano Pontil
Published: 11 Oct 2024, Last Modified: 28 Feb 2025
M3L Oral
Readers: Everyone
How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework
Yinuo Ren, Haoxuan Chen, Grant M. Rotskoff, Lexing Ying
Published: 11 Oct 2024, Last Modified: 11 Nov 2024
M3L Poster
Readers: Everyone
The GAN is dead; long live the GAN! A Modern GAN Baseline
Nick Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
On Your Mark, Get Set, Warmup!
Dayal Singh Kalra, Maissam Barkeshli
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Continuous-Time Analysis of Adaptive Optimization and Normalization
Rhys Gould, Hidenori Tanaka
Published: 11 Oct 2024, Last Modified: 09 Dec 2024
M3L Poster
Readers: Everyone
Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos
Dayal Singh Kalra, Tianyu He, Maissam Barkeshli
Published: 11 Oct 2024, Last Modified: 10 Nov 2024
M3L Poster
Readers: Everyone
Does Machine Bring in Extra Bias in Learning? Approximating Discrimination Within Models Quickly
Yijun Bian, Yujie Luo, Ping Xu
Published: 11 Oct 2024, Last Modified: 11 Nov 2024
M3L Poster
Readers: Everyone