OpenReview
.net
ICML 2025 Workshop MOSS Submissions
On the Emergence of Position Bias in Transformers
Xinyi Wu, Yifei Wang, Stefanie Jegelka, Ali Jadbabaie
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Foundation Models on a Budget: Approximating Blocks in Large Vision Models
Irene Cannistraci, Simone Antonelli, Emanuele Palumbo, Thomas M. Sutter, Emanuele Rodolà, Bastian Rieck, Julia E Vogt
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Permutations as a testbed for studying the effect of input representations on learning
Sarah McGuire Scullen, Davis Brown, Robert Jasper, Henry Kvinge, Helen Jenne
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Emergence, pretraining loss and associative recall: a toy model
Sultan Daniels, Dylan Davis, Dhruv Gautam, Wentinn Liao, Gireeja Ranade, Anant Sahai
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Performance Plateaus in Inference-Time Scaling for Text-to-Image Diffusion Without External Models
Changhyun Choi, Sungha Kim, H. Jin Kim
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Extrapolation by Association: Length Generalization Transfer in Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild, Samet Oymak, Dimitris Papailiopoulos
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought
Hanlin Zhu, Shibo Hao, Zhiting Hu, Jiantao Jiao, Stuart Russell, Yuandong Tian
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025 Oral
Readers:
Everyone
CaliPSo: Calibrated Predictive Models with Sharpness as Loss Function
Alexandre Capone, Kamron Zaidi, Tianyu Xu, Brian Yang, Geoff Pleiss, Jeff Schneider
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
AdaptMI: Adaptive Skill-based In-context Math Instructions for Small Language Models
Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Understanding How Chess-Playing Language Models Compute Linear Board Representations
Aaron Mei
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Do Larger Language Models Imply Better Generalization? A Pretraining Scaling Law for Implicit Reasoning
Xinyi Wang, Shawn Tan, Mingyu Jin, William Yang Wang, Rameswar Panda, Yikang Shen
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025 Oral
Readers:
Everyone
Neural Stochastic Differential Equations on Compact State-Spaces
Yue-Jane Liu, Malinda Lu, Matthew K. Nock, Yaniv Yacoby
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Universal Dynamics of Warmup Stable Decay: understanding WSD beyond Transformers
Annalisa Belloni, Lorenzo Noci, Antonio Orvieto
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
TinyServe: Query-Aware Cache Selection for Efficient LLM Inference
Dong Liu, Yanxuan Yu
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Generative or Discriminative? Revisiting Text Classification in the Era of Transformers
Siva Rajesh Kasa, Sumegh Roychowdhury, Karan Gupta, Yaswanth Biruduraju, Santhosh Kumar Kasa, Ashutosh Kumar, Pattisapu Nikhil Priyatam, Arindam Bhattacharya, Shailendra Agarwal, Vijay Huddar
Published: 10 Jun 2025, Last Modified: 26 Nov 2025
MOSS@ICML2025
Readers:
Everyone
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
Lillian Sun, Martin Pawelczyk, Zhenting Qi, Aounon Kumar, Himabindu Lakkaraju
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025 Oral
Readers:
Everyone
Evaluating Generalization and Representation Stability in Small LMs via Prompting, Fine-Tuning and Out-of-Distribution Prompts
Rahul Raja, Arpita Vats
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Measuring Memorization and Generalization in Forecasting Models via Structured Perturbations of Chaotic Systems
Max Kanwal, Caryn Tran
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Exploring Diverse Solutions for Underdetermined Problems
Eric Volkmann, Andreas Radler, Johannes Brandstetter, Arturs Berzins
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Transformers May Learn to Classify In-Context by Context-Adaptive Kernel Gradient Descent
Sara Dragutinović, Andrew M Saxe, Aaditya K Singh
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Cross-Validation Error Dynamics in Smaller Datasets
Bethany Austhof, Lev Reyzin
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Encoding Domain Insights into Multi-modal Fusion: Improved Performance at the Cost of Robustness
Jackson Sam Michaels, Sidong Zhang, Madalina Fiterau
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
What Happens During the Loss Plateau? Understanding Abrupt Learning in Transformers
Pulkit Gopalani, Wei Hu
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone
Dataset Distillation for Memorized Data: Soft Labels can Leak Held-Out Teacher Knowledge
Freya Behrens, Lenka Zdeborova
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025 Oral
Readers:
Everyone
Efficient B-Tree Insertions Using Proximal Policy Optimization and Hierarchical Attention Models
Alexander Kastius, Nick Lechtenbörger, Felix Schulz, Johann Schulze Tast, Rainer Schlosser, Ralf Herbrich
Published: 10 Jun 2025, Last Modified: 15 Jul 2025
MOSS@ICML2025
Readers:
Everyone