MLSys 2025 Conference Submissions
TurboAttention: Efficient attention approximation for high throughputs llm
Hao Kang, Srikant Bharadwaj, James Hensman, Tushar Krishna, Victor Rühle, Saravan Rajmohan
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025 with shepherding
Readers: Everyone
Supply-Chain Attacks in Machine Learning Frameworks
Yue Gao, Ilia Shumailov, Kassem Fawaz
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
ScaleFusion: Scalable Inference of Spatial-Temporal Diffusion Transformers for High-Resolution Long Video Generation
Jiacheng Yang, Jun Wu, Zhen Zhang, Xinwei Fu, Zhiying Xu, Zhen Jia, Yida Wang, Gennady Pekhimenko
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation
Zhiyu Mei, Wei Fu, Kaiwei Li, Guangju Wang, Huanchen Zhang, Yi Wu
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025 with shepherding
Readers: Everyone
DiffServe: Efficiently Serving Text-to-Image Diffusion Models with Query-Aware Model Scaling
Sohaib Ahmad, Qizheng Yang, Haoliang Wang, Ramesh K. Sitaraman, Hui Guan
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
A Bring-Your-Own-Model Approach for ML-Driven Storage Placement in Warehouse-Scale Computers
Chenxi Yang, Yan Li, Martin Maas, Mustafa Uysal, Ubaid Ullah Hafeez, Arif Merchant, Richard McDougall
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
PipeFill: Using GPUs During Bubbles in Pipeline-parallel LLM Training
Daiyaan Arfeen, Zhen Zhang, Xinwei Fu, Gregory Ganger, Yida Wang
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators
Beichen Huang, Yueming Yuan, Zelei Shao, Minjia Zhang
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Graph Learning at Scale: Characterizing and Optimizing Pre-Propagation GNNs
Zichao Yue, Chenhui Deng, Zhiru Zhang
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
Mingyu Liang, Hiwot Tadese Kassa, Wenyin Fu, Brian Coutinho, Louis Feng, Christina Delimitrou
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Know Where You’re Uncertain When Planning with Multimodal Foundation Models: A Formal Framework
Neel P. Bhatt, Yunhao Yang, Rohan Siva, Daniel Milan, Ufuk Topcu, Zhangyang Wang
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
FlexInfer: Flexible LLM Inference with CPU Computations
Seonjin Na, Geonhwa Jeong, Byung Hoon Ahn, Aaron Jezghani, Jeffrey Young, Christopher J. Hughes, Tushar Krishna, Hyesoon Kim
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Efficient On-Device Machine Learning with a Biologically-Plausible Forward-Only Algorithm
Baichuan Huang, Amir Aminifar
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Seesaw: High-throughput LLM Inference via Model Re-sharding
Qidong Su, Wei Zhao, Xin Li, Muralidhar Andoorveedu, Chenhao Jiang, Zhanda Zhu, Kevin Song, Christina Giannoula, Gennady Pekhimenko
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
FedProphet: Memory-Efficient Federated Adversarial Training via Robust and Consistent Cascade Learning
Minxue Tang, Yitu Wang, Jingyang Zhang, Louis DiValentin, Aolin Ding, Amin Hass, Yiran Chen, Hai Li
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
Yujun Lin, Haotian Tang, Shang Yang, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
Shang Yang, Junxian Guo, Haotian Tang, Qinghao Hu, Guangxuan Xiao, Jiaming Tang, Yujun Lin, Zhijian Liu, Yao Lu, Song Han
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds
Yinfang Chen, Manish Shetty, Gagan Somashekar, Minghua Ma, Yogesh Simmhan, Jonathan Mace, Chetan Bansal, Rujia Wang, Saravan Rajmohan
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
FastTree: Optimizing Attention Kernel and Runtime for Tree-Structured LLM Inference
Zaifeng Pan, Yitong Ding, Yue Guan, Zheng Wang, Zhongkai Yu, Xulong Tang, Yida Wang, Yufei Ding
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
Interference-aware Edge Runtime Prediction with Conformal Matrix Completion
Tianshu Huang, Arjun Ramesh, Emily Ruppel, Nuno Pereira, Anthony Rowe, Carlee Joe-Wong
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution
Zhiqiang Xie, Hao Kang, Ying Sheng, Tushar Krishna, Kayvon Fatahalian, Christos Kozyrakis
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025 with shepherding
Readers: Everyone
On Distributed Larger-Than-Memory Subset Selection With Pairwise Submodular Functions
Maximilian Böther, Abraham Sebastian, Pranjal Awasthi, Ana Klimovic, Srikumar Ramalingam
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
The Hidden Bloat in Machine Learning Systems
Huaifeng Zhang, Ahmed Ali-Eldin
Published: 11 Feb 2025, Last Modified: 16 May 2025
MLSys 2025
Readers: Everyone
COMET: Fine-grained Computation-communication Overlapping for Mixture-of-Experts
Shulai Zhang, Ningxin Zheng, Haibin Lin, Ziheng Jiang, Wenlei Bao, Chengquan Jiang, Qi Hou, Weihao Cui, Size Zheng, Li-Wen Chang, Quan Chen, Xin Liu
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone
APOLLO: SGD-like Memory, AdamW-level Performance
Hanqing Zhu, Zhenyu Zhang, Wenyan Cong, Xi Liu, Sem Park, Vikas Chandra, Bo Long, David Z. Pan, Zhangyang Wang, Jinwon Lee
Published: 11 Feb 2025, Last Modified: 13 May 2025
MLSys 2025
Readers: Everyone