ICLR 2025 Workshop SLLM Submissions
One Must Imagine Experts Happy: Rebalancing Neural Routers via Constrained Optimization
Kushal Thaman
Published: 05 Mar 2025, Last Modified: 22 Apr 2025
ClusterGen: Token Generation in Sublinear Time and Memory with Clustering KV Cache
Amir Zandieh, Insu Han, Amin Karbasi, Vahab Mirrokni
Published: 05 Mar 2025, Last Modified: 18 Apr 2025
Contextual Sparsity as a Tool for Mechanistic Understanding of Retrieval in Hybrid Foundation Models
Davide Zani, Kurt Felix Michalak, Steven Abreu
Published: 05 Mar 2025, Last Modified: 18 Apr 2025
MoE Lens - An Expert Is All You Need
Marmik Chaudhari, Idhant Gulati, Nishkal Hundia, Pranav Karra, Shivam Raval
Published: 05 Mar 2025, Last Modified: 27 Apr 2025
Sparse and Wide Linear RNNs Are at the Efficiency-Performance Pareto Front
Alessandro Pierro, Steven Abreu, Jonathan Timcheck, Philipp Stratmann, Sumit Bam Shrestha
Published: 05 Mar 2025, Last Modified: 16 Apr 2025
KurTail: Kurtosis-Based LLM Quantization
Mohammad Sadegh Akhondzadeh, Aleksandar Bojchevski, Evangelos Eleftheriou, Martino Dazzi
Published: 05 Mar 2025, Last Modified: 03 Apr 2025
Post-LoRA Restoration: Utilizing Transferability of Low-Rank Adapter in Quantized Foundation Models
Yuto Kanda, Kenji Hatano
Published: 05 Mar 2025, Last Modified: 25 Apr 2025
NoWag: A Unified Framework for Shape Preserving Compression of Large Language Models
Lawrence Ray Liu, Inesh Chakrabarti, Yixiao Li, Mengdi Wang, Tuo Zhao, Lin Yang
Published: 05 Mar 2025, Last Modified: 21 Apr 2025
Efficient Transformers via MPO-Based Low-Rank Factorization and Pruning
Sam Mikhak, Venkata Sai Gummidi, Praneeth Medepalli, Kevin Zhu
Published: 05 Mar 2025, Last Modified: 09 Apr 2025
Steering Fine-Tuning Generalization with Targeted Concept Ablation
Helena Casademunt, Caden Juang, Senthooran Rajamanoharan, Neel Nanda
Published: 05 Mar 2025, Last Modified: 17 Apr 2025
LoRAM: Low-Rank Adaptation of Large Language Models on Manifold
Xiaowen Jiang, Xun Wang, Sebastian U Stich
Published: 05 Mar 2025, Last Modified: 29 Mar 2025
Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected
Yingtao Zhang, Jialin Zhao, Wenjing Wu, Ziheng Liao, Umberto Michieli, Carlo Vittorio Cannistraci
Published: 05 Mar 2025, Last Modified: 10 Apr 2025
Q-Filters: Leveraging Query-Key Geometry for Efficient Key-Value Cache Compression
Nathan Godey, Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini, Éric Villemonte de la Clergerie, Benoît Sagot
Published: 05 Mar 2025, Last Modified: 09 Apr 2025
Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks
Jialin Zhao, Yingtao Zhang, Xinghang Li, Huaping Liu, Carlo Vittorio Cannistraci
Published: 05 Mar 2025, Last Modified: 10 Apr 2025
Prefix and Output Length-Aware Scheduling for Efficient Online LLM Inference
Iñaki Arango, Ayush Noori, Yepeng Huang, Rana Shahout, Minlan Yu
Published: 05 Mar 2025, Last Modified: 10 Apr 2025
RLMedusa: Reinforcement Learning for Multiple Decoding Heads to Accelerate LLM Inference
Aadit Juneja, Parsa Idehpour
Published: 05 Mar 2025, Last Modified: 05 Mar 2025
Compressed sparse tiles for memory-efficient unstructured and semi-structured sparsity
Mike Lasby, Max Zimmer, Sebastian Pokutta, Erik Schultheis
Published: 05 Mar 2025, Last Modified: 22 Apr 2025
Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries
Junhyuck Kim, Jongho Park, Jaewoong Cho, Dimitris Papailiopoulos
Published: 05 Mar 2025, Last Modified: 14 Apr 2025
Pivoting Factorization: A Compact Meta Low-Rank Representation of Sparsity for Efficient Inference in Large Language Models
Jialin Zhao, Yingtao Zhang, Carlo Vittorio Cannistraci
Published: 05 Mar 2025, Last Modified: 10 Apr 2025
On the Spatial Structure of Mixture-of-Experts in Transformers
Daniel Bershatsky, Ivan Oseledets
Published: 05 Mar 2025, Last Modified: 06 Apr 2025
Scaling Laws and Efficient Inference for Ternary Language Models
Tejas Vaidhya, Ayush Kaushal, Vineet Jain, Francis Couture-Harpin, Prashant Shishodia, Majid Behbahani, Irina Rish, Yuriy Nevmyvaka
Published: 05 Mar 2025, Last Modified: 09 Apr 2025
High Frequency Latents Are Features, Not Bugs
Xiaoqing Sun, Joshua Engels, Max Tegmark
Published: 05 Mar 2025, Last Modified: 18 Apr 2025
Scalable Continual Learning: Adaptive MoEs for Expanding Task Sets
Adrian Candocia, Omer Mustafa Inan, Raaghav Agarwal, Aamod Varma, Mark A. Davenport
Published: 05 Mar 2025, Last Modified: 09 Apr 2025
Evaluating LLM Memorization Using Soft Token Sparsity
Zhili Feng, Yixuan Even Xu, Pratyush Maini, Alexander Robey, Avi Schwarzschild, J Zico Kolter
Published: 05 Mar 2025, Last Modified: 09 Apr 2025
ReALLM: A General Framework for LLM Compression and Fine-Tuning
Lisa Bedin, Louis Leconte, Van Minh Nguyen, Eric Moulines
Published: 05 Mar 2025, Last Modified: 05 Mar 2025