KDD 2025 Workshop Inference Optimization for GenAI Submissions
Kinetics: Rethinking Test-Time Scaling Laws
Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, Yang Zhou, Emma Strubell, Beidi Chen
Published: 04 Jul 2025, Last Modified: 23 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Oral
Readers: Everyone
Token-PD: Portfolio-Optimal KV-Cache Eviction for Multi-Tenant LLM Inference
Thomas Y Chen
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Poster
Readers: Everyone
A Lightweight Reasoning Method with Test-Time Scaling for Preserving Diversity and Factuality in LLM-Based Decision-Making
Rongrong Chen, Kailin Gao, Yuan He, Hongsheng Qi
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Poster
Readers: Everyone
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sagar Srinivas Sakhinana, Shivam Gupta, Akash Das, Venkataramana Runkana
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Oral
Readers: Everyone
Latent Multi-Head Attention for Small Language Models
Sushant Mehta, Raj Dandekar, Rajat Dandekar, Sreedath Panat
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Poster
Readers: Everyone
CALO-GNN: Calibrated‑Uncertainty Graph Cost Models for Cross‑Device TVM Meta‑Schedule
Sanjay Kumar Patnala
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Poster
Readers: Everyone
Maximizing LLM Efficiency Through Optimization Strategies
Iman Abbasnejad, Tomal Deb, Xuefeng Liu
Published: 04 Jul 2025, Last Modified: 22 Jul 2025
KDD 2025 Workshop on Inference Optimization for GenAI Poster
Readers: Everyone