COLM 2025 Conference Submissions
Adaptive Computation Pruning for the Forgetting Transformer
Zhixuan Lin, Johan Obando-Ceron, Xu Owen He, Aaron Courville
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation
Tuhina Tripathi, Manya Wadhwa, Greg Durrett, Scott Niekum
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Pretraining on the Test Set Is No Longer All You Need: A Debate-Driven Approach to QA Benchmarks
Linbo Cao, Jinman Zhao
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Estimating Optimal Context Length for Hybrid Retrieval-augmented Multi-document Summarization
Adithya Pratapa, Teruko Mitamura
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups
Rijul Magu, Arka Dutta, Sean Kim, Ashiqur R. KhudaBukhsh, Munmun De Choudhury
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
M²IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering
Yanshu Li, Yi Cao, Hongyang He, Qisen Cheng, Xiang Fu, Xi Xiao, Tianyang Wang, Ruixiang Tang
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
BiXSE: Improving Dense Retrieval via Probabilistic Graded Relevance Distillation
Christos Tsirigotis, Vaibhav Adlakha, Joao Monteiro, Aaron Courville, Perouz Taslakian
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Stop-Think-AutoRegress: Language Modeling with Latent Diffusion Planning
Justin Lovelace, Christian K Belardi, Sofian Zalouk, Adhitya Polavaram, Srivatsa R Kundurthy, Kilian Q Weinberger
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Corrupted by Reasoning: Reasoning Language Models Become Free-Riders in Public Goods Games
David Guzman Piedrahita, Yongjin Yang, Mrinmaya Sachan, Giorgia Ramponi, Bernhard Schölkopf, Zhijing Jin
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
In-Context Occam’s Razor: How Transformers Prefer Simpler Hypotheses on the Fly
Puneesh Deora, Bhavya Vasudeva, Tina Behnia, Christos Thrampoulidis
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Reasoning Models Know When They’re Right: Probing Hidden States for Self-Verification
Anqi Zhang, Yulin Chen, Jane Pan, Chen Zhao, Aurojit Panda, Jinyang Li, He He
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
The Negation Bias in Large Language Models: Investigating bias reflected in linguistic markers
Yishan Wang, Pia Sommerauer, Jelke Bloem
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
Anthony GX-Chen, Dongyan Lin, Mandana Samiei, Doina Precup, Blake Aaron Richards, Rob Fergus, Kenneth Marino
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection
Kabir Ahuja, Melanie Sclar, Yulia Tsvetkov
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Hell or High Water: Evaluating Agentic Recovery from External Failures
Andrew Wang, Sophia Hager, Adi Asija, Daniel Khashabi, Nicholas Andrews
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
A Taxonomy of Transcendence
Natalie Abreu, Edwin Zhang, Eran Malach, Naomi Saphra
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
Shalev Lifshitz, Sheila A. McIlraith, Yilun Du
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Retrieval-Augmented Generation with Conflicting Evidence
Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models
Thao Nguyen, Yang Li, Olga Golovneva, Luke Zettlemoyer, Sewoong Oh, Ludwig Schmidt, Xian Li
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Impact of LLM Alignment on Impression Formation in Social Interactions
Ala N. Tak, Anahita Bolourani, Daniel B. Shank, Jonathan Gratch
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing
Michael Paul Clemens, Ana Marasovic
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Breakpoint: Stress-testing systems-level reasoning in LLM agents
Kaivalya Hariharan, Uzay Girit, Zifan Wang, Jacob Andreas
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Rhapsody: A Dataset for Highlight Detection in Podcasts
Younghan Park, Anuj Diwan, David Harwath, Eunsol Choi
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
M-Prometheus: A Suite of Open Multilingual LLM Judges
José Pombal, Dongkeun Yoon, Patrick Fernandes, Ian Wu, Seungone Kim, Ricardo Rei, Graham Neubig, Andre Martins
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone
Task Vectors in In-Context Learning: Emergence, Formation, and Benefits
Liu Yang, Ziqian Lin, Kangwook Lee, Dimitris Papailiopoulos, Robert D Nowak
Published: 08 Jul 2025, Last Modified: 26 Aug 2025
COLM 2025
Readers: Everyone