ICLR 2025 Workshop SSI-FM Submissions
MALT: Improving Reasoning with Multi-Agent LLM Training
Sumeet Ramesh Motwani, Chandler Smith, Rocktim Jyoti Das, Rafael Rafailov, Ivan Laptev, Philip Torr, Fabio Pizzati, Ronald Clark, Christian Schroeder de Witt
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment
Haoyu Wang, Zeyu Qin, Li Shen, Xueqian Wang, Minhao Cheng, Dacheng Tao
Published: 08 Mar 2025, Last Modified: 16 Mar 2025
SSI-FM Poster
Readers: Everyone
AgentBreeder: Mitigating the AI Safety Impact of Multi-Agent Scaffolds via Self-Improvement
J Rosser, Jakob Nicolaus Foerster
Published: 08 Mar 2025, Last Modified: 14 Apr 2025
SSI-FM Oral
Readers: Everyone
Don't Throw Away Data: Improving Sequence Knowledge Distillation with Minimum Bayes Risk Decoding
Jun Wang, Eleftheria Briakou, Hamid Dadkhahi, Rishabh Agarwal, Colin Cherry, Trevor Cohn
Published: 08 Mar 2025, Last Modified: 13 Apr 2025
SSI-FM Poster
Readers: Everyone
Evaluating LLMs Without Oracle Feedback: Agentic Annotation Evaluation Through Unsupervised Consistency Signals
Cheng Chen, Haiyan Yin, Ivor Tsang
Published: 08 Mar 2025, Last Modified: 13 Apr 2025
SSI-FM Poster
Readers: Everyone
Mitigating Short Board Effect via Dynamic Reward Balancing in Multi-reward LLM Optimization
Nuo Chen, Yufei Gao, Yongnan Jin, Yan Hu, Anningzhe Gao, Lingyong Yan, Benyou Wang
Published: 08 Mar 2025, Last Modified: 22 Apr 2025
SSI-FM Poster
Readers: Everyone
Safety is Essential for Responsible Open-Ended Systems
Ivaxi Sheth, Jan Wehner, Sahar Abdelnabi, Ruta Binkyte, Mario Fritz
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
Self-Improving Diffusion Models With Synthetic Data
Sina Alemohammad, Ahmed Imtiaz Humayun, Shruti Agarwal, John Collomosse, Richard Baraniuk
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
Escaping Collapse: The Strength of Weak Data for Large Language Model Training
Kareem Amin, Sara Babakniya, Alex Bie, Weiwei Kong, Umar Syed, Sergei Vassilvitskii
Published: 08 Mar 2025, Last Modified: 12 Apr 2025
SSI-FM Poster
Readers: Everyone
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
Zhi Gao, Bofei Zhang, Pengxiang Li, Xiaojian Ma, Tao Yuan, Yue Fan, Yuwei Wu, Yunde Jia, Song-Chun Zhu, Qing Li
Published: 08 Mar 2025, Last Modified: 17 Mar 2025
SSI-FM Poster
Readers: Everyone
OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
Jiawei Zhou, Lei Chen
Published: 08 Mar 2025, Last Modified: 22 Mar 2025
SSI-FM Poster
Readers: Everyone
Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources
Alisia Maria Lupidi, Carlos Gemmell, Nicola Cancedda, Jane Yu, Jason E Weston, Jakob Nicolaus Foerster, Roberta Raileanu, Maria Lomeli
Published: 08 Mar 2025, Last Modified: 29 Mar 2025
SSI-FM Poster
Readers: Everyone
RMBoost: Reward Model Training With Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen, Ran Xu, Yennie Jun, Zhen Qin, Tianqi Liu, Carl Yang, Yi Liang, Simon Baumgartner, Michael Bendersky
Published: 08 Mar 2025, Last Modified: 14 Mar 2025
SSI-FM Poster
Readers: Everyone
Scalable Thompson Sampling via Ensemble++
Yingru Li, Jiawei Xu, Baoxiang Wang, Zhi-Quan Luo
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
Policy-Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
Max Sobol Mark, Tian Gao, Georgia Gabriela Sampaio, Mohan Kumar Srirama, Archit Sharma, Chelsea Finn, Aviral Kumar
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
Self-Correcting Self-Consuming Loops For Generative Model Training
Nate Gillman, Michael Freeman, Daksh Aggarwal, Chia-Hong HSU, Calvin Luo, Yonglong Tian, Chen Sun
Published: 08 Mar 2025, Last Modified: 14 Mar 2025
SSI-FM Poster
Readers: Everyone
Natural Language Reinforcement Learning
Xidong Feng, Bo Liu, Ziyu Wan, Haotian Fu, Girish A. Koushik, Zhiyuan Hu, Mengyue Yang, Ying Wen, Jun Wang
Published: 08 Mar 2025, Last Modified: 18 Apr 2025
SSI-FM Poster
Readers: Everyone
Scaling Flaws of Verifier-guided Search in Mathematical Reasoning
Fei Yu, Yingru Li, Benyou Wang
Published: 08 Mar 2025, Last Modified: 13 Apr 2025
SSI-FM Poster
Readers: Everyone
Optimizing Test-Time Compute via Meta Reinforcement Finetuning
Yuxiao Qu, Matthew Y. R. Yang, Lewis Tunstall, Edward Emanuel Beeching, Ruslan Salakhutdinov
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone
An Architecture Search Framework for Inference-Time Techniques
Jon Saad-Falcon, Adrian Gamarra Lafuente, Shlok Natarajan, Nahum Maru, Hristo Todorov, Etash Kumar Guha, E. Kelly Buchanan, Mayee F Chen, Neel Guha, Christopher Re, Azalia Mirhoseini
Published: 08 Mar 2025, Last Modified: 13 Apr 2025
SSI-FM Oral
Readers: Everyone
Boss LLM: Adaptation via No-Regret Learning
Yu Feng, Avishree Khare, Nghia Nguyen, Sikata Bela Sengupta
Published: 08 Mar 2025, Last Modified: 13 Apr 2025
SSI-FM Poster
Readers: Everyone
Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy
Saeid Asgari, Joao Monteiro
Published: 08 Mar 2025, Last Modified: 21 Mar 2025
SSI-FM Poster
Readers: Everyone
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension
Xiyao Wang, Zhengyuan Yang, Linjie Li, Hongjin Lu, Yuancheng Xu, Chung-Ching Lin, Kevin Lin, Furong Huang, Lijuan Wang
Published: 08 Mar 2025, Last Modified: 08 Mar 2025
SSI-FM Poster
Readers: Everyone