ICML 2024 Workshop MHFAIA Submissions

Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback
Zhirui Chen, Vincent Y. F. Tan
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

Enhancing Intent Understanding for Ambiguous prompt: A Human-Machine Co-Adaption Strategy
Yangfan He, Yuxuan Bai, Tianyu Shi
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

Inverse Reinforcement Learning from Demonstrations for LLM Alignment
Hao Sun, Mihaela van der Schaar
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

Hummer: Towards Limited Competitive Preference Dataset
Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Oral

Weak-to-Strong Extrapolation Expedites Alignment
Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

RLHF and IIA: Perverse Incentives
Wanqiao Xu, Shi Dong, Xiuyuan Lu, Grace Lam, Zheng Wen, Benjamin Van Roy
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Oral

MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences
Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Bedi, Mengdi Wang
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Oral

Aligning Crowd Feedback via Distributional Preference Reward Modeling
Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster

Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
Published: 17 Jun 2024, Last Modified: 02 Jul 2024
ICML 2024 Workshop MHFAIA Poster