Data-Efficient Alignment via Learning from Collective Feedback in Social Media

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission
Abstract: Aligning large language models (LLMs) with human feedback has become a critical research area due to their potential to acquire undesirable abilities from unsupervised corpora. Traditionally, LLM alignment involves collecting extensive human preference data, which is time-consuming and labor-intensive. To address this issue, we explore LLM alignment via learning from collective feedback (LCF) on social media. Social media users often provide diverse feedback on content, reflecting a broad spectrum of human preferences and offering abundant training signals for alignment. We thoroughly investigate training strategies for incorporating collective feedback and examine the effectiveness of LCF with the widely used direct preference optimization (DPO) algorithm. The experimental results show that LCF can significantly reduce the need for human annotation, achieving comparable performance with only 20% of the annotated data. Additionally, LLMs trained with LCF exhibit improved generalization to out-of-domain instructions. The code and data used in our paper will be released to promote further research on learning from collective feedback.
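For context, the direct preference optimization objective referenced in the abstract can be summarized with the following minimal PyTorch sketch of the standard DPO loss (Rafailov et al., 2023). This is an illustration only, not the authors' implementation; the function name, tensor arguments, and the beta value are illustrative assumptions.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss for a batch of preference pairs.

    Each argument is the summed log-probability of the chosen (preferred)
    or rejected response under the trained policy or the frozen reference model.
    """
    # Implicit reward: scaled log-ratio of policy to reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

In an LCF setting, the chosen/rejected pairs could in principle be derived from collective social media feedback (e.g., more- vs. less-upvoted responses) rather than from dedicated human annotation; the exact pairing strategy is described in the paper, not in this sketch.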
Paper Type: short
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: Approaches to low-resource settings
Languages Studied: English

