Keywords: human-in-the-loop learning, human-robot interaction, reinforcement learning from human feedback (RLHF), preference learning, alignment, preference-based reinforcement learning (PbRL)