Bandit Learning with Joint Effect of Incentivized Sampling, Delayed Sampling Feedback, and Self-Reinforcing User Preferences

Tianchen Zhou, Jia Liu, Chaosheng Dong, Yi Sun

2022 (modified: 30 Dec 2022) | ICLR 2022 | Readers: Everyone
