Boosted Off-Policy Learning

Ben London, Levi Lu, Ted Sandler, Thorsten Joachims

Published: 2023, Last Modified: 17 Aug 2023, AISTATS 2023, Readers: Everyone

Abstract: We propose the first boosting algorithm for off-policy learning from logged bandit feedback. Unlike existing boosting methods for supervised learning, our algorithm directly optimizes an estimate o...
