Mitigating OOD overoptimism via in-sample value function in offline reinforcement learning

Wenhui Liu, Kangyang Luo, Zhijian Wu, Shanfeng Hao, Dingjiang Huang

Published: 2026, Last Modified: 07 May 2026, Neural Networks 2026, CC BY-SA 4.0