Abstract: Highlights
• Present a generalization error bound for the reward transfer paradigm in TIL.
• Evaluate transfer effects and propose alternative reward transfer plans.
• Equate minimizing the optimizable training error to maximizing the RL objective in the target.
• Apply our main results to evaluate diverse possible transfer effects.