Abstract: Highlights
• Stabilizes the synchronization and convergence points, allowing for more accurate estimation.
• Avoids overestimation bias so that the model generalizes well.
• Adds noise to the target action so that the policy is less able to exploit Q-function errors, smoothing out the Q-values.
• Runs simulations that capture the characteristics of closed-loop deep brain stimulation (DBS).
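The noise-injection highlight can be illustrated with a minimal sketch of target-action smoothing, in the style of twin-delayed actor-critic methods. The highlights do not give the exact formulation, so the function name `smoothed_target_action`, the noise scale, the clipping range, and the action bounds below are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def smoothed_target_action(target_action, rng, noise_std=0.2, noise_clip=0.5,
                           action_low=-1.0, action_high=1.0):
    """Perturb a target action with clipped Gaussian noise.

    All default values are hypothetical; the source does not state them.
    """
    target_action = np.asarray(target_action, dtype=float)
    # Sample zero-mean Gaussian noise with the same shape as the action.
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    # Clip the noise so the perturbed action stays near the original one.
    noise = np.clip(noise, -noise_clip, noise_clip)
    # Keep the result inside the valid action range.
    return np.clip(target_action + noise, action_low, action_high)

rng = np.random.default_rng(0)
a = np.array([0.9, -0.3])
a_smooth = smoothed_target_action(a, rng)
```

Evaluating the critic at `a_smooth` rather than at `a` averages the value estimate over nearby actions, which may make narrow, erroneous peaks in the Q-function harder for the policy to exploit.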