In the paper 'Deep Reinforcement Learning for Sponsored Search Real-time Bidding', the algorithm is based on DQN, which trains a deep neural network (DNN) to closely approximate the true optimal Q function using an iterative solver. Another paper that you have read also employs a target network for stable convergence. Provide the full title of that paper.
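
For context, the target-network idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from either paper: a small Q-table stands in for the DNN, and the names `TinyQ`, `td_target`, `train`, and `sync_every` are hypothetical. The online estimator is updated toward the Bellman target computed from a frozen copy, and the copy is refreshed only periodically.

```python
import copy
import random


def td_target(reward, next_q_values, gamma, done):
    """Bellman target y = r + gamma * max_a' Q_target(s', a').

    Terminal transitions do not bootstrap from the next state.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class TinyQ:
    """Hypothetical stand-in for a DNN: a table of values Q[state][action]."""

    def __init__(self, n_states, n_actions):
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def values(self, state):
        return self.q[state]

    def update(self, state, action, target, lr):
        # One gradient step on the squared error 0.5 * (target - Q(s, a))^2.
        self.q[state][action] += lr * (target - self.q[state][action])


def train(transitions, n_states, n_actions, gamma=0.9, lr=0.5,
          sync_every=10, steps=200, seed=0):
    """Iteratively fit the online estimator against a periodically synced target."""
    rng = random.Random(seed)
    online = TinyQ(n_states, n_actions)
    target = copy.deepcopy(online)  # frozen copy used only to compute targets
    for step in range(1, steps + 1):
        s, a, r, s_next, done = rng.choice(transitions)
        y = td_target(r, target.values(s_next), gamma, done)
        online.update(s, a, y, lr)
        if step % sync_every == 0:
            # Refresh the target copy; between syncs the targets stay fixed.
            target = copy.deepcopy(online)
    return online, target


# Two-state chain: state 0 leads to state 1, which pays reward 1 and terminates.
transitions = [(0, 0, 0.0, 1, False), (1, 0, 1.0, 1, True)]
online, target = train(transitions, n_states=2, n_actions=1)
```

Keeping the target copy fixed between syncs means the regression target does not shift with every update of the online estimator, which is the usual motivation given for the stabilizing effect.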