Abstract: In this project, we apply several reinforcement learning algorithms to a reward-maximization problem in water sailing: reaching the highest-priority destination in the smallest number of steps. We implemented our own environment for this project and trained different agents using policy iteration, value iteration, and Deep Q-Learning (DQN). We evaluate our approach on the environment with 8x8 and 16x16 maps. Our results show that the agents trained by policy iteration and value iteration can reach the destination when given a certain number of steps, while the agent trained by DQN can finish the sailing task in fewer steps.
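To make the value iteration setting concrete, here is a minimal sketch on a square grid map. It is not the project's environment: the four-move action set, the deterministic transitions, the single goal cell, the rewards (-1 per step, +10 on arrival), the discount factor, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative assumptions: 4 moves, -1 per step, +10 on arrival, one goal cell.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def move(state, action, size):
    """Deterministic transition; a move off the map leaves the boat in place."""
    row, col = state[0] + action[0], state[1] + action[1]
    if 0 <= row < size and 0 <= col < size:
        return (row, col)
    return state


def q_value(values, state, action, size, goal, gamma):
    """One-step lookahead: immediate reward plus discounted value of the next cell."""
    nxt = move(state, action, size)
    reward = 10.0 if nxt == goal else -1.0
    return reward + gamma * values[nxt]


def value_iteration(size, goal, gamma=0.99, tol=1e-6):
    """Sweep Bellman optimality backups until the largest change falls below tol."""
    values = np.zeros((size, size))
    while True:
        delta = 0.0
        for row in range(size):
            for col in range(size):
                state = (row, col)
                if state == goal:
                    continue  # terminal cell keeps value 0
                best = max(q_value(values, state, a, size, goal, gamma) for a in ACTIONS)
                delta = max(delta, abs(best - values[state]))
                values[state] = best
        if delta < tol:
            return values


def greedy_path(values, start, goal, gamma=0.99, max_steps=1000):
    """Follow the greedy policy implied by the value table from start to goal."""
    size = values.shape[0]
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        state = path[-1]
        best_action = max(ACTIONS, key=lambda a: q_value(values, state, a, size, goal, gamma))
        path.append(move(state, best_action, size))
    return path


if __name__ == "__main__":
    for size in (8, 16):
        goal = (size - 1, size - 1)
        values = value_iteration(size, goal)
        path = greedy_path(values, (0, 0), goal)
        print(f"{size}x{size}: reached goal in {len(path) - 1} steps")
```

Under these assumptions the greedy policy sails corner to corner along a shortest path, which is 14 steps on the 8x8 map and 30 steps on the 16x16 map. A map with obstacles, several destinations of different priority, or stochastic drift would need a richer transition and reward model than this sketch provides.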