Keywords: Reinforcement Learning, Hardware Acceleration, Parallel Environment Execution, Hardware-Software Co-design
TL;DR: We implement an FPGA-CPU joint hardware-acceleration framework for parallel RL environment execution
Abstract: Reinforcement learning (RL) faces the key challenge of balancing exploration (gathering information about the environment through trials) and exploitation (leveraging current knowledge to maximize rewards), especially in open-world settings. To boost exploration efficiency, parallel environment execution is a widely used technique that instantiates multiple environments and executes them simultaneously. However, this is computationally challenging on CPUs, which are limited by their total thread count. In this work, we present FPGA-Gym, an FPGA-CPU joint acceleration framework for parallel RL environment execution. By offloading environment steps to FPGA hardware, FPGA-Gym achieves a 4.36$\times$ to 972.6$\times$ speedup over the fastest existing software-based framework for parallel environment execution. Moreover, FPGA-Gym is a general RL acceleration framework compatible with existing RL algorithms and frameworks. Its modular, parameterized design allows users to conveniently customize new environments without extensive FPGA expertise. We demonstrate multiple representative RL benchmarks (e.g., CartPole, CliffWalking, and Seaquest) with Deep Q-Network and proximal policy optimization algorithms. Additionally, we provide a standard environment library, similar to Gymnasium, built on the FPGA-Gym framework. The framework and the library are available at https://github.com/Selinaee/FPGA_Gym.
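To make the accelerated workload concrete, the sketch below shows the standard Gymnasium-style vectorized step loop that, per the abstract, FPGA-Gym speeds up by offloading the batched environment step to FPGA hardware behind a similar interface. This is a minimal illustration using only the real Gymnasium API; it deliberately does not show FPGA-Gym's own calls, since their names are not given in the abstract.

```python
# Minimal sketch of parallel environment execution in the Gymnasium
# style that FPGA-Gym targets. Only standard Gymnasium calls are used;
# FPGA-Gym's own API is not shown because its names are not in the abstract.
import gymnasium as gym

num_envs = 1024  # many environment instances stepped in lockstep

# On CPU, these environments are bounded by available threads; the
# paper's claim is that offloading this batched step to an FPGA
# removes that bottleneck.
envs = gym.make_vec("CartPole-v1", num_envs=num_envs)

obs, info = envs.reset(seed=0)
for _ in range(100):
    actions = envs.action_space.sample()  # placeholder for a DQN/PPO policy
    obs, rewards, terminations, truncations, infos = envs.step(actions)
envs.close()
```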
Submission Number: 78