Abstract: Reinforcement Learning (RL) has achieved remarkable success across a variety of complex real-world tasks. However, the efficacy of RL predominantly relies on the availability of extensive datasets and considerable training resources. Hence, there is a critical need for a distributed system capable of generating and processing vast amounts of data efficiently. In this work, we introduce a Scalable and Customized Distributed Reinforcement Learning system (SCDRL). Concretely, we analyze the RL paradigm and decouple the major RL computations into three main components, i.e., environment simulation, policy inference, and policy training. This decoupling enables SCDRL to allocate computing resources (be they CPUs or GPUs of varying computational capabilities) tailored to the specific needs of each component. We evaluate our method across several key RL environments, demonstrating that our system not only achieves strong learning outcomes and higher throughput but also uses computing resources more efficiently. Notably, our findings reveal SCDRL's ability to optimize resource use not only in single-machine setups but also in multi-machine configurations, all while maintaining data efficiency and high resource utilization.
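
To make the decoupling concrete, the following is a minimal, hypothetical sketch, not SCDRL's actual implementation, in which environment simulation, policy inference, and policy training run as separate workers exchanging data through queues so that each role could be mapped to different hardware. All names (e.g. `rollout_queue`, `request_queue`), the toy policy, and the update rule are illustrative assumptions.

```python
# Illustrative sketch of decoupled RL computation: environment workers,
# an inference server, and a trainer communicate only through queues.
import queue
import random
import threading

rollout_queue = queue.Queue()   # env workers -> trainer (transitions)
request_queue = queue.Queue()   # env workers -> inference server (observations)
policy_weights = {"bias": 0.0}  # toy "policy" parameters
stop = threading.Event()

def env_worker() -> None:
    """Simulate an environment: request actions, emit transitions."""
    while not stop.is_set():
        obs = random.random()
        reply: queue.Queue = queue.Queue(maxsize=1)
        request_queue.put((obs, reply))            # ask the inference server for an action
        try:
            action = reply.get(timeout=0.5)
        except queue.Empty:
            continue                               # server may have shut down
        reward = 1.0 - abs(action - obs)           # toy reward
        rollout_queue.put((obs, action, reward))   # ship the transition to the trainer

def inference_server() -> None:
    """Answer observation requests using the current policy."""
    while not stop.is_set():
        try:
            obs, reply = request_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        reply.put(obs + policy_weights["bias"])    # toy policy: shifted identity

def trainer(num_updates: int, batch_size: int = 8) -> None:
    """Consume transitions and apply a placeholder policy update."""
    for _ in range(num_updates):
        batch = [rollout_queue.get() for _ in range(batch_size)]
        mean_reward = sum(r for _, _, r in batch) / batch_size
        policy_weights["bias"] += 0.01 * (1.0 - mean_reward)  # toy update step
    stop.set()                                     # signal all workers to exit

threads = [threading.Thread(target=env_worker) for _ in range(4)]
threads += [threading.Thread(target=inference_server),
            threading.Thread(target=trainer, args=(20,))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("final policy:", policy_weights)
```

In this sketch the three roles share nothing but queues, which is the property that allows each role to be scaled independently (e.g., many CPU-bound environment workers feeding a smaller number of GPU-backed inference and training processes).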