Using Probabilistic Model Rollouts to Boost the Sample Efficiency of Reinforcement Learning for Automated Analog Circuit Sizing

Published: 06 Nov 2024, Last Modified: 26 Jan 2026 | 61st ACM/IEEE Design Automation Conference (DAC), 2024, San Francisco, CA, USA, Association for Computing Machinery (ACM) | Everyone | CC BY 4.0
Abstract: Despite recent algorithmic advances, such as the use of reinforcement learning, analog circuit sizing optimization remains a challenging task that demands numerous circuit simulations and hence extensive CPU time. This paper applies Model-Based Policy Optimization (MBPO) to substantially boost the sample efficiency of reinforcement learning for analog circuit sizing. The method leverages an ensemble of probabilistic dynamic models to generate short rollouts branched from real data, enabling fast but extensive exploration of the design space, which speeds up the learning process of the reinforcement learning agent and improves its convergence. Integrated into the Twin Delayed DDPG (TD3) algorithm, our new model-based TD3 (MBTD3) approach is validated on analog circuits of different complexity, outperforming the existing model-free TD3 method by reaching power/area-optimal design solutions with up to ~3x fewer simulations and half the run time. In addition, for larger analog circuits, we present a multi-agent version of MBTD3, in which multiple simultaneous agents use global probabilistic models to size the different sub-blocks within the circuit. Demonstrated on a complex data receiver circuit, it surpasses the model-free multi-agent TD3 method with ~2x fewer simulations and half the run time. The proposed novel algorithms clearly boost the efficiency of automated analog circuit sizing.
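The branched-rollout idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a one-dimensional toy "simulator" in place of a circuit simulator, swaps in bootstrapped linear-Gaussian models for whatever learned probabilistic models the authors use, and omits the reward model and the TD3 update entirely. All names (`fit_member`, `branched_rollouts`, `model_buffer`) are hypothetical.

```python
import random
import statistics


def fit_member(transitions, rng):
    """Fit one toy probabilistic model s' ~ N(s + gain * a, sigma^2).

    Each ensemble member sees a different bootstrap resample of the real data,
    so the members disagree slightly (an assumption of this sketch).
    """
    sample = [rng.choice(transitions) for _ in transitions]
    num = sum(a * (s2 - s) for s, a, s2 in sample)
    den = sum(a * a for _, a, _ in sample) or 1e-9  # guard against division by zero
    gain = num / den
    residuals = [(s2 - s) - gain * a for s, a, s2 in sample]
    sigma = statistics.pstdev(residuals)
    return gain, sigma


def branched_rollouts(real_states, ensemble, policy, horizon, n_starts, rng):
    """Generate short model rollouts, each branched from a state seen in real data."""
    synthetic = []
    for _ in range(n_starts):
        s = rng.choice(real_states)  # branch point taken from real data
        for _ in range(horizon):
            a = policy(s)
            gain, sigma = rng.choice(ensemble)  # random member at every step
            s_next = rng.gauss(s + gain * a, sigma)  # sample, do not just take the mean
            synthetic.append((s, a, s_next))
            s = s_next
    return synthetic


rng = random.Random(0)

# Toy stand-in for expensive simulations: true gain 0.5 plus small noise.
real = []
for _ in range(50):
    s, a = rng.uniform(-1, 1), rng.uniform(-1, 1)
    real.append((s, a, s + 0.5 * a + rng.gauss(0, 0.01)))

real_states = [s for s, _, _ in real]
ensemble = [fit_member(real, rng) for _ in range(5)]


def policy(s):
    """Hypothetical fixed policy: push the state toward zero, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, -s))


# 20 branch points x 3 model steps = 60 synthetic transitions, with no new simulations.
model_buffer = branched_rollouts(real_states, ensemble, policy,
                                 horizon=3, n_starts=20, rng=rng)
```

In a full agent, the synthetic transitions in `model_buffer` would be mixed with the real ones when updating the policy, which is where the savings in simulator calls would come from. Keeping the horizon short limits how far model error can compound.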