TL;DR: Squeeze Deep RL onto TinyML devices.
Abstract: The use of Deep Reinforcement Learning (Deep RL) in many resource-constrained mobile systems has been limited in scope by the heavy resource demands (e.g., memory, computation, energy) of such approaches. As a result, TinyML devices ranging from sensors and cameras to small form-factor robots and drones have been unable to benefit from recent Deep RL algorithms that have underpinned breakthrough results in decision and control applications. In this work, we propose and study a variety of general-purpose techniques designed to lower these system resource bottlenecks for Deep RL by optimizing both the agent algorithms and the neural architectures used in these solutions. Experiments show that our Deep RL optimization framework, which combines these techniques, is able to produce significant efficiency gains, to the point that such techniques become feasible for TinyML platforms. We present one representative end-to-end application (viz. network protocol learning) executing on constrained processors (embedded hardware), in addition to simulated control problems addressed under the assumption of limited access to system resources.