Keywords: Combinatorial Optimization, Reinforcement Learning, Evolution Strategies, Simulated Annealing
Abstract: Simulated annealing (SA) is a stochastic global optimisation technique applicable to a wide range of discrete and continuous variable problems. Despite its simplicity, the development of an effective SA optimiser for a given problem hinges on a handful of carefully handpicked components; namely, neighbour proposal distribution and temperature annealing schedule. In this work, we view SA from a reinforcement learning perspective and frame the proposal distribution as a policy, which can be optimised for higher solution quality given a fixed computational budget. We demonstrate that this Neural SA with such a learnt proposal distribution outperforms SA baselines with hand-selected parameters on a number of problems: Rosenbrock's function, the Knapsack problem, the Bin Packing problem, and the Travelling Salesperson problem. We also show that Neural SA scales well to large problems while again outperforming popular off-the-shelf solvers in terms of solution quality and wall clock time.
One-sentence Summary: We extend simulated annealing with learnable (neural) components, producing a new method that is competitive in terms of both speed and quality of the final solution.
17 Replies
Loading