A Genetic Algorithm for Solving Sudoku Based on Multiarmed Bandit Selection

Jon-Lark Kim, Eunjee Eor

Published: 01 Jun 2025, Last Modified: 23 Jan 2026, IEEE Transactions on Games, CC BY-SA 4.0

Abstract: In this article, we introduce the genetic algorithm-based upper confidence bound (GA-UCB), a hybrid genetic algorithm that integrates a multiarmed bandit. It addresses the challenges of solving large and intricate Sudoku puzzles, overcoming the constraints of traditional genetic algorithms. In GA-UCB, reinforcement learning is applied to simulate parent selection and crossover. The population evolves as the algorithm learns the optimal parent selection within a given population. With this approach, GA-UCB demonstrates improved results in solving complex Sudoku puzzles. GA-UCB is compared with several state-of-the-art algorithms on Sudoku puzzles of different difficulty levels and shows a 55% improvement in convergence speed compared to previous research results, particularly on the most challenging of the six Sudoku puzzle instances tested.
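To make the idea of bandit-driven parent selection concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the arms, the reward signal, or the fitness function, so everything here is an assumption: each arm stands for one candidate parent, the reward is whatever score the caller assigns to the resulting child, the selection rule is the standard UCB1 formula, and fitness is a simple count of missing digits per row, column, and box. The names `ParentBandit` and `sudoku_conflicts` are hypothetical.

```python
import math


class ParentBandit:
    """UCB1 bandit over candidate parents (assumed setup, one arm per parent)."""

    def __init__(self, n_arms, c=math.sqrt(2)):
        self.counts = [0] * n_arms    # times each parent was selected
        self.totals = [0.0] * n_arms  # summed reward per parent
        self.c = c                    # exploration weight (assumed value)

    def select(self):
        # Try every arm once before applying the UCB1 formula.
        for arm, n in enumerate(self.counts):
            if n == 0:
                return arm
        t = sum(self.counts)

        def score(arm):
            n = self.counts[arm]
            mean = self.totals[arm] / n
            bonus = self.c * math.sqrt(math.log(t) / n)
            return mean + bonus

        return max(range(len(self.counts)), key=score)

    def update(self, arm, reward):
        # Record the reward observed after breeding from this parent.
        self.counts[arm] += 1
        self.totals[arm] += reward


def sudoku_conflicts(grid):
    """Assumed fitness: missing digits summed over rows, columns, and 3x3 boxes.

    A solved 9x9 grid scores 0; lower is better.
    """
    units = [list(row) for row in grid]
    units += [[grid[r][c] for r in range(9)] for c in range(9)]
    units += [
        [grid[br + r][bc + c] for r in range(3) for c in range(3)]
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return sum(9 - len(set(unit)) for unit in units)
```

In a full loop, one plausible use would be to call `select()` to pick a parent, apply crossover and mutation, and pass a reward such as 1.0 to `update()` when the child's `sudoku_conflicts` value is lower than the parent's. That reward definition is also an assumption rather than something stated in the abstract.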

External IDs: doi:10.1109/tg.2024.3487861