Abstract: We present a new exploration term that is more efficient than classical UCT-like exploration terms. It efficiently combines expert rules, patterns extracted from datasets, All-Moves-As-First values, and classical online values. Because this improved bandit formula does not handle several important situations in computer Go (semeais, nakade), we present three further improvements that are central to the recent progress of our program MoGo. MoGo recently won a handicap-7 game against a 9-dan professional player, Zhou JunXun, winner of the LG Cup 2007, and a handicap-6 game against a 1-dan professional player, Li-Chen Chien.