Files already downloaded and verified
Matrix distribution: CIFAR
Matrix distribution config: {'c': 0.25, 'd': 5000, 'eps': 0.001}
Initial matrix shape: torch.Size([3072, 3072])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-06, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 150000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cuda', 'actions': [['sqrt_db', [[0, 0], [50, 50]]], ['sqrt_nsv', [[0, 0], [5, 5]]], ['sqrt_visser', [[0, 0], [10, 10]]], ['sqrt_visser_coupled', [[0, 0], [10, 10]]], ['sqrt_couple', None]], 'initialize_with_baselines': True}
Actions: ['sqrt_couple', 'sqrt_db', 'sqrt_nsv', 'sqrt_visser', 'sqrt_visser_coupled']
Action sqrt_couple took 1.0 times longer than sqrt_couple
Action sqrt_db took 1.7910451929187616 times longer than sqrt_couple
Action sqrt_nsv took 0.35146734847764355 times longer than sqrt_couple
Action sqrt_visser took 0.13585155341451385 times longer than sqrt_couple
Action sqrt_visser_coupled took 0.2714850320202462 times longer than sqrt_couple
Skipping sign_newton because not all actions are in the tree
Skipping sign_scaled_newton because not all actions are in the tree
Skipping sign_ns because not all actions are in the tree
Skipping sign_scaled_ns because not all actions are in the tree
Skipping sign_newton_variant because not all actions are in the tree
Skipping sign_halley because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
Skipping proot_newton because not all actions are in the tree
Skipping proot_visser because not all actions are in the tree
Skipping proot_iannazzo because not all actions are in the tree
/home/sykim/code/make_algorithm/losses.py:39: RuntimeWarning: overflow encountered in multiply
  loss = np.linalg.norm(x * x - y) / np.linalg.norm(y)
/home/sykim/code/make_algorithm/actions.py:878: RuntimeWarning: overflow encountered in multiply
  intermediate = a0 - a1 * Y * Z
/home/sykim/code/make_algorithm/actions.py:879: RuntimeWarning: overflow encountered in multiply
  Yn = 0.5 * Y * intermediate
/home/sykim/code/make_algorithm/actions.py:880: RuntimeWarning: overflow encountered in multiply
  Zn = 0.5 * Z * intermediate
0/149 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% Elapsed: 0:00:00 Remaining: -:--:-- 501319.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-10.5777289 -10.5777289]                                                                                                                                                │
│ [-4.13762587 -4.13762587]                                                                                                                                                │
│ [-3.43469117 -3.43469117 -3.43469117]                                                                                                                                    │
│ [-3.35470885 -3.35470885 -3.35470885 -3.35470885 -3.08322382]                                                                                                            │
│ [-3.08322382 -3.08322382 -3.08322382 -3.08322382 -2.81173879 -2.81173879]                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/149 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.7%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m   1.00 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 997, 1000]                                                                                                                                                  │
│ Average cumulative reward:       -11.357703763339368                                                                                                                     │
│ Average rollout reward:          -11.140934573206605                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0832238198413946                                                                                                                             │
│ Best path: [0, 4, 48, 49, 53, 68]                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/149 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.7%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 997, 1000]                                                                                                                                                  │
│ Average cumulative reward:       -11.357703763339368                                                                                                                     │
│ Average rollout reward:          -11.140934573206605                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0832238198413946                                                                                                                             │
│ Best path: [0, 4, 48, 49, 53, 68]                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/149 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m0:02:06[0m   1.00 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 6, 22, 24, 1941, 1949, 2000]                                                                                                                                │
│ Average cumulative reward:       -11.29171796826562                                                                                                                      │
│ Average rollout reward:          -11.00490729109313                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0832238198413946                                                                                                                             │
│ Best path: [0, 4, 48, 49, 53, 68]                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/149 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m0:02:06[0m   1.25 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 6, 22, 24, 1941, 1949, 2000]                                                                                                                                │
│ Average cumulative reward:       -11.29171796826562                                                                                                                      │
│ Average rollout reward:          -11.00490729109313                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0832238198413946                                                                                                                             │
│ Best path: [0, 4, 48, 49, 53, 68]                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/149 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.0%[0m Elapsed: [33m0:00:03[0m Remaining: [36m0:02:07[0m   1.00 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 2987, 2993, 3000]                                                                                                                                           │
│ Average cumulative reward:       -12.230863908643576                                                                                                                     │
│ Average rollout reward:          -11.93505829008863                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0832238198413946                                                                                                                             │
│ Best path: [0, 4, 48, 49, 53, 68]                                                                                                                                        │
│ [-3.0032415  -3.0032415  -3.0032415  -3.0032415  -2.73175647 -2.73175647                                                                                                 │
│  -2.73175647]                                                                                                                                                            │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/149 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.7%[0m Elapsed: [33m0:00:03[0m Remaining: [36m0:02:07[0m   1.14 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 3927, 3998, 4000]                                                                                                                                           │
│ Average cumulative reward:       -11.459030960255621                                                                                                                     │
│ Average rollout reward:          -11.144194247012226                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/149 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.7%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:07[0m   1.00 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 3927, 3998, 4000]                                                                                                                                           │
│ Average cumulative reward:       -11.459030960255621                                                                                                                     │
│ Average rollout reward:          -11.144194247012226                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/149 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.4%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:05[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 2058, 2093, 2100, 5000]                                                                                                                                     │
│ Average cumulative reward:       -11.20329400115088                                                                                                                      │
│ Average rollout reward:          -10.864398067260343                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/149 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.4%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:05[0m   1.00 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 2058, 2093, 2100, 5000]                                                                                                                                     │
│ Average cumulative reward:       -11.20329400115088                                                                                                                      │
│ Average rollout reward:          -10.864398067260343                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/149 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m4.0%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:04[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 4835, 5012, 6000]                                                                                                                                           │
│ Average cumulative reward:       -10.761318675462654                                                                                                                     │
│ Average rollout reward:          -10.44371346500149                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
4.7% Elapsed: 0:00:06 Remaining: 0:02:03   1.16 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 3758, 3781, 3785, 7000]                                                                                                                                     │
│ Average cumulative reward:       -11.093354557982455                                                                                                                     │
│ Average rollout reward:          -10.752167651568728                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/149 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m4.7%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:02:03[0m   1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 3758, 3781, 3785, 7000]                                                                                                                                     │
│ Average cumulative reward:       -11.093354557982455                                                                                                                     │
│ Average rollout reward:          -10.752167651568728                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.0032415033839976                                                                                                                             │
│ Best path: [0, 4, 2987, 3031, 3035, 3054, 3057]                                                                                                                          │
│ [-2.92325919 -2.92325919 -2.92325919 -2.92325919 -2.65177415 -2.65177415                                                                                                 │
│  -2.65177415 -2.38028912 -2.38028912]                                                                                                                                    │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/149 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.4%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:02[0m   1.14 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 7858, 7860, 7863, 7867, 7952, 8000]                                                                                                                         │
│ Average cumulative reward:       -11.52031819649416                                                                                                                      │
│ Average rollout reward:          -11.18271683561221                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/149 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.4%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:02[0m   1.06 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 7858, 7860, 7863, 7867, 7952, 8000]                                                                                                                         │
│ Average cumulative reward:       -11.52031819649416                                                                                                                      │
│ Average rollout reward:          -11.18271683561221                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/149 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.0%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:02[0m   1.12 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 4648, 8989, 8992, 9000]                                                                                                                                     │
│ Average cumulative reward:       -11.377003831632525                                                                                                                     │
│ Average rollout reward:          -11.034191539970601                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/149 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.0%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:02[0m   1.05 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 4, 4648, 8989, 8992, 9000]                                                                                                                                     │
│ Average cumulative reward:       -11.377003831632525                                                                                                                     │
│ Average rollout reward:          -11.034191539970601                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K10/149 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.7%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:01[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 4277, 4459, 4684, 9881, 10000]                                                                                                                              │
│ Average cumulative reward:       -11.067161225714614                                                                                                                     │
│ Average rollout reward:          -10.729831415135758                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
11/149  7.4%  Elapsed: 0:00:09  Remaining: 0:02:00  1.15 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3927, 4016, 4018, 9456, 10368, 11000]                                                                                                                       │
│ Average cumulative reward:       -11.112147156598011                                                                                                                     │
│ Average rollout reward:          -10.773384363891623                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
12/149  8.1%  Elapsed: 0:00:11  Remaining: 0:02:03  1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 48, 11040, 11356, 12000]                                                                                                                                    │
│ Average cumulative reward:       -12.342334798723455                                                                                                                     │
│ Average rollout reward:          -12.001487808974574                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
13/149  8.7%  Elapsed: 0:00:12  Remaining: 0:02:02  1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 12831, 12902, 12904, 12912, 12914, 12934, 13000]                                                                                                            │
│ Average cumulative reward:       -11.927961346357105                                                                                                                     │
│ Average rollout reward:          -11.57775799653481                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
14/149  9.4%  Elapsed: 0:00:12  Remaining: 0:02:01  1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 135, 504, 13455, 13954, 14000]                                                                                                                              │
│ Average cumulative reward:       -11.253404437355982                                                                                                                     │
│ Average rollout reward:          -10.886861934774215                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
15/149  10.1%  Elapsed: 0:00:13  Remaining: 0:02:01  1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 2564, 8571, 8573, 15000]                                                                                                                                    │
│ Average cumulative reward:       -12.275035754735397                                                                                                                     │
│ Average rollout reward:          -11.924046130757816                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
16/149  10.7%  Elapsed: 0:00:14  Remaining: 0:02:00  1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 4277, 4283, 4296, 15140, 15951, 16000]                                                                                                                      │
│ Average cumulative reward:       -11.814216890007774                                                                                                                     │
│ Average rollout reward:          -11.46341568004612                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
17/149  11.4%  Elapsed: 0:00:15  Remaining: 0:01:59  1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 16787, 17000]                                                                                                                                               │
│ Average cumulative reward:       -12.245546655550235                                                                                                                     │
│ Average rollout reward:          -11.879831517768938                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
18/149  12.1%  Elapsed: 0:00:16  Remaining: 0:01:59  1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 364, 992, 1041, 10058, 18000]                                                                                                                               │
│ Average cumulative reward:       -11.209152350638378                                                                                                                     │
│ Average rollout reward:          -10.853496523836707                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
19/149  12.8%  Elapsed: 0:00:17  Remaining: 0:01:58  1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 18991, 18993, 19000]                                                                                                                                        │
│ Average cumulative reward:       -12.055860933858817                                                                                                                     │
│ Average rollout reward:          -11.697503121811884                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/149 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m12.8%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:01:58[0m   1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 18991, 18993, 19000]                                                                                                                                        │
│ Average cumulative reward:       -12.055860933858817                                                                                                                     │
│ Average rollout reward:          -11.697503121811884                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/149 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.4%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:01:58[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 19919, 19922, 19981, 19993, 20000]                                                                                                                          │
│ Average cumulative reward:       -13.565476324552664                                                                                                                     │
│ Average rollout reward:          -13.231442948826857                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/149 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.4%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:01:58[0m   1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 19919, 19922, 19981, 19993, 20000]                                                                                                                          │
│ Average cumulative reward:       -13.565476324552664                                                                                                                     │
│ Average rollout reward:          -13.231442948826857                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/149 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m14.1%[0m Elapsed: [33m0:00:19[0m Remaining: [36m0:01:57[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 12474, 20942, 20946, 21000]                                                                                                                                 │
│ Average cumulative reward:       -11.748030256879627                                                                                                                     │
│ Average rollout reward:          -11.405139973680399                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/149 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m14.1%[0m Elapsed: [33m0:00:19[0m Remaining: [36m0:01:57[0m   1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 12474, 20942, 20946, 21000]                                                                                                                                 │
│ Average cumulative reward:       -11.748030256879627                                                                                                                     │
│ Average rollout reward:          -11.405139973680399                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/149 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m14.8%[0m Elapsed: [33m0:00:20[0m Remaining: [36m0:01:56[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 6, 20648, 20745, 21992, 22000]                                                                                                                              │
│ Average cumulative reward:       -11.105409714740537                                                                                                                     │
│ Average rollout reward:          -10.753921692979349                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/149 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m14.8%[0m Elapsed: [33m0:00:20[0m Remaining: [36m0:01:56[0m   1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 6, 20648, 20745, 21992, 22000]                                                                                                                              │
│ Average cumulative reward:       -11.105409714740537                                                                                                                     │
│ Average rollout reward:          -10.753921692979349                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/149 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.4%[0m Elapsed: [33m0:00:21[0m Remaining: [36m0:01:55[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 135, 13712, 13717, 13733, 14957, 23000]                                                                                                                     │
│ Average cumulative reward:       -11.475329235772733                                                                                                                     │
│ Average rollout reward:          -11.09689731893292                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/149 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.4%[0m Elapsed: [33m0:00:21[0m Remaining: [36m0:01:55[0m   1.06 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 135, 13712, 13717, 13733, 14957, 23000]                                                                                                                     │
│ Average cumulative reward:       -11.475329235772733                                                                                                                     │
│ Average rollout reward:          -11.09689731893292                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/149 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.1%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:01:54[0m   1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 23902, 23991, 23995, 24000]                                                                                                                                 │
│ Average cumulative reward:       -11.422091261537686                                                                                                                     │
│ Average rollout reward:          -11.078094797178878                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/149 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.8%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:01:53[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 24968, 24985, 25000]                                                                                                                                        │
│ Average cumulative reward:       -10.876593829548403                                                                                                                     │
│ Average rollout reward:          -10.547859026410759                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/149 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.8%[0m Elapsed: [33m0:00:23[0m Remaining: [36m0:01:53[0m   1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 24968, 24985, 25000]                                                                                                                                        │
│ Average cumulative reward:       -10.876593829548403                                                                                                                     │
│ Average rollout reward:          -10.547859026410759                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/149 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.4%[0m Elapsed: [33m0:00:23[0m Remaining: [36m0:01:51[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3927, 23458, 23559, 26000]                                                                                                                                  │
│ Average cumulative reward:       -10.821441406722524                                                                                                                     │
│ Average rollout reward:          -10.461560183218122                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/149 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.4%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:01:51[0m   1.08 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3927, 23458, 23559, 26000]                                                                                                                                  │
│ Average cumulative reward:       -10.821441406722524                                                                                                                     │
│ Average rollout reward:          -10.461560183218122                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/149 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m18.1%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:01:51[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 1330, 27000]                                                                                                                                                │
│ Average cumulative reward:       -11.45322382049495                                                                                                                      │
│ Average rollout reward:          -11.090402497018179                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/149 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m18.1%[0m Elapsed: [33m0:00:25[0m Remaining: [36m0:01:51[0m   1.07 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 1330, 27000]                                                                                                                                                │
│ Average cumulative reward:       -11.45322382049495                                                                                                                      │
│ Average rollout reward:          -11.090402497018179                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/149 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m18.8%[0m Elapsed: [33m0:00:25[0m Remaining: [36m0:01:50[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 27755, 28000]                                                                                                                                               │
│ Average cumulative reward:       -12.470745326956818                                                                                                                     │
│ Average rollout reward:          -12.159106234496129                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
19.5% Elapsed: 0:00:26 Remaining: 0:01:49 1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 28921, 28924, 28926, 28933, 28969, 28978, 29000]                                                                                                            │
│ Average cumulative reward:       -11.260509364315118                                                                                                                     │
│ Average rollout reward:          -10.890451470360597                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/149 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.5%[0m Elapsed: [33m0:00:26[0m Remaining: [36m0:01:49[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/149 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.1%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:48[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 19452, 19635, 19638, 30000]                                                                                                                                 │
│ Average cumulative reward:       -11.23917265012535                                                                                                                      │
│ Average rollout reward:          -10.871709371887414                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/149 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.1%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:48[0m   1.08 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/149 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:47[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 6598, 6630, 6632, 6934, 29332, 31000]                                                                                                                       │
│ Average cumulative reward:       -11.374101596053801                                                                                                                     │
│ Average rollout reward:          -11.010349052826811                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/149 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:47[0m   1.08 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/149 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:46[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 31961, 31963, 31975, 31990, 32000]                                                                                                                          │
│ Average cumulative reward:       -11.249848972444097                                                                                                                     │
│ Average rollout reward:          -10.88852248031039                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/149 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.1%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:45[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 19452, 31574, 32330, 32974, 33000]                                                                                                                          │
│ Average cumulative reward:       -11.439168637573587                                                                                                                     │
│ Average rollout reward:          -11.087214693776408                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/149 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.1%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:45[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/149 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:44[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 33875, 33983, 33985, 33989, 33994, 34000]                                                                                                                   │
│ Average cumulative reward:       -11.128684974694536                                                                                                                     │
│ Average rollout reward:          -10.778205532068753                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/149 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:44[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/149 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m23.5%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:43[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 7858, 31323, 31439, 31597, 31671, 31725, 32568, 35000]                                                                                                      │
│ Average cumulative reward:       -11.278130294361375                                                                                                                     │
│ Average rollout reward:          -10.890529916758979                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/149 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m23.5%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:43[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K36/149 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.2%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:43[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 2987, 3010, 30926, 31603, 35528, 36000]                                                                                                                     │
│ Average cumulative reward:       -11.085015737191263                                                                                                                     │
│ Average rollout reward:          -10.702190824896325                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K36/149 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.2%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:43[0m   1.08 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/149 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.8%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:42[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 18090, 36451, 36988, 36999, 37000]                                                                                                                          │
│ Average cumulative reward:       -11.173625546802514                                                                                                                     │
│ Average rollout reward:          -10.80197939836243                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/149 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.5%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:41[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 135, 38000]                                                                                                                                                 │
│ Average cumulative reward:       -11.009331234588833                                                                                                                     │
│ Average rollout reward:          -10.661531479940786                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/149 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.5%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:41[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/149 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.2%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:40[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 160, 38367, 38879, 38913, 39000]                                                                                                                            │
│ Average cumulative reward:       -10.944271414827837                                                                                                                     │
│ Average rollout reward:          -10.610099877676623                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/149 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.2%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:40[0m   1.09 iters/s
40/149 26.8% Elapsed: 0:00:36 Remaining: 0:01:39 1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 39310, 39903, 39904, 40000]                                                                                                                                 │
│ Average cumulative reward:       -10.960468996812272                                                                                                                     │
│ Average rollout reward:          -10.609679913096825                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/149 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.8%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:39[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.5%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:39[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3133, 40972, 40975, 40980, 41000]                                                                                                                           │
│ Average cumulative reward:       -11.210708513312648                                                                                                                     │
│ Average rollout reward:          -10.86499189947252                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m28.2%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:38[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 11113, 41855, 41857, 41996, 42000]                                                                                                                          │
│ Average cumulative reward:       -11.106472202829831                                                                                                                     │
│ Average rollout reward:          -10.772781001665892                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m28.2%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:38[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m28.9%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:37[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 42958, 42990, 42992, 43000]                                                                                                                                 │
│ Average cumulative reward:       -11.049157796405886                                                                                                                     │
│ Average rollout reward:          -10.69362635649653                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m28.9%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:37[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.5%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:35[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 43711, 43743, 43745, 43878, 43883, 44000]                                                                                                                   │
│ Average cumulative reward:       -11.74255269479046                                                                                                                      │
│ Average rollout reward:          -11.399243938304819                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/149 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.5%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:35[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.2%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:34[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 44472, 44971, 44974, 44981, 44986, 45000]                                                                                                                   │
│ Average cumulative reward:       -11.271763748803743                                                                                                                     │
│ Average rollout reward:          -10.875979297902546                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.2%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:34[0m   1.09 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.9%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:33[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3758, 11419, 11529, 13466, 41782, 46000]                                                                                                                    │
│ Average cumulative reward:       -11.97513675763359                                                                                                                      │
│ Average rollout reward:          -11.609729017741303                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.5%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:32[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 46802, 46818, 46820, 46959, 46967, 47000]                                                                                                                   │
│ Average cumulative reward:       -11.659680842941059                                                                                                                     │
│ Average rollout reward:          -11.295123558647099                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.5%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:32[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.2%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:32[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 33875, 34233, 34408, 34457, 34460, 48000]                                                                                                                   │
│ Average cumulative reward:       -11.56163751151933                                                                                                                      │
│ Average rollout reward:          -11.17003588176201                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/149 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.2%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:32[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/149 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:31[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 48397, 48429, 48433, 48990, 49000]                                                                                                                          │
│ Average cumulative reward:       -11.36316292739126                                                                                                                      │
│ Average rollout reward:          -10.99449807180398                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
50/149 ━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━ 33.6% Elapsed: 0:00:45 Remaining: 0:01:30   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3283, 49962, 50000]                                                                                                                                         │
│ Average cumulative reward:       -11.77436686569711                                                                                                                      │
│ Average rollout reward:          -11.404083996376968                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
51/149 ━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.2% Elapsed: 0:00:46 Remaining: 0:01:29   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 50850, 50882, 50884, 50931, 50933, 51000]                                                                                                                   │
│ Average cumulative reward:       -10.803378051312698                                                                                                                     │
│ Average rollout reward:          -10.472064488234546                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
52/149 ━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.9% Elapsed: 0:00:46 Remaining: 0:01:28   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 51682, 52000]                                                                                                                                               │
│ Average cumulative reward:       -11.538775046262488                                                                                                                     │
│ Average rollout reward:          -11.171700863215982                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
53/149 ━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━ 35.6% Elapsed: 0:00:47 Remaining: 0:01:27   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 36532, 36555, 53000]                                                                                                                                        │
│ Average cumulative reward:       -11.089469003668603                                                                                                                     │
│ Average rollout reward:          -10.726690690201652                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
54/149 ━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━ 36.2% Elapsed: 0:00:48 Remaining: 0:01:26   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 135, 14060, 14068, 53993, 54000]                                                                                                                            │
│ Average cumulative reward:       -11.226010669535068                                                                                                                     │
│ Average rollout reward:          -10.858572814232032                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
55/149 ━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━ 36.9% Elapsed: 0:00:49 Remaining: 0:01:25   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 5045, 5591, 44344, 54943, 55000]                                                                                                                            │
│ Average cumulative reward:       -11.442231041877228                                                                                                                     │
│ Average rollout reward:          -11.05172040210803                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
56/149 ━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━ 37.6% Elapsed: 0:00:50 Remaining: 0:01:24   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 55973, 55974, 55985, 56000]                                                                                                                                 │
│ Average cumulative reward:       -11.315680311457395                                                                                                                     │
│ Average rollout reward:          -10.916495238345751                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
57/149 ━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━ 38.3% Elapsed: 0:00:51 Remaining: 0:01:23   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 11113, 49965, 50495, 50545, 53173, 57000]                                                                                                                   │
│ Average cumulative reward:       -12.248102340174144                                                                                                                     │
│ Average rollout reward:          -11.861984038716272                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
58/149 ━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━ 38.9% Elapsed: 0:00:52 Remaining: 0:01:22   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 57748, 57991, 57993, 57996, 58000]                                                                                                                          │
│ Average cumulative reward:       -11.23852307832032                                                                                                                      │
│ Average rollout reward:          -10.855372916853234                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.9232591869266003                                                                                                                             │
│ Best path: [0, 4, 1330, 5456, 5980, 7030, 7033, 7083, 7084]                                                                                                              │
│ [-2.73175647 -2.73175647 -2.73175647 -2.73175647 -2.46027144 -2.46027144                                                                                                 │
│  -2.46027144 -2.10880409 -2.10880409]                                                                                                                                    │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/149 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.6%[0m Elapsed: [33m0:00:53[0m Remaining: [36m0:01:22[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 58646, 58801, 58803, 58809, 59000]                                                                                                                          │
│ Average cumulative reward:       -10.736447746509933                                                                                                                     │
│ Average rollout reward:          -10.363205844445508                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.3%[0m Elapsed: [33m0:00:54[0m Remaining: [36m0:01:21[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 1519, 59972, 59977, 59978, 59988, 60000]                                                                                                                    │
│ Average cumulative reward:       -11.180243134958046                                                                                                                     │
│ Average rollout reward:          -10.798817215913601                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.9%[0m Elapsed: [33m0:00:55[0m Remaining: [36m0:01:20[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 58646, 60958, 60986, 61000]                                                                                                                                 │
│ Average cumulative reward:       -10.959819447695036                                                                                                                     │
│ Average rollout reward:          -10.550423996994622                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
41.6% Elapsed: 0:00:55 Remaining: 0:01:19   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 61393, 61636, 61992, 61993, 62000]                                                                                                                          │
│ Average cumulative reward:       -11.577317181846                                                                                                                        │
│ Average rollout reward:          -11.184042510210435                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m42.3%[0m Elapsed: [33m0:00:56[0m Remaining: [36m0:01:18[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 61393, 62127, 62129, 62176, 62184, 63000]                                                                                                                   │
│ Average cumulative reward:       -10.842121827568606                                                                                                                     │
│ Average rollout reward:          -10.475821175277337                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:00:57[0m Remaining: [36m0:01:17[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 64000]                                                                                                                                               │
│ Average cumulative reward:       -10.798977692428954                                                                                                                     │
│ Average rollout reward:          -10.429464735543172                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.6%[0m Elapsed: [33m0:00:58[0m Remaining: [36m0:01:16[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 64954, 64956, 64976, 65000]                                                                                                                          │
│ Average cumulative reward:       -10.755813659599708                                                                                                                     │
│ Average rollout reward:          -10.39631547123983                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:00:59[0m Remaining: [36m0:01:15[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 65967, 65970, 66000]                                                                                                                                 │
│ Average cumulative reward:       -10.519980307929828                                                                                                                     │
│ Average rollout reward:          -10.131962232830485                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m45.0%[0m Elapsed: [33m0:01:00[0m Remaining: [36m0:01:14[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 64464, 64466, 66747, 67000]                                                                                                                          │
│ Average cumulative reward:       -11.003884881517488                                                                                                                     │
│ Average rollout reward:          -10.552573247695765                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m45.6%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:01:13[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 63267, 63271, 63295, 63560, 66577, 68000]                                                                                                            │
│ Average cumulative reward:       -11.236760180271368                                                                                                                     │
│ Average rollout reward:          -10.840358173953833                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.3%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:13[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 68889, 68895, 68904, 68914, 69000]                                                                                                                   │
│ Average cumulative reward:       -11.343261447821822                                                                                                                     │
│ Average rollout reward:          -10.914193794784568                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m47.0%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:12[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 1519, 59587, 59588, 70000]                                                                                                                                  │
│ Average cumulative reward:       -11.112029972958927                                                                                                                     │
│ Average rollout reward:          -10.74858756921658                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m47.7%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:11[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 70093, 70965, 70969, 70974, 70980, 71000]                                                                                                                   │
│ Average cumulative reward:       -10.802871527184395                                                                                                                     │
│ Average rollout reward:          -10.441049206710538                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.3%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:01:10[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 10788, 71898, 72000]                                                                                                                                        │
│ Average cumulative reward:       -11.67143830467489                                                                                                                      │
│ Average rollout reward:          -11.292359228903525                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━━━━━━━━━[0m [35m49.0%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:01:09[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 63266, 72018, 72033, 72897, 72902, 73000]                                                                                                                   │
│ Average cumulative reward:       -11.191397041964686                                                                                                                     │
│ Average rollout reward:          -10.82115495153879                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.7%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:01:08[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73703, 73976, 73979, 73990, 74000]                                                                                                                   │
│ Average cumulative reward:       -10.765673296791531                                                                                                                     │
│ Average rollout reward:          -10.40655924573346                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.3%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:01:07[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73884, 73886, 73912, 74441, 75000]                                                                                                                   │
│ Average cumulative reward:       -11.157803824248091                                                                                                                     │
│ Average rollout reward:          -10.78573944719217                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.0%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:01:07[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 75950, 75953, 75976, 75977, 76000]                                                                                                                   │
│ Average cumulative reward:       -10.919387013882224                                                                                                                     │
│ Average rollout reward:          -10.543822452852941                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.7%[0m Elapsed: [33m0:01:09[0m Remaining: [36m0:01:06[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 74541, 74544, 77000]                                                                                                                                 │
│ Average cumulative reward:       -10.852751057550591                                                                                                                     │
│ Average rollout reward:          -10.416013393591525                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.731756471363751                                                                                                                              │
│ Best path: [0, 4, 1519, 8565, 55509, 56959, 57129, 57359, 58267]                                                                                                         │
│ [-2.65177415 -2.65177415 -2.65177415 -2.65177415 -2.38028912 -2.38028912                                                                                                 │
│  -2.38028912 -2.10880409 -2.10880409]                                                                                                                                    │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m52.3%[0m Elapsed: [33m0:01:10[0m Remaining: [36m0:01:05[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73258, 77998, 78000]                                                                                                                                 │
│ Average cumulative reward:       -11.183181150124227                                                                                                                     │
│ Average rollout reward:          -10.734682809814489                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K79/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.0%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:01:04[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 79000 ===                                                                                                                                                  │
│ 79001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 78420, 78430, 78997, 79000]                                                                                                                                 │
│ Average cumulative reward:       -11.458126721628856                                                                                                                     │
│ Average rollout reward:          -11.064702504154715                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K80/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.7%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:01:03[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 80000 ===                                                                                                                                                  │
│ 80001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73239, 73241, 79986, 79988, 80000]                                                                                                                   │
│ Average cumulative reward:       -11.099544014153432                                                                                                                     │
│ Average rollout reward:          -10.644632021958838                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K81/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:01:02[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 81000 ===                                                                                                                                                  │
│ 81001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 80590, 80907, 80909, 81000]                                                                                                                                 │
│ Average cumulative reward:       -11.003380899506759                                                                                                                     │
│ Average rollout reward:          -10.597311577638353                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K82/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.0%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:01:01[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 82000 ===                                                                                                                                                  │
│ 82001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 81688, 81731, 81733, 82000]                                                                                                                                 │
│ Average cumulative reward:       -11.103128568441191                                                                                                                     │
│ Average rollout reward:          -10.729634149579697                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K83/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:01:00[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 83000 ===                                                                                                                                                  │
│ 83001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 82939, 82941, 82983, 83000]                                                                                                                          │
│ Average cumulative reward:       -11.266543225569347                                                                                                                     │
│ Average rollout reward:          -10.833601302103848                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
84/149 56.4% Elapsed: 0:01:15 Remaining: 0:00:59   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 84000 ===                                                                                                                                                  │
│ 84001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 77049, 77051, 77131, 77134, 77141, 84000]                                                                                                            │
│ Average cumulative reward:       -11.219366647599516                                                                                                                     │
│ Average rollout reward:          -10.752625454044546                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K85/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:00:58[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 85000 ===                                                                                                                                                  │
│ 85001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 1519, 8565, 55509, 61330, 68651, 69009, 85000]                                                                                                              │
│ Average cumulative reward:       -11.195765829971318                                                                                                                     │
│ Average rollout reward:          -10.796155299051163                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K86/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m57.7%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:00:58[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 86000 ===                                                                                                                                                  │
│ 86001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 61393, 61710, 85939, 85972, 85974, 86000]                                                                                                                   │
│ Average cumulative reward:       -10.757734024604957                                                                                                                     │
│ Average rollout reward:          -10.387569700095133                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K87/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.4%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:00:57[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 87000 ===                                                                                                                                                  │
│ 87001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 61393, 61548, 61552, 86719, 86937, 87000]                                                                                                                   │
│ Average cumulative reward:       -11.196098064086309                                                                                                                     │
│ Average rollout reward:          -10.790393997399192                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K88/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.1%[0m Elapsed: [33m0:01:19[0m Remaining: [36m0:00:56[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 88000 ===                                                                                                                                                  │
│ 88001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 61393, 87867, 87869, 87983, 87988, 88000]                                                                                                                   │
│ Average cumulative reward:       -11.190064295558974                                                                                                                     │
│ Average rollout reward:          -10.769510072923365                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K89/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.7%[0m Elapsed: [33m0:01:20[0m Remaining: [36m0:00:55[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 89000 ===                                                                                                                                                  │
│ 89001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 88469, 88577, 88992, 88996, 89000]                                                                                                                          │
│ Average cumulative reward:       -11.383273550861457                                                                                                                     │
│ Average rollout reward:          -10.982623899610536                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K90/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.4%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:00:54[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 90000 ===                                                                                                                                                  │
│ 90001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 3758, 55897, 55901, 84704, 90000]                                                                                                                           │
│ Average cumulative reward:       -11.415866185011694                                                                                                                     │
│ Average rollout reward:          -11.050525200435395                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K91/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m61.1%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:00:53[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 91000 ===                                                                                                                                                  │
│ 91001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 90802, 90845, 90847, 91000]                                                                                                                                 │
│ Average cumulative reward:       -11.243643275982507                                                                                                                     │
│ Average rollout reward:          -10.846042414666444                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K92/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m61.7%[0m Elapsed: [33m0:01:23[0m Remaining: [36m0:00:52[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 92000 ===                                                                                                                                                  │
│ 92001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 91983, 91989, 92000]                                                                                                                                        │
│ Average cumulative reward:       -11.313769588497506                                                                                                                     │
│ Average rollout reward:          -10.915409984092548                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K93/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.4%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:00:51[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 93000 ===                                                                                                                                                  │
│ 93001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 5251, 9168, 84037, 84315, 85511, 93000]                                                                                                                     │
│ Average cumulative reward:       -11.565706517181082                                                                                                                     │
│ Average rollout reward:          -11.163194200997689                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K94/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.1%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:00:50[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 94000 ===                                                                                                                                                  │
│ 94001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 93172, 93973, 93975, 93995, 94000]                                                                                                                          │
│ Average cumulative reward:       -11.51710484756574                                                                                                                      │
│ Average rollout reward:          -11.142176922894445                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
95/149 63.8% Elapsed: 0:01:26 Remaining: 0:00:49   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 95000 ===                                                                                                                                                  │
│ 95001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 12831, 71901, 72081, 84264, 85633, 95000]                                                                                                                   │
│ Average cumulative reward:       -11.251676678727524                                                                                                                     │
│ Average rollout reward:          -10.833170996728422                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K96/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.4%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:00:49[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 96000 ===                                                                                                                                                  │
│ 96001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 95579, 95981, 95984, 95995, 96000]                                                                                                                          │
│ Average cumulative reward:       -10.83923947567939                                                                                                                      │
│ Average rollout reward:          -10.458894259959777                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K97/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.1%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:00:48[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 97000 ===                                                                                                                                                  │
│ 97001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 96797, 96979, 96981, 96988, 96996, 97000]                                                                                                                   │
│ Average cumulative reward:       -10.37753340564781                                                                                                                      │
│ Average rollout reward:          -10.006059674442286                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K98/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:00:47[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 98000 ===                                                                                                                                                  │
│ 98001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73173, 73177, 73217, 73278, 98000]                                                                                                                   │
│ Average cumulative reward:       -11.080533463925756                                                                                                                     │
│ Average rollout reward:          -10.692147475250858                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K98/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:00:47[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 98000 ===                                                                                                                                                  │
│ 98001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 73150, 73173, 73177, 73217, 73278, 98000]                                                                                                                   │
│ Average cumulative reward:       -11.080533463925756                                                                                                                     │
│ Average rollout reward:          -10.692147475250858                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K99/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m66.4%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:00:46[0m   1.11 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 99000 ===                                                                                                                                                  │
│ 99001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 45241, 45521, 45525, 98875, 98886, 98927, 98941, 98957, 99000]                                                                                              │
│ Average cumulative reward:       -12.032466718645278                                                                                                                     │
│ Average rollout reward:          -11.643880799315578                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K99/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m66.4%[0m Elapsed: [33m0:01:30[0m Remaining: [36m0:00:46[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 99000 ===                                                                                                                                                  │
│ 99001  nodes in tree                                                                                                                                                     │
│ Path: [0, 4, 45241, 45521, 45525, 98875, 98886, 98927, 98941, 98957, 99000]                                                                                              │
│ Average cumulative reward:       -12.032466718645278                                                                                                                     │
│ Average rollout reward:          -11.643880799315578                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K100/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:30[0m Remaining: [36m0:00:45[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 100000 ===                                                                                                                                                 │
│ 100001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 45241, 99973, 99977, 99983, 100000]                                                                                                                         │
│ Average cumulative reward:       -10.604368882808803                                                                                                                     │
│ Average rollout reward:          -10.202919630912803                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K100/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:00:45[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 100000 ===                                                                                                                                                 │
│ 100001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 45241, 99973, 99977, 99983, 100000]                                                                                                                         │
│ Average cumulative reward:       -10.604368882808803                                                                                                                     │
│ Average rollout reward:          -10.202919630912803                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K101/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m67.8%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:00:44[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 101000 ===                                                                                                                                                 │
│ 101001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 100504, 100505, 100509, 100516, 101000]                                                                                                                     │
│ Average cumulative reward:       -10.582537368112616                                                                                                                     │
│ Average rollout reward:          -10.202195677242333                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K101/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m67.8%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:00:44[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 101000 ===                                                                                                                                                 │
│ 101001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 100504, 100505, 100509, 100516, 101000]                                                                                                                     │
│ Average cumulative reward:       -10.582537368112616                                                                                                                     │
│ Average rollout reward:          -10.202195677242333                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K102/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.5%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:00:43[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 102000 ===                                                                                                                                                 │
│ 102001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 101759, 101761, 101767, 101983, 101999, 102000]                                                                                                             │
│ Average cumulative reward:       -11.727512287506876                                                                                                                     │
│ Average rollout reward:          -11.335594805121175                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K102/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.5%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:00:43[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 102000 ===                                                                                                                                                 │
│ 102001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 101759, 101761, 101767, 101983, 101999, 102000]                                                                                                             │
│ Average cumulative reward:       -11.727512287506876                                                                                                                     │
│ Average rollout reward:          -11.335594805121175                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K103/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.1%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:00:43[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 103000 ===                                                                                                                                                 │
│ 103001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 67116, 67224, 84790, 103000]                                                                                                                                │
│ Average cumulative reward:       -13.435211969829131                                                                                                                     │
│ Average rollout reward:          -13.029972708116997                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K103/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.1%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:43[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 103000 ===                                                                                                                                                 │
│ 103001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 67116, 67224, 84790, 103000]                                                                                                                                │
│ Average cumulative reward:       -13.435211969829131                                                                                                                     │
│ Average rollout reward:          -13.029972708116997                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K104/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.8%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:42[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 104000 ===                                                                                                                                                 │
│ 104001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 103023, 103969, 103973, 103999, 104000]                                                                                                                     │
│ Average cumulative reward:       -11.9015819903508                                                                                                                       │
│ Average rollout reward:          -11.549429734286635                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K105/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.5%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:41[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 105000 ===                                                                                                                                                 │
│ 105001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 160, 50512, 53064, 104854, 105000]                                                                                                                          │
│ Average cumulative reward:       -11.752957918682844                                                                                                                     │
│ Average rollout reward:          -11.362958703443253                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K105/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.5%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:41[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 105000 ===                                                                                                                                                 │
│ 105001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 160, 50512, 53064, 104854, 105000]                                                                                                                          │
│ Average cumulative reward:       -11.752957918682844                                                                                                                     │
│ Average rollout reward:          -11.362958703443253                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
106/149 71.1% Elapsed: 0:01:36 Remaining: 0:00:40   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 106000 ===                                                                                                                                                 │
│ 106001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 73150, 105912, 105917, 105996, 106000]                                                                                                                      │
│ Average cumulative reward:       -11.005065474270447                                                                                                                     │
│ Average rollout reward:          -10.609340691836698                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K106/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m71.1%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:40[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 106000 ===                                                                                                                                                 │
│ 106001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 73150, 105912, 105917, 105996, 106000]                                                                                                                      │
│ Average cumulative reward:       -11.005065474270447                                                                                                                     │
│ Average rollout reward:          -10.609340691836698                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K107/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m71.8%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:39[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 107000 ===                                                                                                                                                 │
│ 107001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 106873, 106877, 106892, 106894, 107000]                                                                                                             │
│ Average cumulative reward:       -10.737919940711418                                                                                                                     │
│ Average rollout reward:          -10.341545346887228                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K107/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m71.8%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:39[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K108/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.5%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:38[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 108000 ===                                                                                                                                                 │
│ 108001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 106980, 107895, 107932, 107934, 107968, 108000]                                                                                                     │
│ Average cumulative reward:       -11.721950570827273                                                                                                                     │
│ Average rollout reward:          -11.333848887638828                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K108/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.5%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:38[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K109/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.2%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:37[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 109000 ===                                                                                                                                                 │
│ 109001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 108912, 108915, 108930, 108935, 108962, 109000]                                                                                                     │
│ Average cumulative reward:       -10.848126456207567                                                                                                                     │
│ Average rollout reward:          -10.477967617707417                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K110/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m73.8%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:36[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 110000 ===                                                                                                                                                 │
│ 110001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 109486, 109888, 109892, 109897, 109915, 109919, 109979, 110000]                                                                                             │
│ Average cumulative reward:       -10.879822971736088                                                                                                                     │
│ Average rollout reward:          -10.489393404657381                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K110/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m73.8%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:36[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K111/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.5%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:35[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 111000 ===                                                                                                                                                 │
│ 111001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 109486, 110614, 110643, 110673, 111000]                                                                                                                     │
│ Average cumulative reward:       -11.124121073941936                                                                                                                     │
│ Average rollout reward:          -10.72662709847464                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K111/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.5%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:35[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K112/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.2%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:34[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 112000 ===                                                                                                                                                 │
│ 112001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 109486, 111903, 111938, 111964, 112000]                                                                                                                     │
│ Average cumulative reward:       -11.679845530090745                                                                                                                     │
│ Average rollout reward:          -11.262402387183903                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K112/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.2%[0m Elapsed: [33m0:01:42[0m Remaining: [36m0:00:34[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K113/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.8%[0m Elapsed: [33m0:01:42[0m Remaining: [36m0:00:33[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 113000 ===                                                                                                                                                 │
│ 113001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 112137, 112586, 112994, 112998, 113000]                                                                                                                     │
│ Average cumulative reward:       -10.877448390712685                                                                                                                     │
│ Average rollout reward:          -10.508594189383498                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K113/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.8%[0m Elapsed: [33m0:01:43[0m Remaining: [36m0:00:33[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K114/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m76.5%[0m Elapsed: [33m0:01:43[0m Remaining: [36m0:00:33[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 114000 ===                                                                                                                                                 │
│ 114001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 113477, 113794, 113797, 113968, 114000]                                                                                                                     │
│ Average cumulative reward:       -10.792718254528104                                                                                                                     │
│ Average rollout reward:          -10.400841797311967                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K115/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:44[0m Remaining: [36m0:00:32[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 115000 ===                                                                                                                                                 │
│ 115001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 114827, 114898, 114900, 114903, 114978, 115000]                                                                                                             │
│ Average cumulative reward:       -11.337591999855208                                                                                                                     │
│ Average rollout reward:          -10.960095222173189                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K115/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:44[0m Remaining: [36m0:00:32[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K116/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m77.9%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:31[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 116000 ===                                                                                                                                                 │
│ 116001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 113477, 113566, 113569, 113996, 114694, 116000]                                                                                                             │
│ Average cumulative reward:       -11.519785049480351                                                                                                                     │
│ Average rollout reward:          -11.102945454569463                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K116/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m77.9%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:31[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 116000 ===                                                                                                                                                 │
│ 116001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 113477, 113566, 113569, 113996, 114694, 116000]                                                                                                             │
│ Average cumulative reward:       -11.519785049480351                                                                                                                     │
│ Average rollout reward:          -11.102945454569463                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
117/149 78.5% Elapsed: 0:01:46 Remaining: 0:00:30   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 117000 ===                                                                                                                                                 │
│ 117001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 72122, 72193, 72195, 72395, 72554, 117000]                                                                                                                  │
│ Average cumulative reward:       -11.554810553372693                                                                                                                     │
│ Average rollout reward:          -11.154705308260866                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K117/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:01:46[0m Remaining: [36m0:00:30[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 117000 ===                                                                                                                                                 │
│ 117001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 72122, 72193, 72195, 72395, 72554, 117000]                                                                                                                  │
│ Average cumulative reward:       -11.554810553372693                                                                                                                     │
│ Average rollout reward:          -11.154705308260866                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K118/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.2%[0m Elapsed: [33m0:01:47[0m Remaining: [36m0:00:29[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 118000 ===                                                                                                                                                 │
│ 118001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 117555, 117558, 117563, 118000]                                                                                                                             │
│ Average cumulative reward:       -11.339320957118947                                                                                                                     │
│ Average rollout reward:          -10.949779810309003                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K118/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.2%[0m Elapsed: [33m0:01:47[0m Remaining: [36m0:00:29[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 118000 ===                                                                                                                                                 │
│ 118001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 117555, 117558, 117563, 118000]                                                                                                                             │
│ Average cumulative reward:       -11.339320957118947                                                                                                                     │
│ Average rollout reward:          -10.949779810309003                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K119/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.9%[0m Elapsed: [33m0:01:48[0m Remaining: [36m0:00:28[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 119000 ===                                                                                                                                                 │
│ 119001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 116712, 116766, 117490, 118653, 118678, 118767, 119000]                                                                                             │
│ Average cumulative reward:       -10.993511104698213                                                                                                                     │
│ Average rollout reward:          -10.583916291642327                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K119/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.9%[0m Elapsed: [33m0:01:48[0m Remaining: [36m0:00:28[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 119000 ===                                                                                                                                                 │
│ 119001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 116712, 116766, 117490, 118653, 118678, 118767, 119000]                                                                                             │
│ Average cumulative reward:       -10.993511104698213                                                                                                                     │
│ Average rollout reward:          -10.583916291642327                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K120/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m80.5%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:27[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 120000 ===                                                                                                                                                 │
│ 120001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 27184, 119208, 120000]                                                                                                                                      │
│ Average cumulative reward:       -11.247940899238515                                                                                                                     │
│ Average rollout reward:          -10.816796741279836                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K120/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m80.5%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:27[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 120000 ===                                                                                                                                                 │
│ 120001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 27184, 119208, 120000]                                                                                                                                      │
│ Average cumulative reward:       -11.247940899238515                                                                                                                     │
│ Average rollout reward:          -10.816796741279836                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K121/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.2%[0m Elapsed: [33m0:01:50[0m Remaining: [36m0:00:26[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 121000 ===                                                                                                                                                 │
│ 121001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 114827, 114843, 115540, 115560, 121000]                                                                                                                     │
│ Average cumulative reward:       -13.540579938939436                                                                                                                     │
│ Average rollout reward:          -13.15735074904532                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K122/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m81.9%[0m Elapsed: [33m0:01:50[0m Remaining: [36m0:00:25[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 122000 ===                                                                                                                                                 │
│ 122001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 121999, 122000]                                                                                                                                     │
│ Average cumulative reward:       -10.873466942476673                                                                                                                     │
│ Average rollout reward:          -10.469713473236695                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K122/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m81.9%[0m Elapsed: [33m0:01:51[0m Remaining: [36m0:00:25[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 122000 ===                                                                                                                                                 │
│ 122001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 121999, 122000]                                                                                                                                     │
│ Average cumulative reward:       -10.873466942476673                                                                                                                     │
│ Average rollout reward:          -10.469713473236695                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K123/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m82.6%[0m Elapsed: [33m0:01:51[0m Remaining: [36m0:00:24[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 123000 ===                                                                                                                                                 │
│ 123001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 122912, 122914, 122974, 122985, 122987, 123000]                                                                                                     │
│ Average cumulative reward:       -11.087331660838373                                                                                                                     │
│ Average rollout reward:          -10.686535929189988                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K123/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m82.6%[0m Elapsed: [33m0:01:52[0m Remaining: [36m0:00:24[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 123000 ===                                                                                                                                                 │
│ 123001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 122912, 122914, 122974, 122985, 122987, 123000]                                                                                                     │
│ Average cumulative reward:       -11.087331660838373                                                                                                                     │
│ Average rollout reward:          -10.686535929189988                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K124/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.2%[0m Elapsed: [33m0:01:52[0m Remaining: [36m0:00:23[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 124000 ===                                                                                                                                                 │
│ 124001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 121828, 122367, 124000]                                                                                                                             │
│ Average cumulative reward:       -10.402147914962296                                                                                                                     │
│ Average rollout reward:          -10.001638890743669                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K124/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.2%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:23[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 124000 ===                                                                                                                                                 │
│ 124001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 121828, 122367, 124000]                                                                                                                             │
│ Average cumulative reward:       -10.402147914962296                                                                                                                     │
│ Average rollout reward:          -10.001638890743669                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K125/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m83.9%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:22[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 125000 ===                                                                                                                                                 │
│ 125001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 124878, 124880, 124907, 124929, 125000]                                                                                                             │
│ Average cumulative reward:       -10.827204826947751                                                                                                                     │
│ Average rollout reward:          -10.395061895938861                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K125/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m83.9%[0m Elapsed: [33m0:01:54[0m Remaining: [36m0:00:22[0m   1.09 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 125000 ===                                                                                                                                                 │
│ 125001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 124878, 124880, 124907, 124929, 125000]                                                                                                             │
│ Average cumulative reward:       -10.827204826947751                                                                                                                     │
│ Average rollout reward:          -10.395061895938861                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K126/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.6%[0m Elapsed: [33m0:01:54[0m Remaining: [36m0:00:22[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 126000 ===                                                                                                                                                 │
│ 126001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 121791, 121793, 125227, 125245, 126000]                                                                                                             │
│ Average cumulative reward:       -11.376204634672158                                                                                                                     │
│ Average rollout reward:          -10.949412374272608                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K127/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m85.2%[0m Elapsed: [33m0:01:55[0m Remaining: [36m0:00:21[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 127000 ===                                                                                                                                                 │
│ 127001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 121720, 125530, 125533, 125665, 125678, 127000]                                                                                                             │
│ Average cumulative reward:       -10.669249703032598                                                                                                                     │
│ Average rollout reward:          -10.21833065671343                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K127/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m85.2%[0m Elapsed: [33m0:01:55[0m Remaining: [36m0:00:21[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 128000 ===                                                                                                                                                 │
│ 128001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 127410, 127945, 127948, 127956, 127978, 128000]                                                                                                             │
│ Average cumulative reward:       -10.554435287981091                                                                                                                     │
│ Average rollout reward:          -10.178461136800697                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K128/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m85.9%[0m Elapsed: [33m0:01:56[0m Remaining: [36m0:00:20[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K129/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m86.6%[0m Elapsed: [33m0:01:57[0m Remaining: [36m0:00:19[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 129000 ===                                                                                                                                                 │
│ 129001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 128857, 128965, 129000]                                                                                                                                     │
│ Average cumulative reward:       -10.794087843320009                                                                                                                     │
│ Average rollout reward:          -10.41394593756401                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K129/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m86.6%[0m Elapsed: [33m0:01:57[0m Remaining: [36m0:00:19[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K130/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.2%[0m Elapsed: [33m0:01:58[0m Remaining: [36m0:00:18[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 130000 ===                                                                                                                                                 │
│ 130001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 9251, 129888, 130000]                                                                                                                                       │
│ Average cumulative reward:       -11.16568466056033                                                                                                                      │
│ Average rollout reward:          -10.771795940616865                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K130/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.2%[0m Elapsed: [33m0:01:58[0m Remaining: [36m0:00:18[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K131/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m87.9%[0m Elapsed: [33m0:01:59[0m Remaining: [36m0:00:17[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 131000 ===                                                                                                                                                 │
│ 131001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 130314, 130984, 130987, 130990, 130995, 131000]                                                                                                             │
│ Average cumulative reward:       -10.695276938447636                                                                                                                     │
│ Average rollout reward:          -10.324823475058611                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K132/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:01:59[0m Remaining: [36m0:00:16[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 132000 ===                                                                                                                                                 │
│ 132001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 131781, 131824, 131885, 131997, 132000]                                                                                                                     │
│ Average cumulative reward:       -11.809717621100006                                                                                                                     │
│ Average rollout reward:          -11.439666306442392                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K132/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:02:00[0m Remaining: [36m0:00:16[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K133/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.3%[0m Elapsed: [33m0:02:00[0m Remaining: [36m0:00:15[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 133000 ===                                                                                                                                                 │
│ 133001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 106872, 106873, 106877, 106906, 131228, 133000]                                                                                                             │
│ Average cumulative reward:       -11.590663980165548                                                                                                                     │
│ Average rollout reward:          -11.177953329548203                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K133/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.3%[0m Elapsed: [33m0:02:01[0m Remaining: [36m0:00:15[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K134/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:01[0m Remaining: [36m0:00:14[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 134000 ===                                                                                                                                                 │
│ 134001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 133258, 133301, 133981, 133998, 134000]                                                                                                                     │
│ Average cumulative reward:       -10.731786693909784                                                                                                                     │
│ Average rollout reward:          -10.367050436654354                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K134/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:02[0m Remaining: [36m0:00:14[0m   1.10 iters/s
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K135/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m90.6%[0m Elapsed: [33m0:02:02[0m Remaining: [36m0:00:13[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 135000 ===                                                                                                                                                 │
│ 135001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 134776, 134780, 134820, 135000]                                                                                                                     │
│ Average cumulative reward:       -11.663280507396545                                                                                                                     │
│ Average rollout reward:          -11.289798803734612                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K136/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m91.3%[0m Elapsed: [33m0:02:03[0m Remaining: [36m0:00:12[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 136000 ===                                                                                                                                                 │
│ 136001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 134874, 135597, 136000]                                                                                                                             │
│ Average cumulative reward:       -10.958866866289831                                                                                                                     │
│ Average rollout reward:          -10.561982771934696                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K137/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m91.9%[0m Elapsed: [33m0:02:04[0m Remaining: [36m0:00:11[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 137000 ===                                                                                                                                                 │
│ 137001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 136895, 136899, 136904, 136999, 137000]                                                                                                             │
│ Average cumulative reward:       -10.829382673015205                                                                                                                     │
│ Average rollout reward:          -10.362444231258182                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K138/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m92.6%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:11[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 138000 ===                                                                                                                                                 │
│ 138001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 136325, 137981, 137987, 138000]                                                                                                                     │
│ Average cumulative reward:       -11.30263464402175                                                                                                                      │
│ Average rollout reward:          -10.837407486049742                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
139/149  93.3%  Elapsed: 0:02:06  Remaining: 0:00:10  1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 139000 ===                                                                                                                                                 │
│ 139001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 138907, 138911, 138986, 139000]                                                                                                                     │
│ Average cumulative reward:       -11.814963999280831                                                                                                                     │
│ Average rollout reward:          -11.39890804974961                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K140/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.0%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:09[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 140000 ===                                                                                                                                                 │
│ 140001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 139899, 139903, 139982, 140000]                                                                                                                     │
│ Average cumulative reward:       -11.234137678591829                                                                                                                     │
│ Average rollout reward:          -10.77666305519573                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K141/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.6%[0m Elapsed: [33m0:02:08[0m Remaining: [36m0:00:08[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 141000 ===                                                                                                                                                 │
│ 141001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 140790, 140806, 141000]                                                                                                                                     │
│ Average cumulative reward:       -12.756425765810368                                                                                                                     │
│ Average rollout reward:          -12.34149183709041                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K142/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m95.3%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:07[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 142000 ===                                                                                                                                                 │
│ 142001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 135414, 135416, 141615, 141665, 141688, 141692, 142000]                                                                                             │
│ Average cumulative reward:       -11.56675083982396                                                                                                                      │
│ Average rollout reward:          -11.096240058677166                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K143/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.0%[0m Elapsed: [33m0:02:10[0m Remaining: [36m0:00:06[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 143000 ===                                                                                                                                                 │
│ 143001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 142854, 142859, 142951, 142998, 143000]                                                                                                             │
│ Average cumulative reward:       -11.315517047270411                                                                                                                     │
│ Average rollout reward:          -10.846204060364117                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K144/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m96.6%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:05[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 144000 ===                                                                                                                                                 │
│ 144001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 143992, 143994, 143997, 144000]                                                                                                                     │
│ Average cumulative reward:       -10.87189287824296                                                                                                                      │
│ Average rollout reward:          -10.392858399267023                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K145/149 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.3%[0m Elapsed: [33m0:02:12[0m Remaining: [36m0:00:04[0m   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 145000 ===                                                                                                                                                 │
│ 145001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 134744, 135061, 144940, 144976, 145000]                                                                                                                     │
│ Average cumulative reward:       -11.30005641585292                                                                                                                      │
│ Average rollout reward:          -10.8384384417584                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
146/149 98.0% Elapsed: 0:02:13 Remaining: 0:00:03   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 146000 ===                                                                                                                                                 │
│ 146001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 145430, 145929, 145931, 145981, 145986, 146000]                                                                                                             │
│ Average cumulative reward:       -11.301261442756077                                                                                                                     │
│ Average rollout reward:          -10.91393103554615                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
147/149 98.7% Elapsed: 0:02:13 Remaining: 0:00:02   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 147000 ===                                                                                                                                                 │
│ 147001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 146997, 147000]                                                                                                                                             │
│ Average cumulative reward:       -11.036704712300546                                                                                                                     │
│ Average rollout reward:          -10.620170355391874                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
149/149 100.0% Elapsed: 0:02:15 Remaining: 0:00:00   1.10 iters/s
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 148000 ===                                                                                                                                                 │
│ 148001  nodes in tree                                                                                                                                                    │
│ Path: [0, 4, 1724, 132160, 132162, 146670, 148000]                                                                                                                       │
│ Average cumulative reward:       -11.411403080640468                                                                                                                     │
│ Average rollout reward:          -11.024204879862886                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.6517741549063536                                                                                                                             │
│ Best path: [0, 4, 73150, 75058, 75062, 76911, 77020, 77238, 77652]                                                                                                       │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Node 0 is not terminal. Continue.
Node 4 is not terminal. Continue.
Node 73150 is not terminal. Continue.
Node 75058 is not terminal. Continue.
Node 75062 is not terminal. Continue.
Node 76911 is not terminal. Continue.
Node 77020 is not terminal. Continue.
Node 77238 is not terminal. Continue.
Node 77652 is not terminal. Continue.
Node 77654 is not terminal. Continue.
Node 77723 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 2 is not terminal. Continue.
Node 4835 is not terminal. Continue.
Node 5012 is not terminal. Continue.
Node 6000 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 4 is not terminal. Continue.
Node 73150 is not terminal. Continue.
Node 75058 is not terminal. Continue.
Node 75062 is not terminal. Continue.
Node 76911 is not terminal. Continue.
Node 77020 is not terminal. Continue.
Node 77238 is not terminal. Continue.
Node 77652 is not terminal. Continue.
Node 83755 is not terminal. Continue.
Node 85464 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -11.640708570030707
sqrt_visser_coupled [2.9401197 1.0152249]
sqrt_visser_coupled [1.9652071 1.621771 ]
sqrt_nsv [3.6585667  0.14990641]
By Value: estimated reward: -11.449205854467856
sqrt_nsv [3.8434665 4.6530266]
By Best Value: estimated reward: 0
sqrt_visser_coupled [2.9401197 1.0152249]
sqrt_visser_coupled [1.9652071 1.621771 ]
sqrt_nsv [3.6585667 1.        0.        0.       ]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
Best value of root node:
-2.6517741549063536
Best root policy:
sqrt_visser_coupled [2.9401197 1.0152249]
sqrt_visser_coupled [1.9652071 1.621771 ]
sqrt_nsv [3.6585667 1.        0.        0.       ]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
sqrt_nsv [3, 1]
=== END ===
Finished making algorithm
