Matrix distribution: unif
Matrix distribution config: {'c': 0.25, 'd': 5000, 'eps': 0.001}
Initial matrix shape: torch.Size([5000, 5000])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-11, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 80000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cuda', 'actions': [['sign_ns', [[0, 0], [5, 5]]], ['sign_newton', [[0], [40]]], ['sign_quintic', [[0, 0, 0], [5, 5, 5]]], ['sign_halley', [[0, 0, 0], [40, 40, 40]]]], 'initialize_with_baselines': True}
Actions: ['sign_halley', 'sign_newton', 'sign_ns', 'sign_quintic']
Action sign_halley took 1.0 times longer than sign_halley
Action sign_newton took 0.32687511885531023 times longer than sign_halley
Action sign_ns took 0.3891708427208845 times longer than sign_halley
Action sign_quintic took 0.5864396385978199 times longer than sign_halley
Skipping sign_newton_variant because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_db because not all actions are in the tree
Skipping sqrt_nsv because not all actions are in the tree
Skipping sqrt_visser because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_visser_coupled because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
Skipping proot_newton because not all actions are in the tree
Skipping proot_visser because not all actions are in the tree
Skipping proot_iannazzo because not all actions are in the tree
[?25l0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:00[0m Remaining: [36m-:--:--[0m 502121.22 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-3.28812583 -3.28812583]                                                                                                                                                │
│ [-1.96125071 -1.96125071 -1.96125071]                                                                                                                                    │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 957, 958, 969, 1000]                                                                                                                                        │
│ Average cumulative reward:       -7.911532060476167                                                                                                                      │
│ Average rollout reward:          -7.566054087498051                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:23[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1679, 1683, 1694, 1698, 2000]                                                                                                                               │
│ Average cumulative reward:       -7.72619321514503                                                                                                                       │
│ Average rollout reward:          -7.308267474775788                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:19[0m   1.85 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 2949, 2950, 2994, 3000]                                                                                                                                     │
│ Average cumulative reward:       -7.850507248771517                                                                                                                      │
│ Average rollout reward:          -7.408226090717843                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:17[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 25, 26, 1276, 1278, 4000]                                                                                                                                   │
│ Average cumulative reward:       -7.614698239697202                                                                                                                      │
│ Average rollout reward:          -7.164571194305603                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:17[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 4291, 4292, 4305, 5000]                                                                                                                                     │
│ Average cumulative reward:       -7.88780847470696                                                                                                                       │
│ Average rollout reward:          -7.410433393850116                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:02:14[0m   1.85 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 5948, 5950, 5954, 6000]                                                                                                                                     │
│ Average cumulative reward:       -7.220114964281444                                                                                                                      │
│ Average rollout reward:          -6.7364393333520285                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:02:13[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 165, 167, 222, 278, 7000]                                                                                                                                   │
│ Average cumulative reward:       -7.781273066149877                                                                                                                      │
│ Average rollout reward:          -7.287741480352024                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:02:10[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 7943, 7945, 7968, 8000]                                                                                                                                     │
│ Average cumulative reward:       -7.593350230647774                                                                                                                      │
│ Average rollout reward:          -7.091263314658634                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318615                                                                                                                             │
│ Best path: [0, 2, 5]                                                                                                                                                     │
│ [-1.96125071 -1.96125071 -1.96125071 -1.63437559 -1.63437559 -1.30750048]                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:16[0m Remaining: [36m0:02:09[0m   1.85 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 9983, 9984, 9992, 10000]                                                                                                                                    │
│ Average cumulative reward:       -8.135323107895466                                                                                                                      │
│ Average rollout reward:          -7.6406658879403455                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
10/79 12.7% Elapsed: 0:00:19 Remaining: 0:02:07 1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10943, 10945, 10963, 11000]                                                                                                                                 │
│ Average cumulative reward:       -7.949180367394494                                                                                                                      │
│ Average rollout reward:          -7.4416615177063274                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:21[0m Remaining: [36m0:02:06[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11959, 11963, 11969, 12000]                                                                                                                                 │
│ Average cumulative reward:       -7.331570936369152                                                                                                                      │
│ Average rollout reward:          -6.81145863176666                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:02:04[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 256, 258, 261, 551, 13000]                                                                                                                                  │
│ Average cumulative reward:       -7.486220052527361                                                                                                                      │
│ Average rollout reward:          -6.9793459600577545                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:02:02[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11275, 11276, 14000]                                                                                                                                        │
│ Average cumulative reward:       -7.898767278825587                                                                                                                      │
│ Average rollout reward:          -7.377178056538293                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:26[0m Remaining: [36m0:02:01[0m   1.91 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:02:01[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11275, 11276, 14000]                                                                                                                                        │
│ Average cumulative reward:       -7.898767278825587                                                                                                                      │
│ Average rollout reward:          -7.377178056538293                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:59[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14936, 14937, 14948, 14954, 15000]                                                                                                                          │
│ Average cumulative reward:       -7.600273864156118                                                                                                                      │
│ Average rollout reward:          -7.0722553285549505                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:59[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14936, 14937, 14948, 14954, 15000]                                                                                                                          │
│ Average cumulative reward:       -7.600273864156118                                                                                                                      │
│ Average rollout reward:          -7.0722553285549505                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:59[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14936, 14937, 14948, 14954, 15000]                                                                                                                          │
│ Average cumulative reward:       -7.600273864156118                                                                                                                      │
│ Average rollout reward:          -7.0722553285549505                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:57[0m   1.86 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5288, 5290, 14915, 15279, 16000]                                                                                                                            │
│ Average cumulative reward:       -7.753730044047493                                                                                                                      │
│ Average rollout reward:          -7.204580557133072                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:57[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5288, 5290, 14915, 15279, 16000]                                                                                                                            │
│ Average cumulative reward:       -7.753730044047493                                                                                                                      │
│ Average rollout reward:          -7.204580557133072                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:57[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5288, 5290, 14915, 15279, 16000]                                                                                                                            │
│ Average cumulative reward:       -7.753730044047493                                                                                                                      │
│ Average rollout reward:          -7.204580557133072                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:57[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5288, 5290, 14915, 15279, 16000]                                                                                                                            │
│ Average cumulative reward:       -7.753730044047493                                                                                                                      │
│ Average rollout reward:          -7.204580557133072                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:56[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7154, 7156, 7182, 7184, 7227, 17000]                                                                                                                        │
│ Average cumulative reward:       -8.105807173399832                                                                                                                      │
│ Average rollout reward:          -7.562879072109944                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:56[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7154, 7156, 7182, 7184, 7227, 17000]                                                                                                                        │
│ Average cumulative reward:       -8.105807173399832                                                                                                                      │
│ Average rollout reward:          -7.562879072109944                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:56[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7154, 7156, 7182, 7184, 7227, 17000]                                                                                                                        │
│ Average cumulative reward:       -8.105807173399832                                                                                                                      │
│ Average rollout reward:          -7.562879072109944                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:56[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7154, 7156, 7182, 7184, 7227, 17000]                                                                                                                        │
│ Average cumulative reward:       -8.105807173399832                                                                                                                      │
│ Average rollout reward:          -7.562879072109944                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:54[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17876, 17878, 17911, 17915, 18000]                                                                                                                          │
│ Average cumulative reward:       -8.155151505275601                                                                                                                      │
│ Average rollout reward:          -7.612735777671023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:54[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17876, 17878, 17911, 17915, 18000]                                                                                                                          │
│ Average cumulative reward:       -8.155151505275601                                                                                                                      │
│ Average rollout reward:          -7.612735777671023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:54[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17876, 17878, 17911, 17915, 18000]                                                                                                                          │
│ Average cumulative reward:       -8.155151505275601                                                                                                                      │
│ Average rollout reward:          -7.612735777671023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:54[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17876, 17878, 17911, 17915, 18000]                                                                                                                          │
│ Average cumulative reward:       -8.155151505275601                                                                                                                      │
│ Average rollout reward:          -7.612735777671023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:52[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 3417, 3419, 18983, 18984, 18992, 19000]                                                                                                                     │
│ Average cumulative reward:       -7.9940925406999215                                                                                                                     │
│ Average rollout reward:          -7.433522830022161                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:52[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 3417, 3419, 18983, 18984, 18992, 19000]                                                                                                                     │
│ Average cumulative reward:       -7.9940925406999215                                                                                                                     │
│ Average rollout reward:          -7.433522830022161                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:52[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 3417, 3419, 18983, 18984, 18992, 19000]                                                                                                                     │
│ Average cumulative reward:       -7.9940925406999215                                                                                                                     │
│ Average rollout reward:          -7.433522830022161                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:50[0m   1.86 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7412, 7413, 19169, 20000]                                                                                                                                   │
│ Average cumulative reward:       -8.153147301966904                                                                                                                      │
│ Average rollout reward:          -7.603142842743208                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
21/79 ━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.6% Elapsed: 0:00:39 Remaining: 0:01:49   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 957, 959, 967, 987, 21000]                                                                                                                                  │
│ Average cumulative reward:       -8.260122667888295                                                                                                                      │
│ Average rollout reward:          -7.684637059704605                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:49[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 957, 959, 967, 987, 21000]                                                                                                                                  │
│ Average cumulative reward:       -8.260122667888295                                                                                                                      │
│ Average rollout reward:          -7.684637059704605                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:49[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 957, 959, 967, 987, 21000]                                                                                                                                  │
│ Average cumulative reward:       -8.260122667888295                                                                                                                      │
│ Average rollout reward:          -7.684637059704605                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:49[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 957, 959, 967, 987, 21000]                                                                                                                                  │
│ Average cumulative reward:       -8.260122667888295                                                                                                                      │
│ Average rollout reward:          -7.684637059704605                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:48[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 623, 625, 657, 3530, 22000]                                                                                                                                 │
│ Average cumulative reward:       -7.812762505862315                                                                                                                      │
│ Average rollout reward:          -7.297438760367225                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:48[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 623, 625, 657, 3530, 22000]                                                                                                                                 │
│ Average cumulative reward:       -7.812762505862315                                                                                                                      │
│ Average rollout reward:          -7.297438760367225                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:48[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 623, 625, 657, 3530, 22000]                                                                                                                                 │
│ Average cumulative reward:       -7.812762505862315                                                                                                                      │
│ Average rollout reward:          -7.297438760367225                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:45[0m   1.86 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 12309, 12310, 12340, 23000]                                                                                                                                 │
│ Average cumulative reward:       -7.513152967843559                                                                                                                      │
│ Average rollout reward:          -6.952551094451648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:45[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 12309, 12310, 12340, 23000]                                                                                                                                 │
│ Average cumulative reward:       -7.513152967843559                                                                                                                      │
│ Average rollout reward:          -6.952551094451648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:45[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 12309, 12310, 12340, 23000]                                                                                                                                 │
│ Average cumulative reward:       -7.513152967843559                                                                                                                      │
│ Average rollout reward:          -6.952551094451648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:45[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 12309, 12310, 12340, 23000]                                                                                                                                 │
│ Average cumulative reward:       -7.513152967843559                                                                                                                      │
│ Average rollout reward:          -6.952551094451648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:44[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 2015, 2019, 4925, 24000]                                                                                                                                    │
│ Average cumulative reward:       -8.095597527745273                                                                                                                      │
│ Average rollout reward:          -7.533787627191224                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:46[0m Remaining: [36m0:01:43[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24743, 24937, 24938, 25000]                                                                                                                          │
│ Average cumulative reward:       -7.990792105536839                                                                                                                      │
│ Average rollout reward:          -7.435840524893957                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:48[0m Remaining: [36m0:01:40[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 25833, 25837, 25845, 25980, 26000]                                                                                                                          │
│ Average cumulative reward:       -7.7448967622133535                                                                                                                     │
│ Average rollout reward:          -7.180455446367606                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:50[0m Remaining: [36m0:01:39[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8217, 8219, 14886, 15427, 23553, 27000]                                                                                                                     │
│ Average cumulative reward:       -7.982305701775607                                                                                                                      │
│ Average rollout reward:          -7.424770628884885                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:52[0m Remaining: [36m0:01:37[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 9983, 9985, 26595, 28000]                                                                                                                                   │
│ Average cumulative reward:       -8.122764488259929                                                                                                                      │
│ Average rollout reward:          -7.571130412974767                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m36.7%[0m Elapsed: [33m0:00:54[0m Remaining: [36m0:01:35[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28688, 28690, 28983, 28986, 29000]                                                                                                                          │
│ Average cumulative reward:       -7.8294651490914005                                                                                                                     │
│ Average rollout reward:          -7.280684307179026                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m36.7%[0m Elapsed: [33m0:00:54[0m Remaining: [36m0:01:35[0m   1.89 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m38.0%[0m Elapsed: [33m0:00:56[0m Remaining: [36m0:01:34[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10297, 10298, 10308, 10392, 30000]                                                                                                                          │
│ Average cumulative reward:       -8.137180153899369                                                                                                                      │
│ Average rollout reward:          -7.566962024307409                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m38.0%[0m Elapsed: [33m0:00:56[0m Remaining: [36m0:01:34[0m   1.90 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:00:58[0m Remaining: [36m0:01:32[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24206, 24207, 24217, 24252, 31000]                                                                                                                          │
│ Average cumulative reward:       -8.2274491427142                                                                                                                        │
│ Average rollout reward:          -7.644788418987353                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:00:58[0m Remaining: [36m0:01:32[0m   1.90 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:00[0m Remaining: [36m0:01:30[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 623, 625, 3853, 4023, 6049, 32000]                                                                                                                          │
│ Average cumulative reward:       -7.738443855080607                                                                                                                      │
│ Average rollout reward:          -7.176115233075432                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:00[0m Remaining: [36m0:01:30[0m   1.91 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:29[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 3926, 3927, 3961, 3968, 33000]                                                                                                                              │
│ Average cumulative reward:       -8.273331314072909                                                                                                                      │
│ Average rollout reward:          -7.699061076731712                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:01:27[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33635, 33636, 33702, 33708, 33765, 34000]                                                                                                                   │
│ Average cumulative reward:       -7.917964826951686                                                                                                                      │
│ Average rollout reward:          -7.376201016566777                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:01:25[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 25283, 25285, 35000]                                                                                                                                        │
│ Average cumulative reward:       -7.869684697200515                                                                                                                      │
│ Average rollout reward:          -7.2990243961985115                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K36/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m45.6%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:01:23[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35614, 35616, 35987, 35988, 36000]                                                                                                                          │
│ Average cumulative reward:       -7.8171485544315145                                                                                                                     │
│ Average rollout reward:          -7.20390284357508                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:01:10[0m Remaining: [36m0:01:22[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5948, 5949, 5992, 37000]                                                                                                                                    │
│ Average cumulative reward:       -8.451453044254047                                                                                                                      │
│ Average rollout reward:          -7.86296260274502                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:01:22[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5948, 5949, 5992, 37000]                                                                                                                                    │
│ Average cumulative reward:       -8.451453044254047                                                                                                                      │
│ Average rollout reward:          -7.86296260274502                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:01:22[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5948, 5949, 5992, 37000]                                                                                                                                    │
│ Average cumulative reward:       -8.451453044254047                                                                                                                      │
│ Average rollout reward:          -7.86296260274502                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:01:20[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14154, 14156, 34985, 35073, 38000]                                                                                                                          │
│ Average cumulative reward:       -8.289597268198987                                                                                                                      │
│ Average rollout reward:          -7.717576491022443                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:01:19[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 623, 624, 12299, 39000]                                                                                                                                     │
│ Average cumulative reward:       -8.186290234957154                                                                                                                      │
│ Average rollout reward:          -7.59960163477561                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:01:17[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24745, 24750, 24755, 40000]                                                                                                                          │
│ Average cumulative reward:       -8.076756725921982                                                                                                                      │
│ Average rollout reward:          -7.458793034975446                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:01:15[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 463, 466, 470, 38750, 41000]                                                                                                                                │
│ Average cumulative reward:       -8.424698930689793                                                                                                                      │
│ Average rollout reward:          -7.828947193101726                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:01:20[0m Remaining: [36m0:01:13[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 41964, 41967, 42000]                                                                                                                                        │
│ Average cumulative reward:       -8.253022484262319                                                                                                                      │
│ Average rollout reward:          -7.6490873886544986                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:01:11[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42708, 42709, 42989, 42995, 43000]                                                                                                                          │
│ Average cumulative reward:       -7.617748415842711                                                                                                                      │
│ Average rollout reward:          -7.053588337118851                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:01:09[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 7943, 7945, 27106, 27259, 36221, 44000]                                                                                                                     │
│ Average cumulative reward:       -8.01209979537625                                                                                                                       │
│ Average rollout reward:          -7.4505210916793425                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:01:07[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44989, 44993, 45000]                                                                                                                                        │
│ Average cumulative reward:       -7.833542327613441                                                                                                                      │
│ Average rollout reward:          -7.227348535435761                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:01:05[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45765, 45766, 45877, 46000]                                                                                                                                 │
│ Average cumulative reward:       -7.999157509663904                                                                                                                      │
│ Average rollout reward:          -7.418930945344719                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:01:05[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45765, 45766, 45877, 46000]                                                                                                                                 │
│ Average cumulative reward:       -7.999157509663904                                                                                                                      │
│ Average rollout reward:          -7.418930945344719                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:01:05[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45765, 45766, 45877, 46000]                                                                                                                                 │
│ Average cumulative reward:       -7.999157509663904                                                                                                                      │
│ Average rollout reward:          -7.418930945344719                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:01:03[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 39064, 39066, 44125, 44703, 46649, 47000]                                                                                                                   │
│ Average cumulative reward:       -8.530867505131205                                                                                                                      │
│ Average rollout reward:          -7.893380760586517                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:30[0m Remaining: [36m0:01:03[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 39064, 39066, 44125, 44703, 46649, 47000]                                                                                                                   │
│ Average cumulative reward:       -8.530867505131205                                                                                                                      │
│ Average rollout reward:          -7.893380760586517                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:30[0m Remaining: [36m0:01:03[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 39064, 39066, 44125, 44703, 46649, 47000]                                                                                                                   │
│ Average cumulative reward:       -8.530867505131205                                                                                                                      │
│ Average rollout reward:          -7.893380760586517                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:01:03[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 39064, 39066, 44125, 44703, 46649, 47000]                                                                                                                   │
│ Average cumulative reward:       -8.530867505131205                                                                                                                      │
│ Average rollout reward:          -7.893380760586517                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:01:01[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 36288, 36289, 45618, 46365, 46729, 48000]                                                                                                                   │
│ Average cumulative reward:       -8.332863219233488                                                                                                                      │
│ Average rollout reward:          -7.705542071166309                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:01:01[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 36288, 36289, 45618, 46365, 46729, 48000]                                                                                                                   │
│ Average cumulative reward:       -8.332863219233488                                                                                                                      │
│ Average rollout reward:          -7.705542071166309                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:01:01[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 36288, 36289, 45618, 46365, 46729, 48000]                                                                                                                   │
│ Average cumulative reward:       -8.332863219233488                                                                                                                      │
│ Average rollout reward:          -7.705542071166309                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:01:01[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 36288, 36289, 45618, 46365, 46729, 48000]                                                                                                                   │
│ Average cumulative reward:       -8.332863219233488                                                                                                                      │
│ Average rollout reward:          -7.705542071166309                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:00:59[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48950, 48953, 49000]                                                                                                                                        │
│ Average cumulative reward:       -7.667690517438972                                                                                                                      │
│ Average rollout reward:          -7.0971500363955                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:59[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48950, 48953, 49000]                                                                                                                                        │
│ Average cumulative reward:       -7.667690517438972                                                                                                                      │
│ Average rollout reward:          -7.0971500363955                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:59[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48950, 48953, 49000]                                                                                                                                        │
│ Average cumulative reward:       -7.667690517438972                                                                                                                      │
│ Average rollout reward:          -7.0971500363955                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:59[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48950, 48953, 49000]                                                                                                                                        │
│ Average cumulative reward:       -7.667690517438972                                                                                                                      │
│ Average rollout reward:          -7.0971500363955                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:57[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 49767, 49768, 49999, 50000]                                                                                                                                 │
│ Average cumulative reward:       -7.82140286879749                                                                                                                       │
│ Average rollout reward:          -7.230578314527183                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:57[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 49767, 49768, 49999, 50000]                                                                                                                                 │
│ Average cumulative reward:       -7.82140286879749                                                                                                                       │
│ Average rollout reward:          -7.230578314527183                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:57[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 49767, 49768, 49999, 50000]                                                                                                                                 │
│ Average cumulative reward:       -7.82140286879749                                                                                                                       │
│ Average rollout reward:          -7.230578314527183                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:55[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 50593, 50596, 50605, 50945, 51000]                                                                                                                          │
│ Average cumulative reward:       -7.990986515632253                                                                                                                      │
│ Average rollout reward:          -7.491694464300487                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:55[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 50593, 50596, 50605, 50945, 51000]                                                                                                                          │
│ Average cumulative reward:       -7.990986515632253                                                                                                                      │
│ Average rollout reward:          -7.491694464300487                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:55[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 50593, 50596, 50605, 50945, 51000]                                                                                                                          │
│ Average cumulative reward:       -7.990986515632253                                                                                                                      │
│ Average rollout reward:          -7.491694464300487                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:55[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 50593, 50596, 50605, 50945, 51000]                                                                                                                          │
│ Average cumulative reward:       -7.990986515632253                                                                                                                      │
│ Average rollout reward:          -7.491694464300487                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:53[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 291, 295, 27234, 28036, 52000]                                                                                                                              │
│ Average cumulative reward:       -8.1246543657301                                                                                                                        │
│ Average rollout reward:          -7.544311904611981                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K52/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:53[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 96, 97, 48452, 48723, 53000]                                                                                                                                │
│ Average cumulative reward:       -7.954124068374728                                                                                                                      │
│ Average rollout reward:          -7.361798498480986                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K53/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:51[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 53973, 53975, 53993, 54000]                                                                                                                                 │
│ Average cumulative reward:       -8.223907104351609                                                                                                                      │
│ Average rollout reward:          -7.6325334154098625                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:43[0m Remaining: [36m0:00:49[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5079, 5081, 5117, 5120, 38234, 55000]                                                                                                                       │
│ Average cumulative reward:       -8.070216638885462                                                                                                                      │
│ Average rollout reward:          -7.476473990847934                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:47[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5079, 5081, 5117, 5120, 38234, 55000]                                                                                                                       │
│ Average cumulative reward:       -8.070216638885462                                                                                                                      │
│ Average rollout reward:          -7.476473990847934                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:46[0m Remaining: [36m0:00:47[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5079, 5081, 5117, 5120, 38234, 55000]                                                                                                                       │
│ Average cumulative reward:       -8.070216638885462                                                                                                                      │
│ Average rollout reward:          -7.476473990847934                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:46[0m Remaining: [36m0:00:47[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5079, 5081, 5117, 5120, 38234, 55000]                                                                                                                       │
│ Average cumulative reward:       -8.070216638885462                                                                                                                      │
│ Average rollout reward:          -7.476473990847934                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:47[0m Remaining: [36m0:00:45[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55713, 55714, 55984, 55993, 56000]                                                                                                                          │
│ Average cumulative reward:       -7.816043535398948                                                                                                                      │
│ Average rollout reward:          -7.219435593721567                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:47[0m Remaining: [36m0:00:45[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55713, 55714, 55984, 55993, 56000]                                                                                                                          │
│ Average cumulative reward:       -7.816043535398948                                                                                                                      │
│ Average rollout reward:          -7.219435593721567                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:48[0m Remaining: [36m0:00:45[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55713, 55714, 55984, 55993, 56000]                                                                                                                          │
│ Average cumulative reward:       -7.816043535398948                                                                                                                      │
│ Average rollout reward:          -7.219435593721567                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:48[0m Remaining: [36m0:00:45[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55713, 55714, 55984, 55993, 56000]                                                                                                                          │
│ Average cumulative reward:       -7.816043535398948                                                                                                                      │
│ Average rollout reward:          -7.219435593721567                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯7m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:43[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56596, 56598, 56987, 56988, 57000]                                                                                                                          │
│ Average cumulative reward:       -8.27420999399827                                                                                                                       │
│ Average rollout reward:          -7.688185417190987                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:43[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56596, 56598, 56987, 56988, 57000]                                                                                                                          │
│ Average cumulative reward:       -8.27420999399827                                                                                                                       │
│ Average rollout reward:          -7.688185417190987                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:50[0m Remaining: [36m0:00:43[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56596, 56598, 56987, 56988, 57000]                                                                                                                          │
│ Average cumulative reward:       -8.27420999399827                                                                                                                       │
│ Average rollout reward:          -7.688185417190987                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:50[0m Remaining: [36m0:00:43[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56596, 56598, 56987, 56988, 57000]                                                                                                                          │
│ Average cumulative reward:       -8.27420999399827                                                                                                                       │
│ Average rollout reward:          -7.688185417190987                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:51[0m Remaining: [36m0:00:42[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28688, 28690, 28707, 28712, 58000]                                                                                                                          │
│ Average cumulative reward:       -8.32039014264469                                                                                                                       │
│ Average rollout reward:          -7.700289027022413                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:51[0m Remaining: [36m0:00:42[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28688, 28690, 28707, 28712, 58000]                                                                                                                          │
│ Average cumulative reward:       -8.32039014264469                                                                                                                       │
│ Average rollout reward:          -7.700289027022413                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:52[0m Remaining: [36m0:00:42[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28688, 28690, 28707, 28712, 58000]                                                                                                                          │
│ Average cumulative reward:       -8.32039014264469                                                                                                                       │
│ Average rollout reward:          -7.700289027022413                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:52[0m Remaining: [36m0:00:42[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28688, 28690, 28707, 28712, 58000]                                                                                                                          │
│ Average cumulative reward:       -8.32039014264469                                                                                                                       │
│ Average rollout reward:          -7.700289027022413                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:40[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1034, 1038, 27238, 29477, 59000]                                                                                                                            │
│ Average cumulative reward:       -7.839046621536869                                                                                                                      │
│ Average rollout reward:          -7.220457571484347                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:40[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1034, 1038, 27238, 29477, 59000]                                                                                                                            │
│ Average cumulative reward:       -7.839046621536869                                                                                                                      │
│ Average rollout reward:          -7.220457571484347                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:54[0m Remaining: [36m0:00:40[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1034, 1038, 27238, 29477, 59000]                                                                                                                            │
│ Average cumulative reward:       -7.839046621536869                                                                                                                      │
│ Average rollout reward:          -7.220457571484347                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:54[0m Remaining: [36m0:00:40[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1034, 1038, 27238, 29477, 59000]                                                                                                                            │
│ Average cumulative reward:       -7.839046621536869                                                                                                                      │
│ Average rollout reward:          -7.220457571484347                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:01:55[0m Remaining: [36m0:00:38[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14542, 14544, 56539, 57484, 60000]                                                                                                                          │
│ Average cumulative reward:       -7.937370626599359                                                                                                                      │
│ Average rollout reward:          -7.374687845339045                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
61/79 77.2% Elapsed: 0:01:57 Remaining: 0:00:36 1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44221, 44222, 48712, 61000]                                                                                                                                 │
│ Average cumulative reward:       -7.977646754750959                                                                                                                      │
│ Average rollout reward:          -7.395231939523232                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:57[0m Remaining: [36m0:00:36[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44221, 44222, 48712, 61000]                                                                                                                                 │
│ Average cumulative reward:       -7.977646754750959                                                                                                                      │
│ Average rollout reward:          -7.395231939523232                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:58[0m Remaining: [36m0:00:36[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44221, 44222, 48712, 61000]                                                                                                                                 │
│ Average cumulative reward:       -7.977646754750959                                                                                                                      │
│ Average rollout reward:          -7.395231939523232                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:58[0m Remaining: [36m0:00:36[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44221, 44222, 48712, 61000]                                                                                                                                 │
│ Average cumulative reward:       -7.977646754750959                                                                                                                      │
│ Average rollout reward:          -7.395231939523232                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:01:59[0m Remaining: [36m0:00:34[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 49767, 49770, 61114, 62000]                                                                                                                                 │
│ Average cumulative reward:       -8.305352839451851                                                                                                                      │
│ Average rollout reward:          -7.7533618465913134                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
63/79 79.7% Elapsed: 0:02:01 Remaining: 0:00:32 1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55713, 55717, 63000]                                                                                                                                        │
│ Average cumulative reward:       -7.9763214822577435                                                                                                                     │
│ Average rollout reward:          -7.376326674486651                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
64/79 81.0% Elapsed: 0:02:03 Remaining: 0:00:30 1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14154, 14156, 14294, 14301, 14326, 64000]                                                                                                                   │
│ Average cumulative reward:       -8.105902103275337                                                                                                                      │
│ Average rollout reward:          -7.495456484347866                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:02:04[0m Remaining: [36m0:00:30[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14154, 14156, 14294, 14301, 14326, 64000]                                                                                                                   │
│ Average cumulative reward:       -8.105902103275337                                                                                                                      │
│ Average rollout reward:          -7.495456484347866                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:28[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 64917, 64919, 64989, 64991, 65000]                                                                                                                          │
│ Average cumulative reward:       -8.269567902110449                                                                                                                      │
│ Average rollout reward:          -7.653544151705999                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:27[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 65883, 65885, 65900, 65912, 66000]                                                                                                                          │
│ Average cumulative reward:       -8.032320239340185                                                                                                                      │
│ Average rollout reward:          -7.426767798027831                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:25[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 64917, 64921, 65421, 67000]                                                                                                                                 │
│ Average cumulative reward:       -8.121125987823893                                                                                                                      │
│ Average rollout reward:          -7.495519388625432                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:23[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 67844, 67846, 67959, 67961, 68000]                                                                                                                          │
│ Average cumulative reward:       -7.907260686868386                                                                                                                      │
│ Average rollout reward:          -7.29509892404894                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:02:13[0m Remaining: [36m0:00:21[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 68837, 68840, 68964, 68985, 69000]                                                                                                                          │
│ Average cumulative reward:       -7.921528441036802                                                                                                                      │
│ Average rollout reward:          -7.322531977948737                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:02:14[0m Remaining: [36m0:00:21[0m   1.94 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:02:15[0m Remaining: [36m0:00:19[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 69838, 69842, 69992, 69997, 70000]                                                                                                                          │
│ Average cumulative reward:       -8.21380726853216                                                                                                                       │
│ Average rollout reward:          -7.631537872730221                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:02:16[0m Remaining: [36m0:00:19[0m   1.94 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:17[0m Remaining: [36m0:00:17[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 62066, 62068, 62094, 62097, 62524, 71000]                                                                                                                   │
│ Average cumulative reward:       -7.615595278035365                                                                                                                      │
│ Average rollout reward:          -7.093959955035092                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:18[0m Remaining: [36m0:00:17[0m   1.95 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:19[0m Remaining: [36m0:00:14[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 71868, 71872, 72000]                                                                                                                                        │
│ Average cumulative reward:       -7.8791206345736375                                                                                                                     │
│ Average rollout reward:          -7.273142424235734                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
72/79 91.1% Elapsed: 0:02:20 Remaining: 0:00:14   1.95 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:21[0m Remaining: [36m0:00:13[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 72897, 72899, 72987, 72995, 73000]                                                                                                                          │
│ Average cumulative reward:       -7.769208780084467                                                                                                                      │
│ Average rollout reward:          -7.160396633375868                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:22[0m Remaining: [36m0:00:13[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 72897, 72899, 72987, 72995, 73000]                                                                                                                          │
│ Average cumulative reward:       -7.769208780084467                                                                                                                      │
│ Average rollout reward:          -7.160396633375868                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:22[0m Remaining: [36m0:00:13[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 72897, 72899, 72987, 72995, 73000]                                                                                                                          │
│ Average cumulative reward:       -7.769208780084467                                                                                                                      │
│ Average rollout reward:          -7.160396633375868                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:23[0m Remaining: [36m0:00:11[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 73933, 73934, 73996, 73998, 74000]                                                                                                                          │
│ Average cumulative reward:       -7.977179672074907                                                                                                                      │
│ Average rollout reward:          -7.349179203606731                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:23[0m Remaining: [36m0:00:11[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 73933, 73934, 73996, 73998, 74000]                                                                                                                          │
│ Average cumulative reward:       -7.977179672074907                                                                                                                      │
│ Average rollout reward:          -7.349179203606731                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:24[0m Remaining: [36m0:00:11[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 73933, 73934, 73996, 73998, 74000]                                                                                                                          │
│ Average cumulative reward:       -7.977179672074907                                                                                                                      │
│ Average rollout reward:          -7.349179203606731                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:24[0m Remaining: [36m0:00:11[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 73933, 73934, 73996, 73998, 74000]                                                                                                                          │
│ Average cumulative reward:       -7.977179672074907                                                                                                                      │
│ Average rollout reward:          -7.349179203606731                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:25[0m Remaining: [36m0:00:11[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 73933, 73934, 73996, 73998, 74000]                                                                                                                          │
│ Average cumulative reward:       -7.977179672074907                                                                                                                      │
│ Average rollout reward:          -7.349179203606731                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:25[0m Remaining: [36m0:00:09[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 74979, 74982, 74989, 75000]                                                                                                                                 │
│ Average cumulative reward:       -8.028115204282459                                                                                                                      │
│ Average rollout reward:          -7.404153623036226                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:09[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 74979, 74982, 74989, 75000]                                                                                                                                 │
│ Average cumulative reward:       -8.028115204282459                                                                                                                      │
│ Average rollout reward:          -7.404153623036226                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:09[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 74979, 74982, 74989, 75000]                                                                                                                                 │
│ Average cumulative reward:       -8.028115204282459                                                                                                                      │
│ Average rollout reward:          -7.404153623036226                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:27[0m Remaining: [36m0:00:06[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24206, 24207, 72883, 73734, 76000]                                                                                                                          │
│ Average cumulative reward:       -7.897777154644377                                                                                                                      │
│ Average rollout reward:          -7.2733731661226235                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:27[0m Remaining: [36m0:00:06[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24206, 24207, 72883, 73734, 76000]                                                                                                                          │
│ Average cumulative reward:       -7.897777154644377                                                                                                                      │
│ Average rollout reward:          -7.2733731661226235                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:06[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24206, 24207, 72883, 73734, 76000]                                                                                                                          │
│ Average cumulative reward:       -7.897777154644377                                                                                                                      │
│ Average rollout reward:          -7.2733731661226235                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:06[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24206, 24207, 72883, 73734, 76000]                                                                                                                          │
│ Average cumulative reward:       -7.897777154644377                                                                                                                      │
│ Average rollout reward:          -7.2733731661226235                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:29[0m Remaining: [36m0:00:04[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24743, 71396, 71727, 72656, 77000]                                                                                                                   │
│ Average cumulative reward:       -8.014780950576352                                                                                                                      │
│ Average rollout reward:          -7.411437523421101                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:29[0m Remaining: [36m0:00:04[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24743, 71396, 71727, 72656, 77000]                                                                                                                   │
│ Average cumulative reward:       -8.014780950576352                                                                                                                      │
│ Average rollout reward:          -7.411437523421101                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:04[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24743, 71396, 71727, 72656, 77000]                                                                                                                   │
│ Average cumulative reward:       -8.014780950576352                                                                                                                      │
│ Average rollout reward:          -7.411437523421101                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:04[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24741, 24743, 71396, 71727, 72656, 77000]                                                                                                                   │
│ Average cumulative reward:       -8.014780950576352                                                                                                                      │
│ Average rollout reward:          -7.411437523421101                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:31[0m Remaining: [36m0:00:02[0m   1.94 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K79/79 [38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m100.0%[0m Elapsed: [33m0:02:33[0m Remaining: [36m0:00:00[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 25833, 25835, 25843, 25847, 25999, 78000]                                                                                                                   │
│ Average cumulative reward:       -8.140965065553683                                                                                                                      │
│ Average rollout reward:          -7.538608792258602                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.9612507131318613                                                                                                                             │
│ Best path: [0, 2, 8217, 8219, 8225, 8246]                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Node 0 is not terminal. Continue.
Node 2 is not terminal. Continue.
Node 5 is not terminal. Continue.
Node 32 is not terminal. Continue.
Node 42 is not terminal. Continue.
Node 840 is not terminal. Continue.
Node 891 is not terminal. Continue.
Node 67548 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 1 is not terminal. Continue.
Node 63068 is not terminal. Continue.
Node 64858 is not terminal. Continue.
Node 70559 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 2 is not terminal. Continue.
Node 165 is not terminal. Continue.
Node 167 is not terminal. Continue.
Node 41823 is not terminal. Continue.
Node 42631 is not terminal. Continue.
Node 43346 is not terminal. Continue.
Node 46765 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -3.350421555852746
sign_newton [25.746721]
sign_halley [38.374146 32.463005  4.902586]
By Value: estimated reward: -4.5025375844879605
By Best Value: estimated reward: 0
sign_newton [22.055187]
sign_newton [0.24484505]
sign_newton [0.5906156 0.        0.        0.       ]
sign_newton [0.9358136145006742]
sign_newton [0.9989011960198997]
sign_newton [0.9999996978255596]
Best value of root node:
-1.9612507131318613
Best root policy:
sign_newton [22.055187]
sign_newton [0.24484505]
sign_newton [0.5906156 0.        0.        0.       ]
sign_newton [0.9358136145006742]
sign_newton [0.9989011960198997]
sign_newton [0.9999996978255596]
=== END ===
Finished making algorithm
