Matrix distribution: unif
Matrix distribution config: {'c': 0.25, 'd': 5000, 'eps': 0.001}
Initial matrix shape: torch.Size([5000, 5000])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-06, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 80000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cpu', 'actions': [['sign_ns', [[0, 0], [5, 5]]], ['sign_newton', [[0], [40]]], ['sign_quintic', [[0, 0, 0], [5, 5, 5]]], ['sign_halley', [[0, 0, 0], [40, 40, 40]]]], 'initialize_with_baselines': True}
Actions: ['sign_halley', 'sign_newton', 'sign_ns', 'sign_quintic']
Action sign_halley took 1.0 times longer than sign_halley
Action sign_newton took 0.17258612791309708 times longer than sign_halley
Action sign_ns took 0.12358273298726453 times longer than sign_halley
Action sign_quintic took 0.15842903405932404 times longer than sign_halley
Skipping sign_newton_variant because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_db because not all actions are in the tree
Skipping sqrt_nsv because not all actions are in the tree
Skipping sqrt_visser because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_visser_coupled because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
Skipping proot_newton because not all actions are in the tree
Skipping proot_visser because not all actions are in the tree
Skipping proot_iannazzo because not all actions are in the tree
0/79 0.0% Elapsed: 0:00:00 Remaining: -:--:-- 502459.84 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-2.35941006 -2.35941006]                                                                                                                                                │
│ [-1.40841346 -1.40841346]                                                                                                                                                │
│ [-1.35941006 -1.35941006]                                                                                                                                                │
│ [-1.33168563 -1.33168563 -1.33168563]                                                                                                                                    │
│ [-1.03766526 -1.03766526 -1.03766526]                                                                                                                                    │
│ [-0.86293064 -0.86293064 -0.86293064]                                                                                                                                    │
│ [-0.83950319 -0.83950319 -0.83950319 -0.66691706]                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m 1006044.64 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-2.35941006 -2.35941006]                                                                                                                                                │
│ [-1.40841346 -1.40841346]                                                                                                                                                │
│ [-1.35941006 -1.35941006]                                                                                                                                                │
│ [-1.33168563 -1.33168563 -1.33168563]                                                                                                                                    │
│ [-1.03766526 -1.03766526 -1.03766526]                                                                                                                                    │
│ [-0.86293064 -0.86293064 -0.86293064]                                                                                                                                    │
│ [-0.83950319 -0.83950319 -0.83950319 -0.66691706]                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m 1508926.73 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-2.35941006 -2.35941006]                                                                                                                                                │
│ [-1.40841346 -1.40841346]                                                                                                                                                │
│ [-1.35941006 -1.35941006]                                                                                                                                                │
│ [-1.33168563 -1.33168563 -1.33168563]                                                                                                                                    │
│ [-1.03766526 -1.03766526 -1.03766526]                                                                                                                                    │
│ [-0.86293064 -0.86293064 -0.86293064]                                                                                                                                    │
│ [-0.83950319 -0.83950319 -0.83950319 -0.66691706]                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 987, 988, 1000]                                                                                                                                             │
│ Average cumulative reward:       -5.857667609641076                                                                                                                      │
│ Average rollout reward:          -5.70769200106994                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 987, 988, 1000]                                                                                                                                             │
│ Average cumulative reward:       -5.857667609641076                                                                                                                      │
│ Average rollout reward:          -5.70769200106994                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:03[0m Remaining: [36m-:--:--[0m   3.02 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 987, 988, 1000]                                                                                                                                             │
│ Average cumulative reward:       -5.857667609641076                                                                                                                      │
│ Average rollout reward:          -5.70769200106994                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:03[0m Remaining: [36m-:--:--[0m   3.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 987, 988, 1000]                                                                                                                                             │
│ Average cumulative reward:       -5.857667609641076                                                                                                                      │
│ Average rollout reward:          -5.70769200106994                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:22[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1069, 1072, 1084, 2000]                                                                                                                                     │
│ Average cumulative reward:       -5.627935217522513                                                                                                                      │
│ Average rollout reward:          -5.436072331754758                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:22[0m   2.26 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1069, 1072, 1084, 2000]                                                                                                                                     │
│ Average cumulative reward:       -5.627935217522513                                                                                                                      │
│ Average rollout reward:          -5.436072331754758                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:22[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1069, 1072, 1084, 2000]                                                                                                                                     │
│ Average cumulative reward:       -5.627935217522513                                                                                                                      │
│ Average rollout reward:          -5.436072331754758                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:18[0m   1.85 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1531, 1534, 1960, 2064, 3000]                                                                                                                               │
│ Average cumulative reward:       -5.345071635621605                                                                                                                      │
│ Average rollout reward:          -5.134646175868638                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:02:18[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1531, 1534, 1960, 2064, 3000]                                                                                                                               │
│ Average cumulative reward:       -5.345071635621605                                                                                                                      │
│ Average rollout reward:          -5.134646175868638                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:02:18[0m   2.18 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1531, 1534, 1960, 2064, 3000]                                                                                                                               │
│ Average cumulative reward:       -5.345071635621605                                                                                                                      │
│ Average rollout reward:          -5.134646175868638                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:18[0m   2.35 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 1531, 1534, 1960, 2064, 3000]                                                                                                                               │
│ Average cumulative reward:       -5.345071635621605                                                                                                                      │
│ Average rollout reward:          -5.134646175868638                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:16[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 458, 459, 2441, 2762, 4000]                                                                                                                                 │
│ Average cumulative reward:       -5.339279444140334                                                                                                                      │
│ Average rollout reward:          -5.112801326000092                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:16[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 458, 459, 2441, 2762, 4000]                                                                                                                                 │
│ Average cumulative reward:       -5.339279444140334                                                                                                                      │
│ Average rollout reward:          -5.112801326000092                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:16[0m   2.14 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:15[0m   1.81 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 8, 5000]                                                                                                                                                    │
│ Average cumulative reward:       -5.928229987511146                                                                                                                      │
│ Average rollout reward:          -5.725157185488408                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:02:14[0m   1.85 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 69, 77, 391, 552, 6000]                                                                                                                                     │
│ Average cumulative reward:       -5.3234800468829775                                                                                                                     │
│ Average rollout reward:          -5.182239109744667                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:02:14[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 3536, 3538, 3598, 3603, 7000]                                                                                                                               │
│ Average cumulative reward:       -5.384015074644896                                                                                                                      │
│ Average rollout reward:          -5.171637661154261                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:02:13[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 26, 27, 7515, 8000]                                                                                                                                         │
│ Average cumulative reward:       -5.1053813527644945                                                                                                                     │
│ Average rollout reward:          -4.89495568705851                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:02:12[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 8980, 8982, 9000]                                                                                                                                           │
│ Average cumulative reward:       -5.5297853499259135                                                                                                                     │
│ Average rollout reward:          -5.3045065347626785                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:02:12[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 8980, 8982, 9000]                                                                                                                                           │
│ Average cumulative reward:       -5.5297853499259135                                                                                                                     │
│ Average rollout reward:          -5.3045065347626785                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
10/79 12.7% Elapsed: 0:00:18 Remaining: 0:02:09 1.86 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 9889, 9891, 9930, 10000]                                                                                                                                    │
│ Average cumulative reward:       -5.801485218381521                                                                                                                      │
│ Average rollout reward:          -5.557138961797421                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:20[0m Remaining: [36m0:02:07[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10849, 10850, 10990, 11000]                                                                                                                                 │
│ Average cumulative reward:       -5.558017173417674                                                                                                                      │
│ Average rollout reward:          -5.326898905519474                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:02:06[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 2627, 2629, 11635, 12000]                                                                                                                                   │
│ Average cumulative reward:       -5.236076041772616                                                                                                                      │
│ Average rollout reward:          -5.000203487434003                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:02:04[0m   1.86 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8400, 8401, 8410, 8461, 13000]                                                                                                                              │
│ Average cumulative reward:       -5.598129488778994                                                                                                                      │
│ Average rollout reward:          -5.357450098801959                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:26[0m Remaining: [36m0:02:02[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6593, 6681, 6699, 14000]                                                                                                                                    │
│ Average cumulative reward:       -5.193482721351892                                                                                                                      │
│ Average rollout reward:          -4.949781220521648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:02:02[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6593, 6681, 6699, 14000]                                                                                                                                    │
│ Average cumulative reward:       -5.193482721351892                                                                                                                      │
│ Average rollout reward:          -4.949781220521648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:02:02[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6593, 6681, 6699, 14000]                                                                                                                                    │
│ Average cumulative reward:       -5.193482721351892                                                                                                                      │
│ Average rollout reward:          -4.949781220521648                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:02:00[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1431, 1432, 14784, 15000]                                                                                                                                   │
│ Average cumulative reward:       -5.58433414783326                                                                                                                       │
│ Average rollout reward:          -5.353870626966043                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:59[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 12564, 12566, 16000]                                                                                                                                        │
│ Average cumulative reward:       -5.775700844602941                                                                                                                      │
│ Average rollout reward:          -5.533962510406624                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:57[0m   1.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16853, 16858, 16866, 16934, 17000]                                                                                                                          │
│ Average cumulative reward:       -5.7993440142973                                                                                                                        │
│ Average rollout reward:          -5.5459734740344                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:56[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 2358, 2360, 2394, 2400, 18000]                                                                                                                              │
│ Average cumulative reward:       -5.627277243489264                                                                                                                      │
│ Average rollout reward:          -5.371058862479017                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:54[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 763, 767, 776, 795, 19000]                                                                                                                                  │
│ Average cumulative reward:       -5.420936812292367                                                                                                                      │
│ Average rollout reward:          -5.181254678144603                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:54[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 763, 767, 776, 795, 19000]                                                                                                                                  │
│ Average cumulative reward:       -5.420936812292367                                                                                                                      │
│ Average rollout reward:          -5.181254678144603                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:54[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 763, 767, 776, 795, 19000]                                                                                                                                  │
│ Average cumulative reward:       -5.420936812292367                                                                                                                      │
│ Average rollout reward:          -5.181254678144603                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:54[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 763, 767, 776, 795, 19000]                                                                                                                                  │
│ Average cumulative reward:       -5.420936812292367                                                                                                                      │
│ Average rollout reward:          -5.181254678144603                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:52[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19970, 19972, 19985, 20000]                                                                                                                                 │
│ Average cumulative reward:       -5.350847341200025                                                                                                                      │
│ Average rollout reward:          -5.100079079483265                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:52[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19970, 19972, 19985, 20000]                                                                                                                                 │
│ Average cumulative reward:       -5.350847341200025                                                                                                                      │
│ Average rollout reward:          -5.100079079483265                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:52[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19970, 19972, 19985, 20000]                                                                                                                                 │
│ Average cumulative reward:       -5.350847341200025                                                                                                                      │
│ Average rollout reward:          -5.100079079483265                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:52[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19970, 19972, 19985, 20000]                                                                                                                                 │
│ Average cumulative reward:       -5.350847341200025                                                                                                                      │
│ Average rollout reward:          -5.100079079483265                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:51[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 909, 911, 20226, 21000]                                                                                                                                     │
│ Average cumulative reward:       -5.668982086880625                                                                                                                      │
│ Average rollout reward:          -5.411941774413583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:51[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 909, 911, 20226, 21000]                                                                                                                                     │
│ Average cumulative reward:       -5.668982086880625                                                                                                                      │
│ Average rollout reward:          -5.411941774413583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:51[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 909, 911, 20226, 21000]                                                                                                                                     │
│ Average cumulative reward:       -5.668982086880625                                                                                                                      │
│ Average rollout reward:          -5.411941774413583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:51[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 909, 911, 20226, 21000]                                                                                                                                     │
│ Average cumulative reward:       -5.668982086880625                                                                                                                      │
│ Average rollout reward:          -5.411941774413583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:49[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20499, 20672, 20726, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.714855287025081                                                                                                                      │
│ Average rollout reward:          -5.4651353766449455                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:49[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20499, 20672, 20726, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.714855287025081                                                                                                                      │
│ Average rollout reward:          -5.4651353766449455                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:49[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20499, 20672, 20726, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.714855287025081                                                                                                                      │
│ Average rollout reward:          -5.4651353766449455                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:47[0m   1.88 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22375, 22872, 22918, 23000]                                                                                                                                 │
│ Average cumulative reward:       -5.530218033425865                                                                                                                      │
│ Average rollout reward:          -5.2837894051706975                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:47[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22375, 22872, 22918, 23000]                                                                                                                                 │
│ Average cumulative reward:       -5.530218033425865                                                                                                                      │
│ Average rollout reward:          -5.2837894051706975                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:47[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22375, 22872, 22918, 23000]                                                                                                                                 │
│ Average cumulative reward:       -5.530218033425865                                                                                                                      │
│ Average rollout reward:          -5.2837894051706975                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:47[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22375, 22872, 22918, 23000]                                                                                                                                 │
│ Average cumulative reward:       -5.530218033425865                                                                                                                      │
│ Average rollout reward:          -5.2837894051706975                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:45[0m Remaining: [36m0:01:45[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 23916, 23917, 23991, 24000]                                                                                                                                 │
│ Average cumulative reward:       -5.691998736063208                                                                                                                      │
│ Average rollout reward:          -5.450710026386608                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:47[0m Remaining: [36m0:01:43[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24409, 24939, 24963, 25000]                                                                                                                                 │
│ Average cumulative reward:       -5.394839114844117                                                                                                                      │
│ Average rollout reward:          -5.149772434869526                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:49[0m Remaining: [36m0:01:42[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 6060, 6062, 6171, 6490, 26000]                                                                                                                              │
│ Average cumulative reward:       -5.4900899786409205                                                                                                                     │
│ Average rollout reward:          -5.246234345759299                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:51[0m Remaining: [36m0:01:40[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 24439, 24443, 24474, 24475, 27000]                                                                                                                          │
│ Average cumulative reward:       -6.010312680319633                                                                                                                      │
│ Average rollout reward:          -5.738005110643113                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:53[0m Remaining: [36m0:01:38[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 27718, 27719, 27744, 27747, 28000]                                                                                                                          │
│ Average cumulative reward:       -5.395972024657048                                                                                                                      │
│ Average rollout reward:          -5.135911699375345                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
29/79 ━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━ 36.7% Elapsed: 0:00:54 Remaining: 0:01:37   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28865, 28866, 29000]                                                                                                                                        │
│ Average cumulative reward:       -5.706552023696349                                                                                                                      │
│ Average rollout reward:          -5.437654289123606                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
30/79 ━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━ 38.0% Elapsed: 0:00:56 Remaining: 0:01:35   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 406, 408, 21038, 21858, 30000]                                                                                                                              │
│ Average cumulative reward:       -5.6550002655136264                                                                                                                     │
│ Average rollout reward:          -5.387674552760979                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
31/79 ━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━ 39.2% Elapsed: 0:00:58 Remaining: 0:01:33   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14424, 14427, 14474, 20942, 31000]                                                                                                                          │
│ Average cumulative reward:       -5.6886308206050415                                                                                                                     │
│ Average rollout reward:          -5.40992509107983                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
32/79 ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 40.5% Elapsed: 0:01:00 Remaining: 0:01:31   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31857, 31985, 31992, 32000]                                                                                                                          │
│ Average cumulative reward:       -5.710703526489479                                                                                                                      │
│ Average rollout reward:          -5.447473702295712                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:01:31[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31857, 31985, 31992, 32000]                                                                                                                          │
│ Average cumulative reward:       -5.710703526489479                                                                                                                      │
│ Average rollout reward:          -5.447473702295712                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:01:31[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31857, 31985, 31992, 32000]                                                                                                                          │
│ Average cumulative reward:       -5.710703526489479                                                                                                                      │
│ Average rollout reward:          -5.447473702295712                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:29[0m   1.89 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 31834, 32908, 32954, 33000]                                                                                                                                 │
│ Average cumulative reward:       -5.922325192441835                                                                                                                      │
│ Average rollout reward:          -5.646229402629814                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:29[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 31834, 32908, 32954, 33000]                                                                                                                                 │
│ Average cumulative reward:       -5.922325192441835                                                                                                                      │
│ Average rollout reward:          -5.646229402629814                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:29[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 31834, 32908, 32954, 33000]                                                                                                                                 │
│ Average cumulative reward:       -5.922325192441835                                                                                                                      │
│ Average rollout reward:          -5.646229402629814                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:29[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 31834, 32908, 32954, 33000]                                                                                                                                 │
│ Average cumulative reward:       -5.922325192441835                                                                                                                      │
│ Average rollout reward:          -5.646229402629814                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:01:27[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33734, 33736, 33746, 33761, 34000]                                                                                                                          │
│ Average cumulative reward:       -5.360263364623164                                                                                                                      │
│ Average rollout reward:          -5.096296260300992                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:01:27[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33734, 33736, 33746, 33761, 34000]                                                                                                                          │
│ Average cumulative reward:       -5.360263364623164                                                                                                                      │
│ Average rollout reward:          -5.096296260300992                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:01:27[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33734, 33736, 33746, 33761, 34000]                                                                                                                          │
│ Average cumulative reward:       -5.360263364623164                                                                                                                      │
│ Average rollout reward:          -5.096296260300992                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:01:27[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33734, 33736, 33746, 33761, 34000]                                                                                                                          │
│ Average cumulative reward:       -5.360263364623164                                                                                                                      │
│ Average rollout reward:          -5.096296260300992                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:01:26[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11181, 11183, 11212, 11216, 11312, 35000]                                                                                                                   │
│ Average cumulative reward:       -6.199150333814852                                                                                                                      │
│ Average rollout reward:          -5.939468605444053                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:01:26[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11181, 11183, 11212, 11216, 11312, 35000]                                                                                                                   │
│ Average cumulative reward:       -6.199150333814852                                                                                                                      │
│ Average rollout reward:          -5.939468605444053                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:01:26[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11181, 11183, 11212, 11216, 11312, 35000]                                                                                                                   │
│ Average cumulative reward:       -6.199150333814852                                                                                                                      │
│ Average rollout reward:          -5.939468605444053                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:01:26[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 11181, 11183, 11212, 11216, 11312, 35000]                                                                                                                   │
│ Average cumulative reward:       -6.199150333814852                                                                                                                      │
│ Average rollout reward:          -5.939468605444053                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K36/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m45.6%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:01:24[0m   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35677, 35681, 35924, 35933, 36000]                                                                                                                          │
│ Average cumulative reward:       -5.539508011952983                                                                                                                      │
│ Average rollout reward:          -5.308483411749263                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:01:10[0m Remaining: [36m0:01:22[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8980, 8982, 9039, 9047, 13206, 37000]                                                                                                                       │
│ Average cumulative reward:       -5.8593716606773185                                                                                                                     │
│ Average rollout reward:          -5.592499582760881                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:01:22[0m   1.92 s/iter
38/79 ━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━ 48.1% Elapsed: 0:01:12 Remaining: 0:01:20   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37685, 37688, 37925, 37934, 38000]                                                                                                                          │
│ Average cumulative reward:       -5.360629889793931                                                                                                                      │
│ Average rollout reward:          -5.1040996240908045                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:01:20[0m   1.92 s/iter
39/79 ━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━ 49.4% Elapsed: 0:01:14 Remaining: 0:01:18   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21890, 21893, 21984, 24886, 39000]                                                                                                                          │
│ Average cumulative reward:       -5.978215667150135                                                                                                                      │
│ Average rollout reward:          -5.712448041497584                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:15[0m Remaining: [36m0:01:18[0m   1.92 s/iter
40/79 ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50.6% Elapsed: 0:01:16 Remaining: 0:01:16   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 37353, 37358, 37365, 40000]                                                                                                                   │
│ Average cumulative reward:       -5.5317475773521325                                                                                                                     │
│ Average rollout reward:          -5.256065007020602                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:01:16[0m   1.91 s/iter
41/79 ━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━ 51.9% Elapsed: 0:01:18 Remaining: 0:01:13   1.90 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1743, 1747, 26593, 41000]                                                                                                                                   │
│ Average cumulative reward:       -5.738317548417355                                                                                                                      │
│ Average rollout reward:          -5.456978233921888                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:01:13[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1743, 1747, 26593, 41000]                                                                                                                                   │
│ Average cumulative reward:       -5.738317548417355                                                                                                                      │
│ Average rollout reward:          -5.456978233921888                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:19[0m Remaining: [36m0:01:13[0m   1.93 s/iter
41/79 ━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━ 51.9% Elapsed: 0:01:19 Remaining: 0:01:13   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1743, 1747, 26593, 41000]                                                                                                                                   │
│ Average cumulative reward:       -5.738317548417355                                                                                                                      │
│ Average rollout reward:          -5.456978233921888                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
42/79 ━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━ 53.2% Elapsed: 0:01:21 Remaining: 0:01:12   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 41902, 41905, 41997, 42000]                                                                                                                                 │
│ Average cumulative reward:       -5.984837058631663                                                                                                                      │
│ Average rollout reward:          -5.707209317155493                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
43/79 ━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━ 54.4% Elapsed: 0:01:23 Remaining: 0:01:10   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42631, 42634, 42646, 42949, 43000]                                                                                                                          │
│ Average cumulative reward:       -5.5341848557445354                                                                                                                     │
│ Average rollout reward:          -5.255163826889313                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
44/79 ━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━ 55.7% Elapsed: 0:01:25 Remaining: 0:01:08   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 14424, 14426, 14433, 14441, 14541, 44000]                                                                                                                   │
│ Average cumulative reward:       -5.8690476299885885                                                                                                                     │
│ Average rollout reward:          -5.591684122595473                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
45/79 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 57.0% Elapsed: 0:01:27 Remaining: 0:01:06   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44863, 44865, 44915, 44925, 45000]                                                                                                                          │
│ Average cumulative reward:       -5.593065165834268                                                                                                                      │
│ Average rollout reward:          -5.338453950106066                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
46/79 ━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━ 58.2% Elapsed: 0:01:28 Remaining: 0:01:04   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45622, 45626, 45907, 45932, 46000]                                                                                                                          │
│ Average cumulative reward:       -5.6624440620187935                                                                                                                     │
│ Average rollout reward:          -5.390474754610667                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:01:04[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45622, 45626, 45907, 45932, 46000]                                                                                                                          │
│ Average cumulative reward:       -5.6624440620187935                                                                                                                     │
│ Average rollout reward:          -5.390474754610667                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:01:04[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45622, 45626, 45907, 45932, 46000]                                                                                                                          │
│ Average cumulative reward:       -5.6624440620187935                                                                                                                     │
│ Average rollout reward:          -5.390474754610667                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:30[0m Remaining: [36m0:01:03[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 33734, 33736, 34024, 34031, 47000]                                                                                                                          │
│ Average cumulative reward:       -5.6793751429600015                                                                                                                     │
│ Average rollout reward:          -5.4455406150321775                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
48/79 ━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━ 60.8% Elapsed: 0:01:32 Remaining: 0:01:01   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 47945, 47948, 48000]                                                                                                                                        │
│ Average cumulative reward:       -5.519676143119163                                                                                                                      │
│ Average rollout reward:          -5.237080186846776                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
49/79 ━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━ 62.0% Elapsed: 0:01:34 Remaining: 0:00:59   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48735, 48737, 48988, 49000]                                                                                                                                 │
│ Average cumulative reward:       -5.987244901555938                                                                                                                      │
│ Average rollout reward:          -5.71585866151167                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
50/79 ━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━ 63.3% Elapsed: 0:01:35 Remaining: 0:00:57   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 569, 570, 38968, 39911, 50000]                                                                                                                              │
│ Average cumulative reward:       -5.457693094508123                                                                                                                      │
│ Average rollout reward:          -5.186047680395383                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:57[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 569, 570, 38968, 39911, 50000]                                                                                                                              │
│ Average cumulative reward:       -5.457693094508123                                                                                                                      │
│ Average rollout reward:          -5.186047680395383                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:57[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 569, 570, 38968, 39911, 50000]                                                                                                                              │
│ Average cumulative reward:       -5.457693094508123                                                                                                                      │
│ Average rollout reward:          -5.186047680395383                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:55[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 34376, 34379, 34430, 34537, 51000]                                                                                                                          │
│ Average cumulative reward:       -5.612723934508999                                                                                                                      │
│ Average rollout reward:          -5.336221873804365                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K52/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:53[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 51973, 51977, 51986, 52000]                                                                                                                                 │
│ Average cumulative reward:       -5.525088963631631                                                                                                                      │
│ Average rollout reward:          -5.270766382991737                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K53/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:51[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 52801, 52803, 52849, 52863, 53000]                                                                                                                          │
│ Average cumulative reward:       -5.6816895901880455                                                                                                                     │
│ Average rollout reward:          -5.4055659085890175                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:43[0m Remaining: [36m0:00:49[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 53639, 53641, 53801, 53807, 54000]                                                                                                                          │
│ Average cumulative reward:       -5.637721986310178                                                                                                                      │
│ Average rollout reward:          -5.348183063763418                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:47[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 170, 1218, 1549, 55000]                                                                                                                                     │
│ Average cumulative reward:       -5.049431345978916                                                                                                                      │
│ Average rollout reward:          -4.769149503603913                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:47[0m Remaining: [36m0:00:46[0m   1.92 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 26, 48, 3817, 43718, 56000]                                                                                                                                 │
│ Average cumulative reward:       -5.842435983670165                                                                                                                      │
│ Average rollout reward:          -5.563190334455212                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:44[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 55336, 55339, 55587, 55610, 57000]                                                                                                                          │
│ Average cumulative reward:       -5.5169488459068505                                                                                                                     │
│ Average rollout reward:          -5.250999834208016                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:51[0m Remaining: [36m0:00:42[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 3062, 3064, 54986, 55104, 58000]                                                                                                                            │
│ Average cumulative reward:       -5.617127342400399                                                                                                                      │
│ Average rollout reward:          -5.36446893464965                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:40[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 58828, 58831, 58834, 58853, 59000]                                                                                                                          │
│ Average cumulative reward:       -5.828056043713714                                                                                                                      │
│ Average rollout reward:          -5.551880539762012                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:55[0m Remaining: [36m0:00:40[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 58828, 58831, 58834, 58853, 59000]                                                                                                                          │
│ Average cumulative reward:       -5.828056043713714                                                                                                                      │
│ Average rollout reward:          -5.551880539762012                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:01:55[0m Remaining: [36m0:00:39[0m   1.93 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 59721, 59722, 59966, 60000]                                                                                                                                 │
│ Average cumulative reward:       -5.787322028096121                                                                                                                      │
│ Average rollout reward:          -5.529176890059681                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
60/79  75.9%  Elapsed: 0:01:56  Remaining: 0:00:39  1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 60621, 60623, 60662, 61000]                                                                                                                                 │
│ Average cumulative reward:       -5.431258571509101                                                                                                                      │
│ Average rollout reward:          -5.160199049644107                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
61/79  77.2%  Elapsed: 0:01:58  Remaining: 0:00:36  1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 61532, 61534, 61980, 61982, 62000]                                                                                                                          │
│ Average cumulative reward:       -5.323852010354471                                                                                                                      │
│ Average rollout reward:          -5.054037499537956                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
62/79  78.5%  Elapsed: 0:02:00  Remaining: 0:00:34  1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35677, 35682, 35688, 39858, 63000]                                                                                                                          │
│ Average cumulative reward:       -5.932174606942459                                                                                                                      │
│ Average rollout reward:          -5.658656997758359                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:02:02[0m Remaining: [36m0:00:33[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35677, 35682, 35688, 39858, 63000]                                                                                                                          │
│ Average cumulative reward:       -5.932174606942459                                                                                                                      │
│ Average rollout reward:          -5.658656997758359                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:02:02[0m Remaining: [36m0:00:33[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35677, 35682, 35688, 39858, 63000]                                                                                                                          │
│ Average cumulative reward:       -5.932174606942459                                                                                                                      │
│ Average rollout reward:          -5.658656997758359                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:02:03[0m Remaining: [36m0:00:33[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 35677, 35682, 35688, 39858, 63000]                                                                                                                          │
│ Average cumulative reward:       -5.932174606942459                                                                                                                      │
│ Average rollout reward:          -5.658656997758359                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:02:03[0m Remaining: [36m0:00:31[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31858, 31929, 31934, 31965, 64000]                                                                                                                   │
│ Average cumulative reward:       -5.543150640441255                                                                                                                      │
│ Average rollout reward:          -5.25273707581332                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:02:04[0m Remaining: [36m0:00:31[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31858, 31929, 31934, 31965, 64000]                                                                                                                   │
│ Average cumulative reward:       -5.543150640441255                                                                                                                      │
│ Average rollout reward:          -5.25273707581332                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:02:04[0m Remaining: [36m0:00:31[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31858, 31929, 31934, 31965, 64000]                                                                                                                   │
│ Average cumulative reward:       -5.543150640441255                                                                                                                      │
│ Average rollout reward:          -5.25273707581332                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:31[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31856, 31858, 31929, 31934, 31965, 64000]                                                                                                                   │
│ Average cumulative reward:       -5.543150640441255                                                                                                                      │
│ Average rollout reward:          -5.25273707581332                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:29[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 38, 40, 60557, 65000]                                                                                                                                       │
│ Average cumulative reward:       -5.7448249522926735                                                                                                                     │
│ Average rollout reward:          -5.446100032204334                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:02:06[0m Remaining: [36m0:00:29[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 38, 40, 60557, 65000]                                                                                                                                       │
│ Average cumulative reward:       -5.7448249522926735                                                                                                                     │
│ Average rollout reward:          -5.446100032204334                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:02:06[0m Remaining: [36m0:00:29[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 38, 40, 60557, 65000]                                                                                                                                       │
│ Average cumulative reward:       -5.7448249522926735                                                                                                                     │
│ Average rollout reward:          -5.446100032204334                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:29[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 38, 40, 60557, 65000]                                                                                                                                       │
│ Average cumulative reward:       -5.7448249522926735                                                                                                                     │
│ Average rollout reward:          -5.446100032204334                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:27[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1335, 1337, 62420, 63264, 66000]                                                                                                                            │
│ Average cumulative reward:       -5.429794556564665                                                                                                                      │
│ Average rollout reward:          -5.1371791167100795                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:08[0m Remaining: [36m0:00:27[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1335, 1337, 62420, 63264, 66000]                                                                                                                            │
│ Average cumulative reward:       -5.429794556564665                                                                                                                      │
│ Average rollout reward:          -5.1371791167100795                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:08[0m Remaining: [36m0:00:27[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1335, 1337, 62420, 63264, 66000]                                                                                                                            │
│ Average cumulative reward:       -5.429794556564665                                                                                                                      │
│ Average rollout reward:          -5.1371791167100795                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:27[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1335, 1337, 62420, 63264, 66000]                                                                                                                            │
│ Average cumulative reward:       -5.429794556564665                                                                                                                      │
│ Average rollout reward:          -5.1371791167100795                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:27[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1335, 1337, 62420, 63264, 66000]                                                                                                                            │
│ Average cumulative reward:       -5.429794556564665                                                                                                                      │
│ Average rollout reward:          -5.1371791167100795                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:10[0m Remaining: [36m0:00:25[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 5606, 5607, 5614, 67000]                                                                                                                                    │
│ Average cumulative reward:       -5.349274427244602                                                                                                                      │
│ Average rollout reward:          -5.054644967388292                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:10[0m Remaining: [36m0:00:25[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 5606, 5607, 5614, 67000]                                                                                                                                    │
│ Average cumulative reward:       -5.349274427244602                                                                                                                      │
│ Average rollout reward:          -5.054644967388292                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:25[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 5606, 5607, 5614, 67000]                                                                                                                                    │
│ Average cumulative reward:       -5.349274427244602                                                                                                                      │
│ Average rollout reward:          -5.054644967388292                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:23[0m   1.94 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 169, 172, 67750, 68000]                                                                                                                                     │
│ Average cumulative reward:       -5.840887726770182                                                                                                                      │
│ Average rollout reward:          -5.550834751711565                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:12[0m Remaining: [36m0:00:23[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 169, 172, 67750, 68000]                                                                                                                                     │
│ Average cumulative reward:       -5.840887726770182                                                                                                                      │
│ Average rollout reward:          -5.550834751711565                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:13[0m Remaining: [36m0:00:23[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 169, 172, 67750, 68000]                                                                                                                                     │
│ Average cumulative reward:       -5.840887726770182                                                                                                                      │
│ Average rollout reward:          -5.550834751711565                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:13[0m Remaining: [36m0:00:23[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 169, 172, 67750, 68000]                                                                                                                                     │
│ Average cumulative reward:       -5.840887726770182                                                                                                                      │
│ Average rollout reward:          -5.550834751711565                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:14[0m Remaining: [36m0:00:23[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 169, 172, 67750, 68000]                                                                                                                                     │
│ Average cumulative reward:       -5.840887726770182                                                                                                                      │
│ Average rollout reward:          -5.550834751711565                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:02:14[0m Remaining: [36m0:00:21[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 69, 72, 1519, 62163, 69000]                                                                                                                                 │
│ Average cumulative reward:       -5.930117688296635                                                                                                                      │
│ Average rollout reward:          -5.658727123163878                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
69/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━ 87.3% Elapsed: 0:02:15 Remaining: 0:00:21   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6022, 6251, 6262, 70000]                                                                                                                                    │
│ Average cumulative reward:       -5.962779897183183                                                                                                                      │
│ Average rollout reward:          -5.676923383796828                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
70/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━ 88.6% Elapsed: 0:02:17 Remaining: 0:00:19   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1243, 1246, 65932, 71000]                                                                                                                                   │
│ Average cumulative reward:       -5.6647134728132205                                                                                                                     │
│ Average rollout reward:          -5.382537777174775                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
71/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━ 89.9% Elapsed: 0:02:19 Remaining: 0:00:17   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42631, 42632, 65050, 72000]                                                                                                                                 │
│ Average cumulative reward:       -5.6656936433242535                                                                                                                     │
│ Average rollout reward:          -5.369289238080726                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:21[0m Remaining: [36m0:00:15[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42631, 42632, 65050, 72000]                                                                                                                                 │
│ Average cumulative reward:       -5.6656936433242535                                                                                                                     │
│ Average rollout reward:          -5.369289238080726                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:22[0m Remaining: [36m0:00:15[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42631, 42632, 65050, 72000]                                                                                                                                 │
│ Average cumulative reward:       -5.6656936433242535                                                                                                                     │
│ Average rollout reward:          -5.369289238080726                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:22[0m Remaining: [36m0:00:15[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42631, 42632, 65050, 72000]                                                                                                                                 │
│ Average cumulative reward:       -5.6656936433242535                                                                                                                     │
│ Average rollout reward:          -5.369289238080726                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:23[0m Remaining: [36m0:00:13[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48735, 48736, 49046, 73000]                                                                                                                                 │
│ Average cumulative reward:       -5.5566888756991295                                                                                                                     │
│ Average rollout reward:          -5.281410633721859                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:23[0m Remaining: [36m0:00:13[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48735, 48736, 49046, 73000]                                                                                                                                 │
│ Average cumulative reward:       -5.5566888756991295                                                                                                                     │
│ Average rollout reward:          -5.281410633721859                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:24[0m Remaining: [36m0:00:13[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 48735, 48736, 49046, 73000]                                                                                                                                 │
│ Average cumulative reward:       -5.5566888756991295                                                                                                                     │
│ Average rollout reward:          -5.281410633721859                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:24[0m Remaining: [36m0:00:11[0m   1.95 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17712, 17714, 70951, 71274, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.482797265632419                                                                                                                      │
│ Average rollout reward:          -5.195020316811659                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:25[0m Remaining: [36m0:00:11[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17712, 17714, 70951, 71274, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.482797265632419                                                                                                                      │
│ Average rollout reward:          -5.195020316811659                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:25[0m Remaining: [36m0:00:11[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17712, 17714, 70951, 71274, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.482797265632419                                                                                                                      │
│ Average rollout reward:          -5.195020316811659                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:11[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17712, 17714, 70951, 71274, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.482797265632419                                                                                                                      │
│ Average rollout reward:          -5.195020316811659                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:11[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 17712, 17714, 70951, 71274, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.482797265632419                                                                                                                      │
│ Average rollout reward:          -5.195020316811659                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:27[0m Remaining: [36m0:00:09[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21401, 21404, 21645, 21810, 75000]                                                                                                                          │
│ Average cumulative reward:       -6.037941002688058                                                                                                                      │
│ Average rollout reward:          -5.755972820191759                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:27[0m Remaining: [36m0:00:09[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21401, 21404, 21645, 21810, 75000]                                                                                                                          │
│ Average cumulative reward:       -6.037941002688058                                                                                                                      │
│ Average rollout reward:          -5.755972820191759                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:09[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21401, 21404, 21645, 21810, 75000]                                                                                                                          │
│ Average cumulative reward:       -6.037941002688058                                                                                                                      │
│ Average rollout reward:          -5.755972820191759                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:07[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 74940, 76000]                                                                                                                                 │
│ Average cumulative reward:       -5.554598621060342                                                                                                                      │
│ Average rollout reward:          -5.2712328542023315                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:29[0m Remaining: [36m0:00:07[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 74940, 76000]                                                                                                                                 │
│ Average cumulative reward:       -5.554598621060342                                                                                                                      │
│ Average rollout reward:          -5.2712328542023315                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:29[0m Remaining: [36m0:00:07[0m   1.97 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 74940, 76000]                                                                                                                                 │
│ Average cumulative reward:       -5.554598621060342                                                                                                                      │
│ Average rollout reward:          -5.2712328542023315                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:07[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 74940, 76000]                                                                                                                                 │
│ Average cumulative reward:       -5.554598621060342                                                                                                                      │
│ Average rollout reward:          -5.2712328542023315                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:07[0m   1.98 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 37009, 37011, 74940, 76000]                                                                                                                                 │
│ Average cumulative reward:       -5.554598621060342                                                                                                                      │
│ Average rollout reward:          -5.2712328542023315                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:31[0m Remaining: [36m0:00:05[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 59721, 59724, 59728, 59743, 77000]                                                                                                                          │
│ Average cumulative reward:       -5.5649373061771055                                                                                                                     │
│ Average rollout reward:          -5.267427635184984                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:31[0m Remaining: [36m0:00:05[0m   1.97 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:33[0m Remaining: [36m0:00:03[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21401, 21402, 76522, 78000]                                                                                                                                 │
│ Average cumulative reward:       -5.544536459795924                                                                                                                      │
│ Average rollout reward:          -5.264086035683368                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:33[0m Remaining: [36m0:00:03[0m   1.97 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K79/79 [38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m100.0%[0m Elapsed: [33m0:02:34[0m Remaining: [36m0:00:00[0m   1.96 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 21401, 21402, 76522, 78000]                                                                                                                                 │
│ Average cumulative reward:       -5.544536459795924                                                                                                                      │
│ Average rollout reward:          -5.264086035683368                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.8395031877752523                                                                                                                             │
│ Best path: [0, 2, 90, 93]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Node 0 is not terminal. Continue.
Node 2 is not terminal. Continue.
Node 90 is not terminal. Continue.
Node 93 is not terminal. Continue.
Node 1325 is not terminal. Continue.
Node 1409 is not terminal. Continue.
Node 1841 is not terminal. Continue.
Node 20997 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 1 is not terminal. Continue.
Node 2476 is not terminal. Continue.
Node 2603 is not terminal. Continue.
Node 2806 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 2 is not terminal. Continue.
Node 90 is not terminal. Continue.
Node 93 is not terminal. Continue.
Node 1325 is not terminal. Continue.
Node 1521 is not terminal. Continue.
Node 13528 is not terminal. Continue.
Node 19590 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -2.0753305176478976
sign_newton [19.751123]
sign_newton [6.6581917]
By Value: estimated reward: -2.035516767478583
By Best Value: estimated reward: 0
sign_newton [19.751123]
sign_newton [0.2049071 0.        0.        0.       ]
sign_ns [0.5, 0.5238398329104279]
sign_ns [0.5, 1.1615555604650039]
sign_ns [0.5, 1.0209883425110498]
sign_ns [0.5, 1.0003327864940872]
Best value of root node:
-0.8395031877752523
Best root policy:
sign_newton [19.751123]
sign_newton [0.2049071 0.        0.        0.       ]
sign_ns [0.5, 0.5238398329104279]
sign_ns [0.5, 1.1615555604650039]
sign_ns [0.5, 1.0209883425110498]
sign_ns [0.5, 1.0003327864940872]
=== END ===
Finished making algorithm
