Matrix distribution: unif
Matrix distribution config: {'c': 0.25, 'd': 1500, 'eps': 0.001}
Initial matrix shape: torch.Size([1500, 1500])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-06, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 80000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cuda', 'actions': [['sign_ns', [[0, 0], [5, 5]]], ['sign_newton', [[0], [40]]], ['sign_quintic', [[0, 0, 0], [5, 5, 5]]], ['sign_halley', [[0, 0, 0], [40, 40, 40]]]], 'initialize_with_baselines': True}
Actions: ['sign_halley', 'sign_newton', 'sign_ns', 'sign_quintic']
Action sign_halley took 1.0 times longer than sign_halley
Action sign_newton took 0.5085928023903628 times longer than sign_halley
Action sign_ns took 0.04991601257877109 times longer than sign_halley
Action sign_quintic took 0.07835101299364734 times longer than sign_halley
Skipping sign_newton_variant because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_db because not all actions are in the tree
Skipping sqrt_nsv because not all actions are in the tree
Skipping sqrt_visser because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_visser_coupled because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
Skipping proot_newton because not all actions are in the tree
Skipping proot_visser because not all actions are in the tree
Skipping proot_iannazzo because not all actions are in the tree
0/79 0.0% Elapsed: 0:00:00 Remaining: -:--:-- 501525.25 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-1.49916013 -1.49916013]                                                                                                                                                │
│ [-0.49916013 -0.49916013]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m 1004577.47 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 989, 992, 1000]                                                                                                                                             │
│ Average cumulative reward:       -5.9343805037103445                                                                                                                     │
│ Average rollout reward:          -5.7924367201945675                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.01 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:02[0m Remaining: [36m0:01:38[0m   1.26 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 1963, 2000]                                                                                                                                                 │
│ Average cumulative reward:       -6.179552512938531                                                                                                                      │
│ Average rollout reward:          -6.059612525067023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:03[0m Remaining: [36m0:01:38[0m   1.51 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:01:37[0m   1.34 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 912, 916, 940, 3000]                                                                                                                                        │
│ Average cumulative reward:       -6.271218587636405                                                                                                                      │
│ Average rollout reward:          -6.167333967156739                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:01:37[0m   1.51 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:01:38[0m   1.38 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 4000]                                                                                                                                                       │
│ Average cumulative reward:       -5.904975077364076                                                                                                                      │
│ Average rollout reward:          -5.804605690787181                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:01:38[0m   1.51 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:01:36[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 4939, 5000]                                                                                                                                                 │
│ Average cumulative reward:       -6.3211672547662205                                                                                                                     │
│ Average rollout reward:          -6.225344152751473                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:01:36[0m   1.41 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:01:34[0m   1.34 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 6000]                                                                                                                                                       │
│ Average cumulative reward:       -6.024758134296372                                                                                                                      │
│ Average rollout reward:          -5.92682744732828                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:01:34[0m   1.42 s/iter
7/79 8.9% Elapsed: 0:00:09 Remaining: 0:01:33 1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 6939, 6998, 7000]                                                                                                                                           │
│ Average cumulative reward:       -5.835297174920502                                                                                                                      │
│ Average rollout reward:          -5.743670945379839                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:01:33[0m   1.36 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 6939, 6998, 7000]                                                                                                                                           │
│ Average cumulative reward:       -5.835297174920502                                                                                                                      │
│ Average rollout reward:          -5.743670945379839                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:10[0m Remaining: [36m0:01:33[0m   1.44 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 6939, 6998, 7000]                                                                                                                                           │
│ Average cumulative reward:       -5.835297174920502                                                                                                                      │
│ Average rollout reward:          -5.743670945379839                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:10[0m Remaining: [36m0:01:32[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 7962, 7996, 8000]                                                                                                                                           │
│ Average cumulative reward:       -5.993643117281321                                                                                                                      │
│ Average rollout reward:          -5.909828258222048                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:01:32[0m   1.38 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 7962, 7996, 8000]                                                                                                                                           │
│ Average cumulative reward:       -5.993643117281321                                                                                                                      │
│ Average rollout reward:          -5.909828258222048                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:01:30[0m   1.28 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 2877, 7959, 8239, 9000]                                                                                                                                     │
│ Average cumulative reward:       -5.942015001798723                                                                                                                      │
│ Average rollout reward:          -5.852301693459688                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:12[0m Remaining: [36m0:01:30[0m   1.34 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 2877, 7959, 8239, 9000]                                                                                                                                     │
│ Average cumulative reward:       -5.942015001798723                                                                                                                      │
│ Average rollout reward:          -5.852301693459688                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K9/79 [38;2;249;38;114m━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m11.4%[0m Elapsed: [33m0:00:12[0m Remaining: [36m0:01:30[0m   1.40 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 2877, 7959, 8239, 9000]                                                                                                                                     │
│ Average cumulative reward:       -5.942015001798723                                                                                                                      │
│ Average rollout reward:          -5.852301693459688                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K10/79 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m12.7%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:01:29[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9961, 9977, 10000]                                                                                                                                          │
│ Average cumulative reward:       -5.687864680426605                                                                                                                      │
│ Average rollout reward:          -5.586027791250583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K10/79 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m12.7%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:01:29[0m   1.36 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9961, 9977, 10000]                                                                                                                                          │
│ Average cumulative reward:       -5.687864680426605                                                                                                                      │
│ Average rollout reward:          -5.586027791250583                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:14[0m Remaining: [36m0:01:28[0m   1.28 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10539, 10569, 10721, 11000]                                                                                                                                 │
│ Average cumulative reward:       -5.989943736801565                                                                                                                      │
│ Average rollout reward:          -5.903882657176247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:14[0m Remaining: [36m0:01:28[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10539, 10569, 10721, 11000]                                                                                                                                 │
│ Average cumulative reward:       -5.989943736801565                                                                                                                      │
│ Average rollout reward:          -5.903882657176247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:01:28[0m   1.37 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 10539, 10569, 10721, 11000]                                                                                                                                 │
│ Average cumulative reward:       -5.989943736801565                                                                                                                      │
│ Average rollout reward:          -5.903882657176247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:01:26[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 11894, 11952, 11956, 12000]                                                                                                                                 │
│ Average cumulative reward:       -6.105671873562742                                                                                                                      │
│ Average rollout reward:          -6.004802253978928                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:16[0m Remaining: [36m0:01:26[0m   1.34 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 11894, 11952, 11956, 12000]                                                                                                                                 │
│ Average cumulative reward:       -6.105671873562742                                                                                                                      │
│ Average rollout reward:          -6.004802253978928                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:16[0m Remaining: [36m0:01:25[0m   1.28 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1070, 1071, 1082, 3146, 13000]                                                                                                                              │
│ Average cumulative reward:       -5.787468417534641                                                                                                                      │
│ Average rollout reward:          -5.673575049138965                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:01:25[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1070, 1071, 1082, 3146, 13000]                                                                                                                              │
│ Average cumulative reward:       -5.787468417534641                                                                                                                      │
│ Average rollout reward:          -5.673575049138965                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:01:25[0m   1.35 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1070, 1071, 1082, 3146, 13000]                                                                                                                              │
│ Average cumulative reward:       -5.787468417534641                                                                                                                      │
│ Average rollout reward:          -5.673575049138965                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:01:24[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9658, 12975, 13120, 14000]                                                                                                                                  │
│ Average cumulative reward:       -5.804977140541251                                                                                                                      │
│ Average rollout reward:          -5.70144310693756                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
15/79 ━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.0% Elapsed: 0:00:19 Remaining: 0:01:22   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 14804, 14992, 14995, 15000]                                                                                                                                 │
│ Average cumulative reward:       -5.724855026871131                                                                                                                      │
│ Average rollout reward:          -5.62796375035787                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
16/79 ━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 20.3% Elapsed: 0:00:20 Remaining: 0:01:21   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 15996, 15998, 16000]                                                                                                                                        │
│ Average cumulative reward:       -6.276170946838902                                                                                                                      │
│ Average rollout reward:          -6.148901664913807                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
17/79 ━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.5% Elapsed: 0:00:22 Remaining: 0:01:20   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 16823, 16914, 16917, 17000]                                                                                                                                 │
│ Average cumulative reward:       -5.833576731701424                                                                                                                      │
│ Average rollout reward:          -5.739768062661892                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
18/79 ━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.8% Elapsed: 0:00:23 Remaining: 0:01:19   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 15593, 16314, 17063, 18000]                                                                                                                                 │
│ Average cumulative reward:       -6.279130141311941                                                                                                                      │
│ Average rollout reward:          -6.175936087119709                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
19/79 ━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.1% Elapsed: 0:00:24 Remaining: 0:01:17   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 8, 565, 695, 1827, 19000]                                                                                                                                   │
│ Average cumulative reward:       -5.663927278261734                                                                                                                      │
│ Average rollout reward:          -5.551723869286207                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
20/79 ━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 25.3% Elapsed: 0:00:25 Remaining: 0:01:16   1.28 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 19917, 19918, 19989, 20000]                                                                                                                                 │
│ Average cumulative reward:       -6.272088102715775                                                                                                                      │
│ Average rollout reward:          -6.158173116337368                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:15[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20861, 20993, 20994, 21000]                                                                                                                                 │
│ Average cumulative reward:       -6.318734750992094                                                                                                                      │
│ Average rollout reward:          -6.182928656128134                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:15[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20861, 20993, 20994, 21000]                                                                                                                                 │
│ Average cumulative reward:       -6.318734750992094                                                                                                                      │
│ Average rollout reward:          -6.182928656128134                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:14[0m   1.28 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 21830, 21921, 21924, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.701431297763182                                                                                                                      │
│ Average rollout reward:          -5.621972491892342                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:14[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 21830, 21921, 21924, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.701431297763182                                                                                                                      │
│ Average rollout reward:          -5.621972491892342                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:14[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 21830, 21921, 21924, 22000]                                                                                                                                 │
│ Average cumulative reward:       -5.701431297763182                                                                                                                      │
│ Average rollout reward:          -5.621972491892342                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:12[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22828, 22986, 22989, 22992, 23000]                                                                                                                          │
│ Average cumulative reward:       -5.602211400238931                                                                                                                      │
│ Average rollout reward:          -5.506805418796879                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:12[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22828, 22986, 22989, 22992, 23000]                                                                                                                          │
│ Average cumulative reward:       -5.602211400238931                                                                                                                      │
│ Average rollout reward:          -5.506805418796879                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:12[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22828, 22986, 22989, 22992, 23000]                                                                                                                          │
│ Average cumulative reward:       -5.602211400238931                                                                                                                      │
│ Average rollout reward:          -5.506805418796879                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:11[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 23852, 23862, 23877, 23915, 24000]                                                                                                                          │
│ Average cumulative reward:       -6.185248481063707                                                                                                                      │
│ Average rollout reward:          -6.081215287334224                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:11[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 23852, 23862, 23877, 23915, 24000]                                                                                                                          │
│ Average cumulative reward:       -6.185248481063707                                                                                                                      │
│ Average rollout reward:          -6.081215287334224                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:10[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24904, 24994, 24996, 25000]                                                                                                                                 │
│ Average cumulative reward:       -6.114860472177628                                                                                                                      │
│ Average rollout reward:          -5.999139076669764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:10[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24904, 24994, 24996, 25000]                                                                                                                                 │
│ Average cumulative reward:       -6.114860472177628                                                                                                                      │
│ Average rollout reward:          -5.999139076669764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:10[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24904, 24994, 24996, 25000]                                                                                                                                 │
│ Average cumulative reward:       -6.114860472177628                                                                                                                      │
│ Average rollout reward:          -5.999139076669764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:09[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25999, 26000]                                                                                                                                        │
│ Average cumulative reward:       -6.266832629697318                                                                                                                      │
│ Average rollout reward:          -6.164075634312137                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:09[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25999, 26000]                                                                                                                                        │
│ Average cumulative reward:       -6.266832629697318                                                                                                                      │
│ Average rollout reward:          -6.164075634312137                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:09[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25999, 26000]                                                                                                                                        │
│ Average cumulative reward:       -6.266832629697318                                                                                                                      │
│ Average rollout reward:          -6.164075634312137                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:08[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 173, 27000]                                                                                                                                                 │
│ Average cumulative reward:       -6.130265335597271                                                                                                                      │
│ Average rollout reward:          -6.003623343852924                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:08[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 173, 27000]                                                                                                                                                 │
│ Average cumulative reward:       -6.130265335597271                                                                                                                      │
│ Average rollout reward:          -6.003623343852924                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:06[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6000, 21134, 21726, 22243, 28000]                                                                                                                           │
│ Average cumulative reward:       -5.671574038273038                                                                                                                      │
│ Average rollout reward:          -5.579918437821454                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:06[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6000, 21134, 21726, 22243, 28000]                                                                                                                           │
│ Average cumulative reward:       -5.671574038273038                                                                                                                      │
│ Average rollout reward:          -5.579918437821454                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:06[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6000, 21134, 21726, 22243, 28000]                                                                                                                           │
│ Average cumulative reward:       -5.671574038273038                                                                                                                      │
│ Average rollout reward:          -5.579918437821454                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m36.7%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:05[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 28805, 28989, 28992, 28997, 29000]                                                                                                                          │
│ Average cumulative reward:       -6.0346924319361905                                                                                                                     │
│ Average rollout reward:          -5.935320860650363                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m36.7%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:05[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 28805, 28989, 28992, 28997, 29000]                                                                                                                          │
│ Average cumulative reward:       -6.0346924319361905                                                                                                                     │
│ Average rollout reward:          -5.935320860650363                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m38.0%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:04[0m   1.29 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 29984, 30000]                                                                                                                                               │
│ Average cumulative reward:       -5.857840657387669                                                                                                                      │
│ Average rollout reward:          -5.729363958484077                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
31/79  39.2%  Elapsed: 0:00:40  Remaining: 0:01:03  1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 19001, 19133, 19134, 19202, 31000]                                                                                                                          │
│ Average cumulative reward:       -6.085519836248641                                                                                                                      │
│ Average rollout reward:          -5.980887650368219                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:03[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 19001, 19133, 19134, 19202, 31000]                                                                                                                          │
│ Average cumulative reward:       -6.085519836248641                                                                                                                      │
│ Average rollout reward:          -5.980887650368219                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:03[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 19001, 19133, 19134, 19202, 31000]                                                                                                                          │
│ Average cumulative reward:       -6.085519836248641                                                                                                                      │
│ Average rollout reward:          -5.980887650368219                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:02[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1528, 30314, 30558, 32000]                                                                                                                                  │
│ Average cumulative reward:       -6.108275084975036                                                                                                                      │
│ Average rollout reward:          -5.974197753118729                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:02[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1528, 30314, 30558, 32000]                                                                                                                                  │
│ Average cumulative reward:       -6.108275084975036                                                                                                                      │
│ Average rollout reward:          -5.974197753118729                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:00[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25985, 25990, 26013, 33000]                                                                                                                          │
│ Average cumulative reward:       -6.119375488717747                                                                                                                      │
│ Average rollout reward:          -5.991457298629118                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:00[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25985, 25990, 26013, 33000]                                                                                                                          │
│ Average cumulative reward:       -6.119375488717747                                                                                                                      │
│ Average rollout reward:          -5.991457298629118                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:00[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 25983, 25985, 25990, 26013, 33000]                                                                                                                          │
│ Average cumulative reward:       -6.119375488717747                                                                                                                      │
│ Average rollout reward:          -5.991457298629118                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:00:59[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 33693, 33749, 33750, 34000]                                                                                                                                 │
│ Average cumulative reward:       -5.9742713588723975                                                                                                                     │
│ Average rollout reward:          -5.883923007424429                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K35/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m44.3%[0m Elapsed: [33m0:00:45[0m Remaining: [36m0:00:58[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 34989, 34999, 35000]                                                                                                                                        │
│ Average cumulative reward:       -5.790440615563437                                                                                                                      │
│ Average rollout reward:          -5.685677546998477                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K36/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m45.6%[0m Elapsed: [33m0:00:46[0m Remaining: [36m0:00:57[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 3825, 21129, 21721, 36000]                                                                                                                                  │
│ Average cumulative reward:       -6.223465352287935                                                                                                                      │
│ Average rollout reward:          -6.091714134788486                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K37/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━[0m [35m46.8%[0m Elapsed: [33m0:00:48[0m Remaining: [36m0:00:56[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 16407, 16440, 16441, 26595, 37000]                                                                                                                          │
│ Average cumulative reward:       -6.3146938861400015                                                                                                                     │
│ Average rollout reward:          -6.183852269292011                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:00:49[0m Remaining: [36m0:00:54[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 28805, 38000]                                                                                                                                               │
│ Average cumulative reward:       -5.937250026887042                                                                                                                      │
│ Average rollout reward:          -5.814088290968621                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:00:50[0m Remaining: [36m0:00:53[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10902, 37980, 38200, 39000]                                                                                                                                 │
│ Average cumulative reward:       -6.096617661044555                                                                                                                      │
│ Average rollout reward:          -6.009137315084253                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:00:52[0m Remaining: [36m0:00:52[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 39761, 39892, 39895, 40000]                                                                                                                                 │
│ Average cumulative reward:       -6.119069609159239                                                                                                                      │
│ Average rollout reward:          -6.025776212969149                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:00:53[0m Remaining: [36m0:00:51[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 142, 38917, 38919, 40385, 41000]                                                                                                                            │
│ Average cumulative reward:       -6.099087675702986                                                                                                                      │
│ Average rollout reward:          -5.986177689623189                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:00:53[0m Remaining: [36m0:00:51[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 142, 38917, 38919, 40385, 41000]                                                                                                                            │
│ Average cumulative reward:       -6.099087675702986                                                                                                                      │
│ Average rollout reward:          -5.986177689623189                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:00:54[0m Remaining: [36m0:00:51[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 142, 38917, 38919, 40385, 41000]                                                                                                                            │
│ Average cumulative reward:       -6.099087675702986                                                                                                                      │
│ Average rollout reward:          -5.986177689623189                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:00:54[0m Remaining: [36m0:00:49[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 41921, 41993, 41997, 42000]                                                                                                                                 │
│ Average cumulative reward:       -5.886641842830964                                                                                                                      │
│ Average rollout reward:          -5.7521901603536545                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:00:55[0m Remaining: [36m0:00:49[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 41921, 41993, 41997, 42000]                                                                                                                                 │
│ Average cumulative reward:       -5.886641842830964                                                                                                                      │
│ Average rollout reward:          -5.7521901603536545                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:00:55[0m Remaining: [36m0:00:49[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 41921, 41993, 41997, 42000]                                                                                                                                 │
│ Average cumulative reward:       -5.886641842830964                                                                                                                      │
│ Average rollout reward:          -5.7521901603536545                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:00:56[0m Remaining: [36m0:00:48[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 42656, 42680, 42681, 43000]                                                                                                                                 │
│ Average cumulative reward:       -5.861020759299463                                                                                                                      │
│ Average rollout reward:          -5.752546113331621                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:00:56[0m Remaining: [36m0:00:48[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 42656, 42680, 42681, 43000]                                                                                                                                 │
│ Average cumulative reward:       -5.861020759299463                                                                                                                      │
│ Average rollout reward:          -5.752546113331621                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:00:57[0m Remaining: [36m0:00:47[0m   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 518, 37963, 38286, 44000]                                                                                                                                   │
│ Average cumulative reward:       -6.132388924881794                                                                                                                      │
│ Average rollout reward:          -5.995225665001588                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:00:57[0m Remaining: [36m0:00:47[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 518, 37963, 38286, 44000]                                                                                                                                   │
│ Average cumulative reward:       -6.132388924881794                                                                                                                      │
│ Average rollout reward:          -5.995225665001588                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:00:58[0m Remaining: [36m0:00:47[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 518, 37963, 38286, 44000]                                                                                                                                   │
│ Average cumulative reward:       -6.132388924881794                                                                                                                      │
│ Average rollout reward:          -5.995225665001588                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:00:58[0m Remaining: [36m0:00:45[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 44908, 44998, 45000]                                                                                                                                        │
│ Average cumulative reward:       -5.718099922623189                                                                                                                      │
│ Average rollout reward:          -5.599835563103371                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:00[0m Remaining: [36m0:00:44[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 45676, 45682, 45683, 46000]                                                                                                                                 │
│ Average cumulative reward:       -5.9204958882194845                                                                                                                     │
│ Average rollout reward:          -5.806668548560055                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:00[0m Remaining: [36m0:00:44[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 45676, 45682, 45683, 46000]                                                                                                                                 │
│ Average cumulative reward:       -5.9204958882194845                                                                                                                     │
│ Average rollout reward:          -5.806668548560055                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:00:43[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6, 46852, 47000]                                                                                                                                            │
│ Average cumulative reward:       -6.113020704544247                                                                                                                      │
│ Average rollout reward:          -6.008241523255653                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:00:43[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6, 46852, 47000]                                                                                                                                            │
│ Average cumulative reward:       -6.113020704544247                                                                                                                      │
│ Average rollout reward:          -6.008241523255653                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
48/79 ━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━ 60.8% Elapsed: 0:01:02 Remaining: 0:00:41   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 912, 916, 922, 26712, 48000]                                                                                                                                │
│ Average cumulative reward:       -5.50374767359268                                                                                                                       │
│ Average rollout reward:          -5.410563542104343                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
49/79 ━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━ 62.0% Elapsed: 0:01:03 Remaining: 0:00:40   1.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 48822, 48865, 48866, 48882, 49000]                                                                                                                          │
│ Average cumulative reward:       -5.6556787583333055                                                                                                                     │
│ Average rollout reward:          -5.560210675572484                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
50/79 ━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━ 63.3% Elapsed: 0:01:05 Remaining: 0:00:39   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 49629, 49910, 49912, 49923, 50000]                                                                                                                          │
│ Average cumulative reward:       -5.703089827225832                                                                                                                      │
│ Average rollout reward:          -5.599992852429178                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
51/79 ━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━ 64.6% Elapsed: 0:01:06 Remaining: 0:00:38   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 837, 46864, 47900, 48616, 51000]                                                                                                                            │
│ Average cumulative reward:       -6.0231884778586595                                                                                                                     │
│ Average rollout reward:          -5.904843151666904                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
52/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━ 65.8% Elapsed: 0:01:07 Remaining: 0:00:36   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 22828, 22885, 22886, 26598, 52000]                                                                                                                          │
│ Average cumulative reward:       -5.667764288393291                                                                                                                      │
│ Average rollout reward:          -5.5499774710164775                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
53/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 67.1% Elapsed: 0:01:09 Remaining: 0:00:35   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 52935, 52937, 52944, 53000]                                                                                                                                 │
│ Average cumulative reward:       -5.970981444262313                                                                                                                      │
│ Average rollout reward:          -5.86554116950402                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
54/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━ 68.4% Elapsed: 0:01:10 Remaining: 0:00:34   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53781, 53995, 53996, 54000]                                                                                                                                 │
│ Average cumulative reward:       -6.1536414205828915                                                                                                                     │
│ Average rollout reward:          -6.04450068084671                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:00:34[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53781, 53995, 53996, 54000]                                                                                                                                 │
│ Average cumulative reward:       -6.1536414205828915                                                                                                                     │
│ Average rollout reward:          -6.04450068084671                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:00:32[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 54637, 54997, 55000]                                                                                                                                        │
│ Average cumulative reward:       -5.637831211216352                                                                                                                      │
│ Average rollout reward:          -5.524581245725111                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:00:32[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 54637, 54997, 55000]                                                                                                                                        │
│ Average cumulative reward:       -5.637831211216352                                                                                                                      │
│ Average rollout reward:          -5.524581245725111                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:00:32[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 54637, 54997, 55000]                                                                                                                                        │
│ Average cumulative reward:       -5.637831211216352                                                                                                                      │
│ Average rollout reward:          -5.524581245725111                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:00:31[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1155, 1156, 1171, 6666, 56000]                                                                                                                              │
│ Average cumulative reward:       -5.720519223891528                                                                                                                      │
│ Average rollout reward:          -5.611200438087846                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:00:31[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1155, 1156, 1171, 6666, 56000]                                                                                                                              │
│ Average cumulative reward:       -5.720519223891528                                                                                                                      │
│ Average rollout reward:          -5.611200438087846                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:00:31[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 1155, 1156, 1171, 6666, 56000]                                                                                                                              │
│ Average cumulative reward:       -5.720519223891528                                                                                                                      │
│ Average rollout reward:          -5.611200438087846                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:00:30[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 17245, 17432, 17434, 57000]                                                                                                                                 │
│ Average cumulative reward:       -6.1055062477959225                                                                                                                     │
│ Average rollout reward:          -5.986203084688566                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:15[0m Remaining: [36m0:00:30[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 17245, 17432, 17434, 57000]                                                                                                                                 │
│ Average cumulative reward:       -6.1055062477959225                                                                                                                     │
│ Average rollout reward:          -5.986203084688566                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:00:28[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53781, 53891, 53894, 54051, 58000]                                                                                                                          │
│ Average cumulative reward:       -5.82676151322682                                                                                                                       │
│ Average rollout reward:          -5.712035685340981                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:00:28[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53781, 53891, 53894, 54051, 58000]                                                                                                                          │
│ Average cumulative reward:       -5.82676151322682                                                                                                                       │
│ Average rollout reward:          -5.712035685340981                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:17[0m Remaining: [36m0:00:28[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53781, 53891, 53894, 54051, 58000]                                                                                                                          │
│ Average cumulative reward:       -5.82676151322682                                                                                                                       │
│ Average rollout reward:          -5.712035685340981                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:17[0m Remaining: [36m0:00:27[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 114, 58231, 58480, 59000]                                                                                                                                   │
│ Average cumulative reward:       -6.216455730144954                                                                                                                      │
│ Average rollout reward:          -6.105908982261131                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:00:27[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 114, 58231, 58480, 59000]                                                                                                                                   │
│ Average cumulative reward:       -6.216455730144954                                                                                                                      │
│ Average rollout reward:          -6.105908982261131                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:00:27[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 114, 58231, 58480, 59000]                                                                                                                                   │
│ Average cumulative reward:       -6.216455730144954                                                                                                                      │
│ Average rollout reward:          -6.105908982261131                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:01:19[0m Remaining: [36m0:00:26[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 59941, 59997, 60000]                                                                                                                                        │
│ Average cumulative reward:       -5.851474087419201                                                                                                                      │
│ Average rollout reward:          -5.750601715205842                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:20[0m Remaining: [36m0:00:24[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 60854, 60944, 60946, 61000]                                                                                                                                 │
│ Average cumulative reward:       -5.584089149722102                                                                                                                      │
│ Average rollout reward:          -5.4974340782444555                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K62/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:00:23[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 61776, 61932, 61935, 61946, 62000]                                                                                                                          │
│ Average cumulative reward:       -5.908179602765671                                                                                                                      │
│ Average rollout reward:          -5.800963969537326                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:00:22[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62707, 62987, 62990, 63000]                                                                                                                                 │
│ Average cumulative reward:       -5.684408043039784                                                                                                                      │
│ Average rollout reward:          -5.574100369661106                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:00:20[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 63644, 63701, 63702, 63706, 64000]                                                                                                                          │
│ Average cumulative reward:       -6.193580939973309                                                                                                                      │
│ Average rollout reward:          -6.080221709780293                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:00:19[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 64592, 64602, 64604, 64983, 65000]                                                                                                                          │
│ Average cumulative reward:       -6.139770618957657                                                                                                                      │
│ Average rollout reward:          -6.021993234257479                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:01:26[0m Remaining: [36m0:00:18[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 65548, 65999, 66000]                                                                                                                                        │
│ Average cumulative reward:       -6.106186116761743                                                                                                                      │
│ Average rollout reward:          -5.986032701692096                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:00:16[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 66511, 66513, 66519, 67000]                                                                                                                                 │
│ Average cumulative reward:       -5.864321496060885                                                                                                                      │
│ Average rollout reward:          -5.74715253618826                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:00:15[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 61776, 62228, 62231, 65374, 68000]                                                                                                                          │
│ Average cumulative reward:       -6.486848798375455                                                                                                                      │
│ Average rollout reward:          -6.356865186417899                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
69/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━ 87.3% Elapsed: 0:01:30 Remaining: 0:00:14   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 44908, 44941, 44945, 45016, 69000]                                                                                                                          │
│ Average cumulative reward:       -5.794853064479558                                                                                                                      │
│ Average rollout reward:          -5.686651296305803                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
70/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━ 88.6% Elapsed: 0:01:32 Remaining: 0:00:12   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 636, 19732, 20360, 70000]                                                                                                                                   │
│ Average cumulative reward:       -5.760903766029773                                                                                                                      │
│ Average rollout reward:          -5.647112477328344                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
71/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━ 89.9% Elapsed: 0:01:33 Remaining: 0:00:11   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 50442, 50552, 50553, 71000]                                                                                                                                 │
│ Average cumulative reward:       -5.815618514329928                                                                                                                      │
│ Average rollout reward:          -5.704353004035637                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
72/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━ 91.1% Elapsed: 0:01:34 Remaining: 0:00:10   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 39056, 68174, 68432, 70296, 72000]                                                                                                                          │
│ Average cumulative reward:       -5.943350028868905                                                                                                                      │
│ Average rollout reward:          -5.815567721763439                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
73/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━ 92.4% Elapsed: 0:01:36 Remaining: 0:00:08   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 72475, 72491, 72492, 72963, 73000]                                                                                                                          │
│ Average cumulative reward:       -5.992814853010929                                                                                                                      │
│ Average rollout reward:          -5.8828094283373025                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
74/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━ 93.7% Elapsed: 0:01:37 Remaining: 0:00:07   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 73500, 73656, 73658, 73664, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.782698098526196                                                                                                                      │
│ Average rollout reward:          -5.671930068609295                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
74/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━ 93.7% Elapsed: 0:01:38 Remaining: 0:00:07   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 73500, 73656, 73658, 73664, 74000]                                                                                                                          │
│ Average cumulative reward:       -5.782698098526196                                                                                                                      │
│ Average rollout reward:          -5.671930068609295                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:06[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 74533, 74985, 74988, 75000]                                                                                                                                 │
│ Average cumulative reward:       -6.250179848756222                                                                                                                      │
│ Average rollout reward:          -6.147560416102602                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:06[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 74533, 74985, 74988, 75000]                                                                                                                                 │
│ Average cumulative reward:       -6.250179848756222                                                                                                                      │
│ Average rollout reward:          -6.147560416102602                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:06[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 74533, 74985, 74988, 75000]                                                                                                                                 │
│ Average cumulative reward:       -6.250179848756222                                                                                                                      │
│ Average rollout reward:          -6.147560416102602                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:04[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13, 252, 256, 3700, 76000]                                                                                                                                  │
│ Average cumulative reward:       -6.194531336478878                                                                                                                      │
│ Average rollout reward:          -6.071786721133162                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:04[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13, 252, 256, 3700, 76000]                                                                                                                                  │
│ Average cumulative reward:       -6.194531336478878                                                                                                                      │
│ Average rollout reward:          -6.071786721133162                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:03[0m   1.31 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 76626, 76715, 76717, 76941, 77000]                                                                                                                          │
│ Average cumulative reward:       -6.312402083730576                                                                                                                      │
│ Average rollout reward:          -6.186195471044842                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:01:41[0m Remaining: [36m0:00:03[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 76626, 76715, 76717, 76941, 77000]                                                                                                                          │
│ Average cumulative reward:       -6.312402083730576                                                                                                                      │
│ Average rollout reward:          -6.186195471044842                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:01:42[0m Remaining: [36m0:00:03[0m   1.33 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 76626, 76715, 76717, 76941, 77000]                                                                                                                          │
│ Average cumulative reward:       -6.312402083730576                                                                                                                      │
│ Average rollout reward:          -6.186195471044842                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:01:42[0m Remaining: [36m0:00:02[0m   1.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 77685, 77695, 77698, 78000]                                                                                                                                 │
│ Average cumulative reward:       -5.741726037172177                                                                                                                      │
│ Average rollout reward:          -5.621904848585813                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -0.4991601257877109                                                                                                                             │
│ Best path: [0, 3]                                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
79/79 100.0% Elapsed: 0:01:43 Remaining: 0:00:00 1.31 s/iter
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 4 is not terminal. Continue.
Node 5327 is not terminal. Continue.
Node 5466 is not terminal. Continue.
Node 5470 is not terminal. Continue.
Node 5514 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 69455 is not terminal. Continue.
Node 73068 is not terminal. Continue.
Node 74028 is not terminal. Continue.
Node 75712 is not terminal. Continue.
Node 77240 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 4 is not terminal. Continue.
Node 5327 is not terminal. Continue.
Node 5466 is not terminal. Continue.
Node 28651 is not terminal. Continue.
Node 35908 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -0.8984882264178796
sign_ns [3.7561243 1.2991889]
By Value: estimated reward: -3.179823839914595
sign_ns [4.1711297  0.18950325]
By Best Value: estimated reward: 0
sign_ns [0.5       1.7310436 0.        0.       ]
sign_ns [0.5, 1.7292289847495934]
sign_ns [0.5, 1.7246944424766881]
sign_ns [0.5, 1.712962507317751]
sign_ns [0.5, 1.6827188917965556]
sign_ns [0.5, 1.6069075884726867]
sign_ns [0.5, 1.439191546037524]
sign_ns [0.5, 1.1909606688547993]
sign_ns [0.5, 1.0298080766463207]
sign_ns [0.5, 1.000673375465261]
Best value of root node:
-0.4991601257877109
Best root policy:
sign_ns [0.5       1.7310436 0.        0.       ]
sign_ns [0.5, 1.7292289847495934]
sign_ns [0.5, 1.7246944424766881]
sign_ns [0.5, 1.712962507317751]
sign_ns [0.5, 1.6827188917965556]
sign_ns [0.5, 1.6069075884726867]
sign_ns [0.5, 1.439191546037524]
sign_ns [0.5, 1.1909606688547993]
sign_ns [0.5, 1.0298080766463207]
sign_ns [0.5, 1.000673375465261]
=== END ===
Finished making algorithm
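Note on the reported root value: the best root policy above lists 10 sign_ns steps, and the top of this log reports that sign_ns takes 0.04991601257877109 times as long as sign_halley. Ten such steps cost about 0.4991601257877109, which matches the magnitude of "Best value of root node". The sketch below checks that arithmetic. It assumes that the reward is the negative sum of relative step costs, which is an inference from the numbers and not something the log states. The names RELATIVE_COST, best_root_policy, and policy_cost are hypothetical.

```python
import math

# Relative cost of one sign_ns step, copied from the top of this log
# ("Action sign_ns took ... times longer than sign_halley").
RELATIVE_COST = {"sign_ns": 0.04991601257877109}

# Action names of the best root policy reported above (10 sign_ns steps).
best_root_policy = ["sign_ns"] * 10


def policy_cost(actions, relative_cost):
    """Sum the relative cost of each action (assumed reward model)."""
    return sum(relative_cost[a] for a in actions)


# Value printed under "Best value of root node:".
reported_best_value = -0.4991601257877109

# Assumption: reward = -(total relative cost of the policy).
estimated = -policy_cost(best_root_policy, RELATIVE_COST)
matches = math.isclose(estimated, reported_best_value, rel_tol=1e-12)
print(estimated, matches)
```

If the assumption holds, the root value depends only on how many steps of each action the policy uses, not on the per-step parameters printed next to each sign_ns entry.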
