Matrix distribution: Erdos_Renyi
Matrix distribution config: {'c': 0.4, 'd': 5000, 'eps': 0.001}
Initial matrix shape: torch.Size([5000, 5000])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-06, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 80000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cuda', 'actions': [['proot_newton', [[0, 0], [10, 10]]], ['proot_visser', [[0, 0], [10, 10]]], ['proot_couple', None], ['proot_iannazzo', [[0, 0], [10, 10]]]], 'initialize_with_baselines': True}
Actions: ['proot_couple', 'proot_iannazzo', 'proot_newton', 'proot_visser']
Action proot_couple took 1.0 times longer than proot_couple
Action proot_iannazzo took 1.0431302983565485 times longer than proot_couple
Action proot_newton took 0.8628460880314732 times longer than proot_couple
Action proot_visser took 0.23682154556258883 times longer than proot_couple
Skipping sign_newton because not all actions are in the tree
Skipping sign_scaled_newton because not all actions are in the tree
Skipping sign_ns because not all actions are in the tree
Skipping sign_scaled_ns because not all actions are in the tree
Skipping sign_newton_variant because not all actions are in the tree
Skipping sign_halley because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_db because not all actions are in the tree
Skipping sqrt_nsv because not all actions are in the tree
Skipping sqrt_visser because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_visser_coupled because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
0/79 0.0% Elapsed: 0:00:00 Remaining: -:--:-- 501994.61 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-8.34504239 -8.34504239]                                                                                                                                                │
│ [-6.25878179 -6.25878179 -6.25878179]                                                                                                                                    │
│ [-5.21565149 -5.21565149 -5.21565149 -5.21565149]                                                                                                                        │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m 1005263.34 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-8.34504239 -8.34504239]                                                                                                                                                │
│ [-6.25878179 -6.25878179 -6.25878179]                                                                                                                                    │
│ [-5.21565149 -5.21565149 -5.21565149 -5.21565149]                                                                                                                        │
│ [-4.17252119 -4.17252119 -4.17252119 -4.17252119]                                                                                                                        │
│ [-3.1293909 -3.1293909 -3.1293909 -3.1293909]                                                                                                                            │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K0/79 [38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m0.0%[0m Elapsed: [33m0:00:01[0m Remaining: [36m-:--:--[0m 1509211.19 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-8.34504239 -8.34504239]                                                                                                                                                │
│ [-6.25878179 -6.25878179 -6.25878179]                                                                                                                                    │
│ [-5.21565149 -5.21565149 -5.21565149 -5.21565149]                                                                                                                        │
│ [-4.17252119 -4.17252119 -4.17252119 -4.17252119]                                                                                                                        │
│ [-3.1293909 -3.1293909 -3.1293909 -3.1293909]                                                                                                                            │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 993, 994, 997, 1000]                                                                                                                                   │
│ Average cumulative reward:       -31.01757000134837                                                                                                                      │
│ Average rollout reward:          -29.749123558546945                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:02[0m Remaining: [36m-:--:--[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 993, 994, 997, 1000]                                                                                                                                   │
│ Average cumulative reward:       -31.01757000134837                                                                                                                      │
│ Average rollout reward:          -29.749123558546945                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:03[0m Remaining: [36m-:--:--[0m   3.02 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 993, 994, 997, 1000]                                                                                                                                   │
│ Average cumulative reward:       -31.01757000134837                                                                                                                      │
│ Average rollout reward:          -29.749123558546945                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K1/79 [38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m1.3%[0m Elapsed: [33m0:00:03[0m Remaining: [36m-:--:--[0m   3.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 993, 994, 997, 1000]                                                                                                                                   │
│ Average cumulative reward:       -31.01757000134837                                                                                                                      │
│ Average rollout reward:          -29.749123558546945                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:37[0m   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1125, 1126, 1992, 1996, 2000]                                                                                                                          │
│ Average cumulative reward:       -32.51747333464844                                                                                                                      │
│ Average rollout reward:          -30.805792359161497                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:04[0m Remaining: [36m0:02:37[0m   2.27 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1125, 1126, 1992, 1996, 2000]                                                                                                                          │
│ Average cumulative reward:       -32.51747333464844                                                                                                                      │
│ Average rollout reward:          -30.805792359161497                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:37[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1125, 1126, 1992, 1996, 2000]                                                                                                                          │
│ Average cumulative reward:       -32.51747333464844                                                                                                                      │
│ Average rollout reward:          -30.805792359161497                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K2/79 [38;2;249;38;114m━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m2.5%[0m Elapsed: [33m0:00:05[0m Remaining: [36m0:02:37[0m   2.77 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1125, 1126, 1992, 1996, 2000]                                                                                                                          │
│ Average cumulative reward:       -32.51747333464844                                                                                                                      │
│ Average rollout reward:          -30.805792359161497                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:02:34[0m   2.02 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 2940, 2941, 3000]                                                                                                                                      │
│ Average cumulative reward:       -32.21999375005583                                                                                                                      │
│ Average rollout reward:          -30.326712258538784                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:06[0m Remaining: [36m0:02:34[0m   2.18 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 2940, 2941, 3000]                                                                                                                                      │
│ Average cumulative reward:       -32.21999375005583                                                                                                                      │
│ Average rollout reward:          -30.326712258538784                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:34[0m   2.35 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 2940, 2941, 3000]                                                                                                                                      │
│ Average cumulative reward:       -32.21999375005583                                                                                                                      │
│ Average rollout reward:          -30.326712258538784                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K3/79 [38;2;249;38;114m━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m3.8%[0m Elapsed: [33m0:00:07[0m Remaining: [36m0:02:34[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 2940, 2941, 3000]                                                                                                                                      │
│ Average cumulative reward:       -32.21999375005583                                                                                                                      │
│ Average rollout reward:          -30.326712258538784                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:31[0m   2.02 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 3987, 3988, 4000]                                                                                                                                      │
│ Average cumulative reward:       -31.82559122310051                                                                                                                      │
│ Average rollout reward:          -29.99072502829148                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:08[0m Remaining: [36m0:02:31[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 3987, 3988, 4000]                                                                                                                                      │
│ Average cumulative reward:       -31.82559122310051                                                                                                                      │
│ Average rollout reward:          -29.99072502829148                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:31[0m   2.27 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 3987, 3988, 4000]                                                                                                                                      │
│ Average cumulative reward:       -31.82559122310051                                                                                                                      │
│ Average rollout reward:          -29.99072502829148                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:31[0m   2.39 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 3987, 3988, 4000]                                                                                                                                      │
│ Average cumulative reward:       -31.82559122310051                                                                                                                      │
│ Average rollout reward:          -29.99072502829148                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K4/79 [38;2;249;38;114m━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m5.1%[0m Elapsed: [33m0:00:10[0m Remaining: [36m0:02:31[0m   2.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 3987, 3988, 4000]                                                                                                                                      │
│ Average cumulative reward:       -31.82559122310051                                                                                                                      │
│ Average rollout reward:          -29.99072502829148                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:10[0m Remaining: [36m0:02:31[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1349, 1350, 1353, 1356, 5000]                                                                                                                          │
│ Average cumulative reward:       -33.55957810425638                                                                                                                      │
│ Average rollout reward:          -31.723668779148902                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:02:31[0m   2.22 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1349, 1350, 1353, 1356, 5000]                                                                                                                          │
│ Average cumulative reward:       -33.55957810425638                                                                                                                      │
│ Average rollout reward:          -31.723668779148902                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:11[0m Remaining: [36m0:02:31[0m   2.32 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1349, 1350, 1353, 1356, 5000]                                                                                                                          │
│ Average cumulative reward:       -33.55957810425638                                                                                                                      │
│ Average rollout reward:          -31.723668779148902                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:12[0m Remaining: [36m0:02:31[0m   2.42 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 1349, 1350, 1353, 1356, 5000]                                                                                                                          │
│ Average cumulative reward:       -33.55957810425638                                                                                                                      │
│ Average rollout reward:          -31.723668779148902                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:12[0m Remaining: [36m0:02:30[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 5979, 5980, 5984, 6000]                                                                                                                                │
│ Average cumulative reward:       -33.66902320187215                                                                                                                      │
│ Average rollout reward:          -31.787216143637092                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:02:30[0m   2.18 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 5979, 5980, 5984, 6000]                                                                                                                                │
│ Average cumulative reward:       -33.66902320187215                                                                                                                      │
│ Average rollout reward:          -31.787216143637092                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:13[0m Remaining: [36m0:02:30[0m   2.27 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 5979, 5980, 5984, 6000]                                                                                                                                │
│ Average cumulative reward:       -33.66902320187215                                                                                                                      │
│ Average rollout reward:          -31.787216143637092                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K6/79 [38;2;249;38;114m━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:14[0m Remaining: [36m0:02:30[0m   2.35 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 5979, 5980, 5984, 6000]                                                                                                                                │
│ Average cumulative reward:       -33.66902320187215                                                                                                                      │
│ Average rollout reward:          -31.787216143637092                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:14[0m Remaining: [36m0:02:28[0m   2.09 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 6995, 6996, 7000]                                                                                                                                           │
│ Average cumulative reward:       -33.36735502528944                                                                                                                      │
│ Average rollout reward:          -31.43547771273316                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:02:28[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 6995, 6996, 7000]                                                                                                                                           │
│ Average cumulative reward:       -33.36735502528944                                                                                                                      │
│ Average rollout reward:          -31.43547771273316                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:15[0m Remaining: [36m0:02:28[0m   2.23 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 6995, 6996, 7000]                                                                                                                                           │
│ Average cumulative reward:       -33.36735502528944                                                                                                                      │
│ Average rollout reward:          -31.43547771273316                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K7/79 [38;2;249;38;114m━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m8.9%[0m Elapsed: [33m0:00:16[0m Remaining: [36m0:02:28[0m   2.30 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 6995, 6996, 7000]                                                                                                                                           │
│ Average cumulative reward:       -33.36735502528944                                                                                                                      │
│ Average rollout reward:          -31.43547771273316                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:16[0m Remaining: [36m0:02:26[0m   2.08 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 7905, 7908, 8000]                                                                                                        │
│ Average cumulative reward:       -32.378077732817374                                                                                                                     │
│ Average rollout reward:          -30.352318693409057                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K8/79 [38;2;249;38;114m━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m10.1%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:02:26[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 7905, 7908, 8000]                                                                                                        │
│ Average cumulative reward:       -32.378077732817374                                                                                                                     │
│ Average rollout reward:          -30.352318693409057                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
9/79 11.4% Elapsed: 0:00:18 Remaining: 0:02:25 2.07 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 1, 320, 7698, 7699, 8983, 8999, 9000]                                                                                                                          │
│ Average cumulative reward:       -33.972439890447156                                                                                                                     │
│ Average rollout reward:          -31.869489208960424                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
10/79 12.7% Elapsed: 0:00:20 Remaining: 0:02:23 2.07 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 6191, 6192, 6222, 6229, 10000]                                                                                                                         │
│ Average cumulative reward:       -32.869606849588905                                                                                                                     │
│ Average rollout reward:          -31.02326622149786                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
11/79 13.9% Elapsed: 0:00:22 Remaining: 0:02:21 2.06 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 10927, 11000]                                                                                                                            │
│ Average cumulative reward:       -33.84419592016878                                                                                                                      │
│ Average rollout reward:          -31.770452887036036                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
12/79 15.2% Elapsed: 0:00:25 Remaining: 0:02:20 2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 11992, 11998, 12000]                                                                                                                     │
│ Average cumulative reward:       -33.44398536185869                                                                                                                      │
│ Average rollout reward:          -31.279489992769065                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
13/79 16.5% Elapsed: 0:00:27 Remaining: 0:02:18 2.09 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 12802, 12884, 12889, 12909, 12980, 13000]                                                                                                  │
│ Average cumulative reward:       -32.31914301602715                                                                                                                      │
│ Average rollout reward:          -30.33197979765806                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:02:18[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 13968, 14000]                                                                                                                                               │
│ Average cumulative reward:       -33.429357886543926                                                                                                                     │
│ Average rollout reward:          -31.379787134483703                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
14/79  17.7%  Elapsed: 0:00:29  Remaining: 0:02:16  2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 12942, 14695, 15000]                                                                                                │
│ Average cumulative reward:       -33.50964680012252                                                                                                                      │
│ Average rollout reward:          -31.17407806210228                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:02:14[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 1256, 1414, 1415, 9749, 10226, 10354, 16000]                                                                                               │
│ Average cumulative reward:       -33.70669506607196                                                                                                                      │
│ Average rollout reward:          -31.439972927743273                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:02:13[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 1256, 1414, 1415, 9749, 10226, 10354, 16000]                                                                                               │
│ Average cumulative reward:       -33.70669506607196                                                                                                                      │
│ Average rollout reward:          -31.439972927743273                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:02:13[0m   2.17 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 1256, 1414, 1415, 9749, 10226, 10354, 16000]                                                                                               │
│ Average cumulative reward:       -33.70669506607196                                                                                                                      │
│ Average rollout reward:          -31.439972927743273                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:02:13[0m   2.21 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 1256, 1414, 1415, 9749, 10226, 10354, 16000]                                                                                               │
│ Average cumulative reward:       -33.70669506607196                                                                                                                      │
│ Average rollout reward:          -31.439972927743273                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
17/79 ━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.5% Elapsed: 0:00:35 Remaining: 0:02:10   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 13541, 15438, 15722, 17000]                                                                                                                │
│ Average cumulative reward:       -32.30612158802035                                                                                                                      │
│ Average rollout reward:          -30.207343427726922                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
18/79 ━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 22.8% Elapsed: 0:00:37 Remaining: 0:02:09   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 349, 474, 475, 10653, 18000]                                                                                                                 │
│ Average cumulative reward:       -32.57521072296485                                                                                                                      │
│ Average rollout reward:          -30.45974247789794                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
19/79 ━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.1% Elapsed: 0:00:39 Remaining: 0:02:07   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 18064, 18985, 19000]                                                                                                │
│ Average cumulative reward:       -33.585606234024844                                                                                                                     │
│ Average rollout reward:          -31.20413976287695                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
20/79 ━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 25.3% Elapsed: 0:00:41 Remaining: 0:02:05   2.09 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 2924, 19467, 19477, 19535, 19621, 19651, 20000]                                                                                              │
│ Average cumulative reward:       -32.3580189632657                                                                                                                       │
│ Average rollout reward:          -29.95986240734412                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
21/79 ━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.6% Elapsed: 0:00:43 Remaining: 0:02:03   2.09 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 20969, 20970, 20979, 20983, 20985, 21000]                                                                                                              │
│ Average cumulative reward:       -32.70883304034556                                                                                                                      │
│ Average rollout reward:          -30.390997517397313                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
21/79 ━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 26.6% Elapsed: 0:00:44 Remaining: 0:02:03   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 20969, 20970, 20979, 20983, 20985, 21000]                                                                                                              │
│ Average cumulative reward:       -32.70883304034556                                                                                                                      │
│ Average rollout reward:          -30.390997517397313                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:45[0m Remaining: [36m0:02:03[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 20969, 20970, 20979, 20983, 20985, 21000]                                                                                                              │
│ Average cumulative reward:       -32.70883304034556                                                                                                                      │
│ Average rollout reward:          -30.390997517397313                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:45[0m Remaining: [36m0:02:03[0m   2.18 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 20969, 20970, 20979, 20983, 20985, 21000]                                                                                                              │
│ Average cumulative reward:       -32.70883304034556                                                                                                                      │
│ Average rollout reward:          -30.390997517397313                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:46[0m Remaining: [36m0:02:01[0m   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 349, 21624, 21634, 21824, 21925, 22000]                                                                                                      │
│ Average cumulative reward:       -32.86010246612543                                                                                                                      │
│ Average rollout reward:          -30.400401222600827                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:46[0m Remaining: [36m0:02:01[0m   2.13 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:48[0m Remaining: [36m0:01:59[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 22916, 22917, 22999, 23000]                                                                                                                            │
│ Average cumulative reward:       -33.277438587358944                                                                                                                     │
│ Average rollout reward:          -30.845901861889924                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:48[0m Remaining: [36m0:01:59[0m   2.13 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:50[0m Remaining: [36m0:01:57[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 23249, 23793, 23798, 23833, 24000]                                                                                                         │
│ Average cumulative reward:       -33.065830493586844                                                                                                                     │
│ Average rollout reward:          -30.80849652794339                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:50[0m Remaining: [36m0:01:57[0m   2.12 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:52[0m Remaining: [36m0:01:55[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 2924, 20457, 20467, 25000]                                                                                                                   │
│ Average cumulative reward:       -33.35157226833325                                                                                                                      │
│ Average rollout reward:          -30.808420600940153                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:52[0m Remaining: [36m0:01:55[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 2924, 20457, 20467, 25000]                                                                                                                   │
│ Average cumulative reward:       -33.35157226833325                                                                                                                      │
│ Average rollout reward:          -30.808420600940153                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
26/79 ━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━ 32.9% Elapsed: 0:00:54 Remaining: 0:01:53   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 17181, 17186, 17598, 17608, 26000]                                                                                                │
│ Average cumulative reward:       -33.58379259035885                                                                                                                      │
│ Average rollout reward:          -31.034382141175477                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
27/79 ━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━ 34.2% Elapsed: 0:00:56 Remaining: 0:01:51   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 26995, 27000]                                                                                                                     │
│ Average cumulative reward:       -34.545129105959525                                                                                                                     │
│ Average rollout reward:          -32.26171688285704                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
28/79 ━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━ 35.4% Elapsed: 0:00:58 Remaining: 0:01:49   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 27889, 27932, 27933, 27938, 27996, 28000]                                                                                                                   │
│ Average cumulative reward:       -32.694376376910675                                                                                                                     │
│ Average rollout reward:          -30.565347437965073                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
29/79 ━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━ 36.7% Elapsed: 0:01:00 Remaining: 0:01:47   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10422, 10423, 10458, 10461, 10493, 29000]                                                                                                              │
│ Average cumulative reward:       -32.97832665201332                                                                                                                      │
│ Average rollout reward:          -30.669879301750427                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m38.0%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:44[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 18555, 29965, 30000]                                                                                                │
│ Average cumulative reward:       -32.249495591125594                                                                                                                     │
│ Average rollout reward:          -29.536313685100364                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K30/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m38.0%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:44[0m   2.12 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:01:42[0m   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 25359, 25389, 25394, 25952, 27862, 28856, 30521, 30889, 30899, 31000]                                                                      │
│ Average cumulative reward:       -32.78705524062231                                                                                                                      │
│ Average rollout reward:          -29.98625038953513                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K31/79 [38;2;249;38;114m━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m39.2%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:01:42[0m   2.13 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:01:40[0m   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 25359, 31658, 31663, 31896, 32000]                                                                                                         │
│ Average cumulative reward:       -33.24588909143803                                                                                                                      │
│ Average rollout reward:          -30.629718303159972                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K32/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m40.5%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:01:40[0m   2.13 s/iter
33/79 ━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━ 41.8% Elapsed: 0:01:09 Remaining: 0:01:38   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 27719, 28280, 28285, 31808, 31933, 31938, 31997, 33000]                                                                                    │
│ Average cumulative reward:       -33.3624707721278                                                                                                                       │
│ Average rollout reward:          -30.998500694506415                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K33/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━[0m [35m41.8%[0m Elapsed: [33m0:01:10[0m Remaining: [36m0:01:38[0m   2.12 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:01:36[0m   2.10 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7719, 33469, 33474, 33985, 34000]                                                                                                          │
│ Average cumulative reward:       -33.312738747756825                                                                                                                     │
│ Average rollout reward:          -30.677791614108354                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
35/79 ━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━ 44.3% Elapsed: 0:01:14 Remaining: 0:01:34   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 16152, 16157, 16226, 16231, 16251, 35000]                                                                                         │
│ Average cumulative reward:       -33.122213934747855                                                                                                                     │
│ Average rollout reward:          -30.421549592302824                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
36/79 ━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━ 45.6% Elapsed: 0:01:16 Remaining: 0:01:32   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 35945, 35946, 35980, 35982, 35984, 35999, 36000]                                                                                                       │
│ Average cumulative reward:       -33.58085732185911                                                                                                                      │
│ Average rollout reward:          -30.866632285535577                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
37/79 ━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━ 46.8% Elapsed: 0:01:20 Remaining: 0:01:30   2.17 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 12470, 12471, 34791, 37000]                                                                                                                            │
│ Average cumulative reward:       -33.50904275264073                                                                                                                      │
│ Average rollout reward:          -30.905389527942933                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:20[0m Remaining: [36m0:01:28[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 32453, 37672, 37677, 37827, 37980, 38000]                                                                                                  │
│ Average cumulative reward:       -33.831312159040294                                                                                                                     │
│ Average rollout reward:          -31.42481056073194                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:01:28[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 32453, 37672, 37677, 37827, 37980, 38000]                                                                                                  │
│ Average cumulative reward:       -33.831312159040294                                                                                                                     │
│ Average rollout reward:          -31.42481056073194                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:01:28[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 32453, 37672, 37677, 37827, 37980, 38000]                                                                                                  │
│ Average cumulative reward:       -33.831312159040294                                                                                                                     │
│ Average rollout reward:          -31.42481056073194                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K38/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m48.1%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:01:28[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 32453, 37672, 37677, 37827, 37980, 38000]                                                                                                  │
│ Average cumulative reward:       -33.831312159040294                                                                                                                     │
│ Average rollout reward:          -31.42481056073194                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:01:26[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 7677, 39000]                                                                                                                               │
│ Average cumulative reward:       -33.076938697699646                                                                                                                     │
│ Average rollout reward:          -30.760146305049933                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:23[0m Remaining: [36m0:01:26[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 7677, 39000]                                                                                                                               │
│ Average cumulative reward:       -33.076938697699646                                                                                                                     │
│ Average rollout reward:          -30.760146305049933                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:23[0m Remaining: [36m0:01:26[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 7677, 39000]                                                                                                                               │
│ Average cumulative reward:       -33.076938697699646                                                                                                                     │
│ Average rollout reward:          -30.760146305049933                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K39/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━[0m [35m49.4%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:01:26[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 7677, 39000]                                                                                                                               │
│ Average cumulative reward:       -33.076938697699646                                                                                                                     │
│ Average rollout reward:          -30.760146305049933                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:01:24[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 39654, 39925, 40000]                                                                                                     │
│ Average cumulative reward:       -33.05793485567439                                                                                                                      │
│ Average rollout reward:          -30.481403018733815                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:01:24[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 39654, 39925, 40000]                                                                                                     │
│ Average cumulative reward:       -33.05793485567439                                                                                                                      │
│ Average rollout reward:          -30.481403018733815                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:01:24[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 39654, 39925, 40000]                                                                                                     │
│ Average cumulative reward:       -33.05793485567439                                                                                                                      │
│ Average rollout reward:          -30.481403018733815                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:26[0m Remaining: [36m0:01:24[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7850, 7852, 39654, 39925, 40000]                                                                                                     │
│ Average cumulative reward:       -33.05793485567439                                                                                                                      │
│ Average rollout reward:          -30.481403018733815                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:26[0m Remaining: [36m0:01:22[0m   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 41000]                                                                                                                                                      │
│ Average cumulative reward:       -33.20433399959555                                                                                                                      │
│ Average rollout reward:          -30.40561540910513                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:01:22[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 41000]                                                                                                                                                      │
│ Average cumulative reward:       -33.20433399959555                                                                                                                      │
│ Average rollout reward:          -30.40561540910513                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:01:22[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 41000]                                                                                                                                                      │
│ Average cumulative reward:       -33.20433399959555                                                                                                                      │
│ Average rollout reward:          -30.40561540910513                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:01:22[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 41000]                                                                                                                                                      │
│ Average cumulative reward:       -33.20433399959555                                                                                                                      │
│ Average rollout reward:          -30.40561540910513                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:01:20[0m   2.11 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 1130, 2719, 2723, 14189, 14272, 14282, 25294, 40667, 40907, 42000]                                                                         │
│ Average cumulative reward:       -33.68363664952201                                                                                                                      │
│ Average rollout reward:          -31.472200417006178                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
43/79 ━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━ 54.4% Elapsed: 0:01:31 Remaining: 0:01:17   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 42845, 42895, 42900, 42935, 42945, 43000]                                                                                                  │
│ Average cumulative reward:       -33.16222224599866                                                                                                                      │
│ Average rollout reward:          -30.480334248924205                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
44/79 ━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━ 55.7% Elapsed: 0:01:33 Remaining: 0:01:15   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 37248, 43464, 43469, 43848, 43868, 43888, 44000]                                                                                             │
│ Average cumulative reward:       -33.0804139858359                                                                                                                       │
│ Average rollout reward:          -30.540391709337857                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
45/79 ━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━ 57.0% Elapsed: 0:01:35 Remaining: 0:01:14   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 29330, 41192, 41197, 42430, 42668, 42673, 44860, 44990, 45000]                                                      │
│ Average cumulative reward:       -33.91784712612287                                                                                                                      │
│ Average rollout reward:          -31.19632017771074                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:01:14[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 29330, 41192, 41197, 42430, 42668, 42673, 44860, 44990, 45000]                                                      │
│ Average cumulative reward:       -33.91784712612287                                                                                                                      │
│ Average rollout reward:          -31.19632017771074                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:01:14[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 29330, 41192, 41197, 42430, 42668, 42673, 44860, 44990, 45000]                                                      │
│ Average cumulative reward:       -33.91784712612287                                                                                                                      │
│ Average rollout reward:          -31.19632017771074                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:01:14[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 29330, 41192, 41197, 42430, 42668, 42673, 44860, 44990, 45000]                                                      │
│ Average cumulative reward:       -33.91784712612287                                                                                                                      │
│ Average rollout reward:          -31.19632017771074                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:01:14[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12423, 12614, 12615, 29330, 41192, 41197, 42430, 42668, 42673, 44860, 44990, 45000]                                                      │
│ Average cumulative reward:       -33.91784712612287                                                                                                                      │
│ Average rollout reward:          -31.19632017771074                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
46/79 ━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━ 58.2% Elapsed: 0:01:37 Remaining: 0:01:11   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 15003, 27133, 27143, 27184, 27218, 46000]                                                                                                  │
│ Average cumulative reward:       -33.8325535939556                                                                                                                       │
│ Average rollout reward:          -31.503243637725497                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:01:09[0m   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 37248, 43464, 43469, 43653, 44668, 44688, 44851, 45127, 47000]                                                                               │
│ Average cumulative reward:       -33.559202438331766                                                                                                                     │
│ Average rollout reward:          -30.672860902779433                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
48/79 ━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━ 60.8% Elapsed: 0:01:41 Remaining: 0:01:07   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 42845, 43569, 43574, 46400, 47935, 48000]                                                                                                  │
│ Average cumulative reward:       -33.959285885898176                                                                                                                     │
│ Average rollout reward:          -31.16891233779459                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
49/79 ━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━ 62.0% Elapsed: 0:01:43 Remaining: 0:01:05   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 42845, 48494, 48499, 48995, 49000]                                                                                                         │
│ Average cumulative reward:       -33.26915095197864                                                                                                                      │
│ Average rollout reward:          -30.784414581293397                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
50/79 ━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━ 63.3% Elapsed: 0:01:46 Remaining: 0:01:03   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 7709, 7889, 7890, 7897, 7901, 8315, 50000]                                                                                                 │
│ Average cumulative reward:       -33.27535421732282                                                                                                                      │
│ Average rollout reward:          -30.50480014488795                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
51/79 ━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━ 64.6% Elapsed: 0:01:48 Remaining: 0:01:01   2.12 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 8656, 15413, 15674, 50684, 51000]                                                                                                          │
│ Average cumulative reward:       -33.50448180013336                                                                                                                      │
│ Average rollout reward:          -30.9738476963205                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
52/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━ 65.8% Elapsed: 0:01:50 Remaining: 0:00:59   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 32453, 32468, 32473, 36504, 39364, 39369, 51103, 52000]                                                                                    │
│ Average cumulative reward:       -33.93176468938234                                                                                                                      │
│ Average rollout reward:          -31.220669043953812                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
53/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━ 67.1% Elapsed: 0:01:52 Remaining: 0:00:57   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 343, 344, 37248, 37488, 37493, 49329, 49429, 49434, 52680, 53000]                                                                                      │
│ Average cumulative reward:       -33.47382618740703                                                                                                                      │
│ Average rollout reward:          -30.771075584365313                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
54/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━ 68.4% Elapsed: 0:01:54 Remaining: 0:00:55   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 34833, 34913, 34918, 36553, 49599, 49624, 49874, 53502, 54000]                                                                             │
│ Average cumulative reward:       -34.63901440258083                                                                                                                      │
│ Average rollout reward:          -31.914358063273653                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
55/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━ 69.6% Elapsed: 0:01:56 Remaining: 0:00:53   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12227, 53084, 53089, 54222, 54729, 54734, 55000]                                                                                         │
│ Average cumulative reward:       -33.67735760249788                                                                                                                      │
│ Average rollout reward:          -31.029892905269016                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
56/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━ 70.9% Elapsed: 0:01:59 Remaining: 0:00:50   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 54998, 55003, 56000]                                                                                                                │
│ Average cumulative reward:       -33.070993185669664                                                                                                                     │
│ Average rollout reward:          -30.592278775229104                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
57/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━ 72.2% Elapsed: 0:02:01 Remaining: 0:00:48   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 55247, 55252, 55890, 56484, 57000]                                                                                                  │
│ Average cumulative reward:       -33.458832205786635                                                                                                                     │
│ Average rollout reward:          -30.99078591987509                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
58/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━ 73.4% Elapsed: 0:02:03 Remaining: 0:00:46   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 55247, 55252, 55272, 58000]                                                                                                         │
│ Average cumulative reward:       -33.74347528063444                                                                                                                      │
│ Average rollout reward:          -30.87382382985579                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
58/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━ 73.4% Elapsed: 0:02:04 Remaining: 0:00:46   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 55247, 55252, 55272, 58000]                                                                                                         │
│ Average cumulative reward:       -33.74347528063444                                                                                                                      │
│ Average rollout reward:          -30.87382382985579                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:02:04[0m Remaining: [36m0:00:46[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 55247, 55252, 55272, 58000]                                                                                                         │
│ Average cumulative reward:       -33.74347528063444                                                                                                                      │
│ Average rollout reward:          -30.87382382985579                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:46[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 55247, 55252, 55272, 58000]                                                                                                         │
│ Average cumulative reward:       -33.74347528063444                                                                                                                      │
│ Average rollout reward:          -30.87382382985579                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:02:05[0m Remaining: [36m0:00:44[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 59000]                                                                                                                                   │
│ Average cumulative reward:       -33.8306159830475                                                                                                                       │
│ Average rollout reward:          -31.418898733247172                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:02:06[0m Remaining: [36m0:00:44[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 59000]                                                                                                                                   │
│ Average cumulative reward:       -33.8306159830475                                                                                                                       │
│ Average rollout reward:          -31.418898733247172                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:02:06[0m Remaining: [36m0:00:44[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 59000]                                                                                                                                   │
│ Average cumulative reward:       -33.8306159830475                                                                                                                       │
│ Average rollout reward:          -31.418898733247172                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:44[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 59000]                                                                                                                                   │
│ Average cumulative reward:       -33.8306159830475                                                                                                                       │
│ Average rollout reward:          -31.418898733247172                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:02:07[0m Remaining: [36m0:00:42[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 59815, 59816, 59999, 60000]                                                                                                       │
│ Average cumulative reward:       -33.30219080882936                                                                                                                      │
│ Average rollout reward:          -30.959320158720626                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:02:08[0m Remaining: [36m0:00:42[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 59815, 59816, 59999, 60000]                                                                                                       │
│ Average cumulative reward:       -33.30219080882936                                                                                                                      │
│ Average rollout reward:          -30.959320158720626                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:42[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 59815, 59816, 59999, 60000]                                                                                                       │
│ Average cumulative reward:       -33.30219080882936                                                                                                                      │
│ Average rollout reward:          -30.959320158720626                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:02:09[0m Remaining: [36m0:00:42[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 59815, 59816, 59999, 60000]                                                                                                       │
│ Average cumulative reward:       -33.30219080882936                                                                                                                      │
│ Average rollout reward:          -30.959320158720626                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:02:10[0m Remaining: [36m0:00:40[0m   2.13 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60990, 60991, 60994, 61000]                                                                                  │
│ Average cumulative reward:       -33.062055036646406                                                                                                                     │
│ Average rollout reward:          -30.33948495793596                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:02:10[0m Remaining: [36m0:00:40[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60990, 60991, 60994, 61000]                                                                                  │
│ Average cumulative reward:       -33.062055036646406                                                                                                                     │
│ Average rollout reward:          -30.33948495793596                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:40[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60990, 60991, 60994, 61000]                                                                                  │
│ Average cumulative reward:       -33.062055036646406                                                                                                                     │
│ Average rollout reward:          -30.33948495793596                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:02:11[0m Remaining: [36m0:00:40[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60990, 60991, 60994, 61000]                                                                                  │
│ Average cumulative reward:       -33.062055036646406                                                                                                                     │
│ Average rollout reward:          -30.33948495793596                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:02:12[0m Remaining: [36m0:00:40[0m   2.16 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60990, 60991, 60994, 61000]                                                                                  │
│ Average cumulative reward:       -33.062055036646406                                                                                                                     │
│ Average rollout reward:          -30.33948495793596                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K62/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:02:12[0m Remaining: [36m0:00:38[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 5354, 5364, 5365, 5375, 5381, 62000]                                                                                                                        │
│ Average cumulative reward:       -35.06269877413431                                                                                                                      │
│ Average rollout reward:          -31.93643726995983                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
63/79 79.7% Elapsed: 0:02:14 Remaining: 0:00:36 2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60620, 60621, 62590, 62914, 62915, 63000]                                                                    │
│ Average cumulative reward:       -34.91388982816366                                                                                                                      │
│ Average rollout reward:          -31.71356607280605                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
64/79 81.0% Elapsed: 0:02:17 Remaining: 0:00:33 2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 63039, 63040, 63133, 63139, 63140, 64000]                                                                    │
│ Average cumulative reward:       -34.91923212107676                                                                                                                      │
│ Average rollout reward:          -31.292268073691268                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
65/79 82.3% Elapsed: 0:02:19 Remaining: 0:00:31 2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 64806, 64807, 64927, 64928, 64938, 65000]                                                                    │
│ Average cumulative reward:       -34.63900797316453                                                                                                                      │
│ Average rollout reward:          -30.48839251600417                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
66/79 83.5% Elapsed: 0:02:21 Remaining: 0:00:29 2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 64806, 64807, 65801, 65998, 66000]                                                                           │
│ Average cumulative reward:       -34.94199699270583                                                                                                                      │
│ Average rollout reward:          -31.2910409484582                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:02:22[0m Remaining: [36m0:00:29[0m   2.15 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:23[0m Remaining: [36m0:00:27[0m   2.14 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 66549, 66550, 66692, 66695, 67000]                                                                           │
│ Average cumulative reward:       -34.90918431559539                                                                                                                      │
│ Average rollout reward:          -31.391748949537416                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:02:24[0m Remaining: [36m0:00:27[0m   2.15 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:25[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 64806, 64807, 67111, 67542, 67543, 67987, 67991, 68000]                                                      │
│ Average cumulative reward:       -33.55433068124152                                                                                                                      │
│ Average rollout reward:          -30.244478244556397                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:02:26[0m Remaining: [36m0:00:25[0m   2.16 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:23[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 54842, 56541, 56546, 56561, 57051, 57624, 69000]                                                                                           │
│ Average cumulative reward:       -33.64724867144808                                                                                                                      │
│ Average rollout reward:          -30.177797299114385                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:02:28[0m Remaining: [36m0:00:23[0m   2.15 s/iter
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯7m━━━━[0m [35m88.6%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:20[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 58071, 69890, 69896, 69920, 69952, 70000]                                                                                                  │
│ Average cumulative reward:       -32.91916098032679                                                                                                                      │
│ Average rollout reward:          -30.23831611355065                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:02:30[0m Remaining: [36m0:00:20[0m   2.15 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:32[0m Remaining: [36m0:00:18[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 70396, 70958, 70964, 70988, 71000]                                                                                                         │
│ Average cumulative reward:       -33.45124593077002                                                                                                                      │
│ Average rollout reward:          -30.800651842646285                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:02:33[0m Remaining: [36m0:00:18[0m   2.16 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:34[0m Remaining: [36m0:00:16[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 7698, 7699, 58071, 71712, 71718, 71994, 72000]                                                                                                         │
│ Average cumulative reward:       -32.99291653851972                                                                                                                      │
│ Average rollout reward:          -29.921940940158258                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:02:35[0m Remaining: [36m0:00:16[0m   2.16 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:36[0m Remaining: [36m0:00:14[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 10727, 10728, 12227, 72838, 72844, 73000]                                                                                                              │
│ Average cumulative reward:       -34.54559178323207                                                                                                                      │
│ Average rollout reward:          -31.899170216301606                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:02:37[0m Remaining: [36m0:00:14[0m   2.15 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:02:39[0m Remaining: [36m0:00:12[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 64806, 64807, 73501, 73573, 73585, 73928, 74000]                                                             │
│ Average cumulative reward:       -34.92239492039496                                                                                                                      │
│ Average rollout reward:          -32.06943355438986                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:02:41[0m Remaining: [36m0:00:09[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 58772, 58773, 58860, 58907, 58909, 58930, 60620, 60621, 73495, 74963, 75000]                                                                           │
│ Average cumulative reward:       -32.54322739267404                                                                                                                      │
│ Average rollout reward:          -29.922884083202582                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:02:43[0m Remaining: [36m0:00:07[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 1125, 1126, 70396, 75946, 75952, 76000]                                                                                                                │
│ Average cumulative reward:       -34.253946025691874                                                                                                                     │
│ Average rollout reward:          -31.25390328761856                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:02:45[0m Remaining: [36m0:00:05[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 77000 ===                                                                                                                                                  │
│ 77001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 76970, 76982, 77000]                                                                                                                       │
│ Average cumulative reward:       -33.37826576206035                                                                                                                      │
│ Average rollout reward:          -30.823639661385407                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:47[0m Remaining: [36m0:00:03[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 76970, 77397, 77403, 77439, 78000]                                                                                                         │
│ Average cumulative reward:       -35.31350133320566                                                                                                                      │
│ Average rollout reward:          -32.388563976614115                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:49[0m Remaining: [36m0:00:03[0m   2.17 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 76970, 77397, 77403, 77439, 78000]                                                                                                         │
│ Average cumulative reward:       -35.31350133320566                                                                                                                      │
│ Average rollout reward:          -32.388563976614115                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K78/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m [35m98.7%[0m Elapsed: [33m0:02:49[0m Remaining: [36m0:00:03[0m   2.18 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 76970, 77397, 77403, 77439, 78000]                                                                                                         │
│ Average cumulative reward:       -35.31350133320566                                                                                                                      │
│ Average rollout reward:          -32.388563976614115                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K79/79 [38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m100.0%[0m Elapsed: [33m0:02:49[0m Remaining: [36m0:00:00[0m   2.15 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 1, 320, 5567, 5568, 76970, 77397, 77403, 77439, 78000]                                                                                                         │
│ Average cumulative reward:       -35.31350133320566                                                                                                                      │
│ Average rollout reward:          -32.388563976614115                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -3.1293908950696454                                                                                                                             │
│ Best path: [0, 1, 320, 343]                                                                                                                                              │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Node 0 is not terminal. Continue.
Node 1 is not terminal. Continue.
Node 320 is not terminal. Continue.
Node 343 is not terminal. Continue.
Node 344 is not terminal. Continue.
Node 349 is not terminal. Continue.
Node 565 is not terminal. Continue.
Node 566 is not terminal. Continue.
Node 19332 is not terminal. Continue.
Node 19726 is not terminal. Continue.
Node 19736 is not terminal. Continue.
Node 29381 is not terminal. Continue.
Node 29401 is not terminal. Continue.
Node 29441 is not terminal. Continue.
Node 31004 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 1 is not terminal. Continue.
Node 78999 is not terminal. Continue.
Node 79000 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 1 is not terminal. Continue.
Node 320 is not terminal. Continue.
Node 343 is not terminal. Continue.
Node 344 is not terminal. Continue.
Node 349 is not terminal. Continue.
Node 565 is not terminal. Continue.
Node 566 is not terminal. Continue.
Node 19332 is not terminal. Continue.
Node 19726 is not terminal. Continue.
Node 19736 is not terminal. Continue.
Node 29381 is not terminal. Continue.
Node 29401 is not terminal. Continue.
Node 29441 is not terminal. Continue.
Node 72792 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -48.35795589335645
proot_iannazzo [0.276881  4.1856103]
proot_iannazzo [2.7392838 0.31385  ]
proot_iannazzo [6.915397 2.683457]
proot_iannazzo [0.6212663 6.123971 ]
By Value: estimated reward: -5.215651491782742
By Best Value: estimated reward: 0
proot_iannazzo [0.276881  4.1856103]
proot_iannazzo [2, 1]
proot_iannazzo [2, 1]
Best value of root node:
-3.1293908950696454
Best root policy:
proot_iannazzo [0.276881  4.1856103]
proot_iannazzo [2, 1]
proot_iannazzo [2, 1]
=== END ===
Finished making algorithm
