Matrix distribution: quartic_saddle
Matrix distribution config: {'c': 0.25, 'd': 5000, 'eps': 0.001}
Initial matrix shape: torch.Size([5000, 5000])
Algorithm name: mcts
Algorithm config: {'c_ucb': 5.0, 'alpha_pw': 0.4, 'epsilon': 1e-06, 'EXPLORE_K': 5, 'early_termination_epsilon': 1e-05, 'budget': 80000, 'print_every': 1000, 'max_termination_count': 10, 'tree_initial_capacity': 10000, 'device': 'cuda', 'actions': [['sign_ns', [[0, 0], [5, 5]]], ['sign_newton', [[0], [40]]], ['sign_quintic', [[0, 0, 0], [5, 5, 5]]], ['sign_halley', [[0, 0, 0], [40, 40, 40]]]], 'initialize_with_baselines': True}
Actions: ['sign_halley', 'sign_newton', 'sign_ns', 'sign_quintic']
Action sign_halley took 1.0 times longer than sign_halley
Action sign_newton took 0.4001862568461517 times longer than sign_halley
Action sign_ns took 0.17346748376456303 times longer than sign_halley
Action sign_quintic took 0.2545852417877044 times longer than sign_halley
Skipping sign_newton_variant because not all actions are in the tree
Skipping inv_ns because not all actions are in the tree
Skipping inv_ns_chebyshev because not all actions are in the tree
Skipping sqrt_db because not all actions are in the tree
Skipping sqrt_nsv because not all actions are in the tree
Skipping sqrt_visser because not all actions are in the tree
Skipping sqrt_newton because not all actions are in the tree
Skipping sqrt_visser_coupled because not all actions are in the tree
Skipping sqrt_newton_coupled because not all actions are in the tree
Skipping proot_newton because not all actions are in the tree
Skipping proot_visser because not all actions are in the tree
Skipping proot_iannazzo because not all actions are in the tree
0/79 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0% Elapsed: 0:00:00 Remaining: -:--:-- 501739.87 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 0 ===                                                                                                                                                      │
│ 1  nodes in tree                                                                                                                                                         │
│ [-21. -21.]                                                                                                                                                              │
│ [-2.82873103 -2.82873103]                                                                                                                                                │
│ [-2.8013038 -2.8013038 -2.8013038]                                                                                                                                       │
│ [-2.60201226 -2.60201226 -2.60201226]                                                                                                                                    │
│ [-2.57458502 -2.57458502 -2.57458502]                                                                                                                                    │
│ [-2.40111754 -2.40111754 -2.40111754]                                                                                                                                    │
│ [-2.18811238 -2.18811238 -2.18811238 -1.78792613]                                                                                                                        │
│ [-2.0146449  -2.0146449  -2.0146449  -1.61445864]                                                                        │
1/79 ╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3% Elapsed: 0:00:02 Remaining: -:--:--   2.01 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 1000 ===                                                                                                                                                   │
│ 1001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 288, 290, 301, 1000]                                                                                                                                        │
│ Average cumulative reward:       -6.967210287478496                                                                                                                      │
│ Average rollout reward:          -6.575060880747034                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
2/79 ━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5% Elapsed: 0:00:03 Remaining: 0:02:11   1.76 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 2000 ===                                                                                                                                                   │
│ 2001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 465, 467, 1859, 1911, 2000]                                                                                                                                 │
│ Average cumulative reward:       -7.550343845792796                                                                                                                      │
│ Average rollout reward:          -7.035084715378789                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
3/79 ━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.8% Elapsed: 0:00:05 Remaining: 0:02:08   1.68 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 3000 ===                                                                                                                                                   │
│ 3001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 2954, 2955, 2969, 2986, 3000]                                                                                                                               │
│ Average cumulative reward:       -7.959199625115052                                                                                                                      │
│ Average rollout reward:          -7.432652997814491                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
4/79 ━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.1% Elapsed: 0:00:07 Remaining: 0:02:07   1.76 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 4000 ===                                                                                                                                                   │
│ 4001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 3926, 3928, 3998, 4000]                                                                                                                                     │
│ Average cumulative reward:       -7.601459635538885                                                                                                                      │
│ Average rollout reward:          -7.021447803688864                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
5/79 ━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3% Elapsed: 0:00:08 Remaining: 0:02:06   1.71 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 56, 59, 4725, 4949, 5000]                                                                                                                                   │
│ Average cumulative reward:       -7.443880321266852                                                                                                                      │
│ Average rollout reward:          -6.8529432442875375                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:06[0m   1.81 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 56, 59, 4725, 4949, 5000]                                                                                                                                   │
│ Average cumulative reward:       -7.443880321266852                                                                                                                      │
│ Average rollout reward:          -6.8529432442875375                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K5/79 [38;2;249;38;114m━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m6.3%[0m Elapsed: [33m0:00:09[0m Remaining: [36m0:02:06[0m   1.91 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 5000 ===                                                                                                                                                   │
│ 5001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 56, 59, 4725, 4949, 5000]                                                                                                                                   │
│ Average cumulative reward:       -7.443880321266852                                                                                                                      │
│ Average rollout reward:          -6.8529432442875375                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━[0m [35m7.6%[0m Elapsed: [33m0:00:10[0m Remaining: [36m0:02:04[0m   1.68 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 6000 ===                                                                                                                                                   │
│ 6001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 5927, 5928, 5994, 5998, 6000]                                                                                                                               │
│ Average cumulative reward:       -7.384396741651995                                                                                                                      │
│ Average rollout reward:          -6.7596067930565855                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
7/79 8.9% Elapsed: 0:00:12 Remaining: 0:02:03 1.73 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 7000 ===                                                                                                                                                   │
│ 7001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 218, 221, 228, 232, 4487, 7000]                                                                                                                             │
│ Average cumulative reward:       -7.600395071738114                                                                                                                      │
│ Average rollout reward:          -6.999240333180401                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
8/79 10.1% Elapsed: 0:00:13 Remaining: 0:02:02 1.70 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 8000 ===                                                                                                                                                   │
│ 8001  nodes in tree                                                                                                                                                      │
│ Path: [0, 2, 9, 27, 2390, 3040, 8000]                                                                                                                                    │
│ Average cumulative reward:       -7.565580681055834                                                                                                                      │
│ Average rollout reward:          -6.9229267879805585                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
9/79 11.4% Elapsed: 0:00:15 Remaining: 0:01:59 1.68 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 9000 ===                                                                                                                                                   │
│ 9001  nodes in tree                                                                                                                                                      │
│ Path: [0, 3, 936, 950, 3730, 9000]                                                                                                                                       │
│ Average cumulative reward:       -7.953972109597171                                                                                                                      │
│ Average rollout reward:          -7.533900135725737                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
10/79 12.7% Elapsed: 0:00:16 Remaining: 0:01:54 1.66 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9935, 9936, 9954, 10000]                                                                                                                                    │
│ Average cumulative reward:       -7.062512125660831                                                                                                                      │
│ Average rollout reward:          -6.899428470038395                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K10/79 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m12.7%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:01:54[0m   1.71 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9935, 9936, 9954, 10000]                                                                                                                                    │
│ Average cumulative reward:       -7.062512125660831                                                                                                                      │
│ Average rollout reward:          -6.899428470038395                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K10/79 [38;2;249;38;114m━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m12.7%[0m Elapsed: [33m0:00:17[0m Remaining: [36m0:01:54[0m   1.76 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 10000 ===                                                                                                                                                  │
│ 10001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 9935, 9936, 9954, 10000]                                                                                                                                    │
│ Average cumulative reward:       -7.062512125660831                                                                                                                      │
│ Average rollout reward:          -6.899428470038395                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:01:50[0m   1.65 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10058, 10273, 10276, 10710, 11000]                                                                                                                          │
│ Average cumulative reward:       -6.4552917992664325                                                                                                                     │
│ Average rollout reward:          -6.288886437394762                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K11/79 [38;2;249;38;114m━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m13.9%[0m Elapsed: [33m0:00:18[0m Remaining: [36m0:01:50[0m   1.69 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 11000 ===                                                                                                                                                  │
│ 11001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10058, 10273, 10276, 10710, 11000]                                                                                                                          │
│ Average cumulative reward:       -6.4552917992664325                                                                                                                     │
│ Average rollout reward:          -6.288886437394762                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:19[0m Remaining: [36m0:01:47[0m   1.60 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6872, 11824, 12000]                                                                                                                                         │
│ Average cumulative reward:       -6.844440818121843                                                                                                                      │
│ Average rollout reward:          -6.669990590265806                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K12/79 [38;2;249;38;114m━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:19[0m Remaining: [36m0:01:47[0m   1.64 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6872, 11824, 12000]                                                                                                                                         │
│ Average cumulative reward:       -6.844440818121843                                                                                                                      │
│ Average rollout reward:          -6.669990590265806                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m15.2%[0m Elapsed: [33m0:00:20[0m Remaining: [36m0:01:47[0m   1.68 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 12000 ===                                                                                                                                                  │
│ 12001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 6872, 11824, 12000]                                                                                                                                         │
│ Average cumulative reward:       -6.844440818121843                                                                                                                      │
│ Average rollout reward:          -6.669990590265806                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:20[0m Remaining: [36m0:01:44[0m   1.59 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10886, 12215, 12432, 13000]                                                                                                                                 │
│ Average cumulative reward:       -6.584581201288338                                                                                                                      │
│ Average rollout reward:          -6.4029017967513635                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:21[0m Remaining: [36m0:01:44[0m   1.63 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10886, 12215, 12432, 13000]                                                                                                                                 │
│ Average cumulative reward:       -6.584581201288338                                                                                                                      │
│ Average rollout reward:          -6.4029017967513635                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K13/79 [38;2;249;38;114m━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m16.5%[0m Elapsed: [33m0:00:21[0m Remaining: [36m0:01:44[0m   1.67 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 13000 ===                                                                                                                                                  │
│ 13001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10886, 12215, 12432, 13000]                                                                                                                                 │
│ Average cumulative reward:       -6.584581201288338                                                                                                                      │
│ Average rollout reward:          -6.4029017967513635                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:01:41[0m   1.58 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13936, 13939, 13967, 14000]                                                                                                                                 │
│ Average cumulative reward:       -6.746198465211138                                                                                                                      │
│ Average rollout reward:          -6.571454553676905                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K14/79 [38;2;249;38;114m━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m17.7%[0m Elapsed: [33m0:00:22[0m Remaining: [36m0:01:41[0m   1.62 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 14000 ===                                                                                                                                                  │
│ 14001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13936, 13939, 13967, 14000]                                                                                                                                 │
│ Average cumulative reward:       -6.746198465211138                                                                                                                      │
│ Average rollout reward:          -6.571454553676905                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:23[0m Remaining: [36m0:01:38[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 14906, 14995, 14996, 15000]                                                                                                                                 │
│ Average cumulative reward:       -7.013189827489786                                                                                                                      │
│ Average rollout reward:          -6.81224662816095                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:23[0m Remaining: [36m0:01:38[0m   1.58 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 14906, 14995, 14996, 15000]                                                                                                                                 │
│ Average cumulative reward:       -7.013189827489786                                                                                                                      │
│ Average rollout reward:          -6.81224662816095                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K15/79 [38;2;249;38;114m━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m19.0%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:01:38[0m   1.61 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 15000 ===                                                                                                                                                  │
│ 15001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 14906, 14995, 14996, 15000]                                                                                                                                 │
│ Average cumulative reward:       -7.013189827489786                                                                                                                      │
│ Average rollout reward:          -6.81224662816095                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:24[0m Remaining: [36m0:01:36[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8501, 8503, 8507, 16000]                                                                                                                        │
│ Average cumulative reward:       -6.527166418197224                                                                                                                      │
│ Average rollout reward:          -6.277551345307121                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:25[0m Remaining: [36m0:01:36[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8501, 8503, 8507, 16000]                                                                                                                        │
│ Average cumulative reward:       -6.527166418197224                                                                                                                      │
│ Average rollout reward:          -6.277551345307121                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K16/79 [38;2;249;38;114m━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m20.3%[0m Elapsed: [33m0:00:25[0m Remaining: [36m0:01:36[0m   1.61 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 16000 ===                                                                                                                                                  │
│ 16001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8501, 8503, 8507, 16000]                                                                                                                        │
│ Average cumulative reward:       -6.527166418197224                                                                                                                      │
│ Average rollout reward:          -6.277551345307121                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:26[0m Remaining: [36m0:01:34[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16658, 16660, 16928, 16938, 17000]                                                                                                                          │
│ Average cumulative reward:       -7.062940539403871                                                                                                                      │
│ Average rollout reward:          -6.6016598883452255                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:26[0m Remaining: [36m0:01:34[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16658, 16660, 16928, 16938, 17000]                                                                                                                          │
│ Average cumulative reward:       -7.062940539403871                                                                                                                      │
│ Average rollout reward:          -6.6016598883452255                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K17/79 [38;2;249;38;114m━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m21.5%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:34[0m   1.60 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 17000 ===                                                                                                                                                  │
│ 17001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16658, 16660, 16928, 16938, 17000]                                                                                                                          │
│ Average cumulative reward:       -7.062940539403871                                                                                                                      │
│ Average rollout reward:          -6.6016598883452255                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:27[0m Remaining: [36m0:01:33[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1682, 1684, 1707, 1714, 1752, 18000]                                                                                                                        │
│ Average cumulative reward:       -7.150632741116906                                                                                                                      │
│ Average rollout reward:          -6.697712818193557                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:33[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1682, 1684, 1707, 1714, 1752, 18000]                                                                                                                        │
│ Average cumulative reward:       -7.150632741116906                                                                                                                      │
│ Average rollout reward:          -6.697712818193557                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K18/79 [38;2;249;38;114m━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m22.8%[0m Elapsed: [33m0:00:28[0m Remaining: [36m0:01:33[0m   1.60 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 18000 ===                                                                                                                                                  │
│ 18001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 1682, 1684, 1707, 1714, 1752, 18000]                                                                                                                        │
│ Average cumulative reward:       -7.150632741116906                                                                                                                      │
│ Average rollout reward:          -6.697712818193557                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:31[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19000]                                                                                                                                                      │
│ Average cumulative reward:       -7.250056176800177                                                                                                                      │
│ Average rollout reward:          -6.8020454307747125                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:29[0m Remaining: [36m0:01:31[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19000]                                                                                                                                                      │
│ Average cumulative reward:       -7.250056176800177                                                                                                                      │
│ Average rollout reward:          -6.8020454307747125                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K19/79 [38;2;249;38;114m━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m24.1%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:31[0m   1.59 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 19000 ===                                                                                                                                                  │
│ 19001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 19000]                                                                                                                                                      │
│ Average cumulative reward:       -7.250056176800177                                                                                                                      │
│ Average rollout reward:          -6.8020454307747125                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:30[0m Remaining: [36m0:01:30[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 465, 467, 476, 481, 20000]                                                                                                                                  │
│ Average cumulative reward:       -7.083065893083822                                                                                                                      │
│ Average rollout reward:          -6.623676464382889                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:30[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 465, 467, 476, 481, 20000]                                                                                                                                  │
│ Average cumulative reward:       -7.083065893083822                                                                                                                      │
│ Average rollout reward:          -6.623676464382889                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K20/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m25.3%[0m Elapsed: [33m0:00:31[0m Remaining: [36m0:01:30[0m   1.59 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 20000 ===                                                                                                                                                  │
│ 20001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 465, 467, 476, 481, 20000]                                                                                                                                  │
│ Average cumulative reward:       -7.083065893083822                                                                                                                      │
│ Average rollout reward:          -6.623676464382889                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:28[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 20880, 20884, 20960, 21000]                                                                                                                                 │
│ Average cumulative reward:       -6.7490523215576275                                                                                                                     │
│ Average rollout reward:          -6.291265241058024                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:32[0m Remaining: [36m0:01:28[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 20880, 20884, 20960, 21000]                                                                                                                                 │
│ Average cumulative reward:       -6.7490523215576275                                                                                                                     │
│ Average rollout reward:          -6.291265241058024                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K21/79 [38;2;249;38;114m━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m26.6%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:28[0m   1.58 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 21000 ===                                                                                                                                                  │
│ 21001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 20880, 20884, 20960, 21000]                                                                                                                                 │
│ Average cumulative reward:       -6.7490523215576275                                                                                                                     │
│ Average rollout reward:          -6.291265241058024                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:33[0m Remaining: [36m0:01:26[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8476, 8482, 8512, 22000]                                                                                                                        │
│ Average cumulative reward:       -7.124615112349891                                                                                                                      │
│ Average rollout reward:          -6.651902527326452                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:26[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8476, 8482, 8512, 22000]                                                                                                                        │
│ Average cumulative reward:       -7.124615112349891                                                                                                                      │
│ Average rollout reward:          -6.651902527326452                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K22/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m27.8%[0m Elapsed: [33m0:00:34[0m Remaining: [36m0:01:26[0m   1.58 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 22000 ===                                                                                                                                                  │
│ 22001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 8440, 8442, 8476, 8482, 8512, 22000]                                                                                                                        │
│ Average cumulative reward:       -7.124615112349891                                                                                                                      │
│ Average rollout reward:          -6.651902527326452                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:24[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 22868, 22870, 22990, 22996, 23000]                                                                                                                          │
│ Average cumulative reward:       -7.044868112482173                                                                                                                      │
│ Average rollout reward:          -6.577327315218241                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:35[0m Remaining: [36m0:01:24[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 22868, 22870, 22990, 22996, 23000]                                                                                                                          │
│ Average cumulative reward:       -7.044868112482173                                                                                                                      │
│ Average rollout reward:          -6.577327315218241                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K23/79 [38;2;249;38;114m━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m29.1%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:24[0m   1.58 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 23000 ===                                                                                                                                                  │
│ 23001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 22868, 22870, 22990, 22996, 23000]                                                                                                                          │
│ Average cumulative reward:       -7.044868112482173                                                                                                                      │
│ Average rollout reward:          -6.577327315218241                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:36[0m Remaining: [36m0:01:22[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 23554, 23556, 23958, 23982, 24000]                                                                                                                          │
│ Average cumulative reward:       -7.070908133636386                                                                                                                      │
│ Average rollout reward:          -6.64750964399391                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:22[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 23554, 23556, 23958, 23982, 24000]                                                                                                                          │
│ Average cumulative reward:       -7.070908133636386                                                                                                                      │
│ Average rollout reward:          -6.64750964399391                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K24/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m30.4%[0m Elapsed: [33m0:00:37[0m Remaining: [36m0:01:22[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 24000 ===                                                                                                                                                  │
│ 24001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 23554, 23556, 23958, 23982, 24000]                                                                                                                          │
│ Average cumulative reward:       -7.070908133636386                                                                                                                      │
│ Average rollout reward:          -6.64750964399391                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:20[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 669, 670, 24600, 24794, 25000]                                                                                                                              │
│ Average cumulative reward:       -7.641091967947017                                                                                                                      │
│ Average rollout reward:          -7.2022185409238615                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:38[0m Remaining: [36m0:01:20[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 669, 670, 24600, 24794, 25000]                                                                                                                              │
│ Average cumulative reward:       -7.641091967947017                                                                                                                      │
│ Average rollout reward:          -7.2022185409238615                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K25/79 [38;2;249;38;114m━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m31.6%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:20[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 25000 ===                                                                                                                                                  │
│ 25001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 669, 670, 24600, 24794, 25000]                                                                                                                              │
│ Average cumulative reward:       -7.641091967947017                                                                                                                      │
│ Average rollout reward:          -7.2022185409238615                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:39[0m Remaining: [36m0:01:19[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 6625, 6627, 6664, 6666, 26000]                                                                                                                              │
│ Average cumulative reward:       -6.620576262073011                                                                                                                      │
│ Average rollout reward:          -6.17831576706946                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:19[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 6625, 6627, 6664, 6666, 26000]                                                                                                                              │
│ Average cumulative reward:       -6.620576262073011                                                                                                                      │
│ Average rollout reward:          -6.17831576706946                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K26/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m32.9%[0m Elapsed: [33m0:00:40[0m Remaining: [36m0:01:19[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 26000 ===                                                                                                                                                  │
│ 26001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 6625, 6627, 6664, 6666, 26000]                                                                                                                              │
│ Average cumulative reward:       -6.620576262073011                                                                                                                      │
│ Average rollout reward:          -6.17831576706946                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:17[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16100, 16104, 25670, 25980, 26340, 27000]                                                                                                                   │
│ Average cumulative reward:       -7.25270545572384                                                                                                                       │
│ Average rollout reward:          -6.776482380177676                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:41[0m Remaining: [36m0:01:17[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16100, 16104, 25670, 25980, 26340, 27000]                                                                                                                   │
│ Average cumulative reward:       -7.25270545572384                                                                                                                       │
│ Average rollout reward:          -6.776482380177676                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K27/79 [38;2;249;38;114m━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m34.2%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:17[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 27000 ===                                                                                                                                                  │
│ 27001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16100, 16104, 25670, 25980, 26340, 27000]                                                                                                                   │
│ Average cumulative reward:       -7.25270545572384                                                                                                                       │
│ Average rollout reward:          -6.776482380177676                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:42[0m Remaining: [36m0:01:15[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 27932, 27940, 27974, 28000]                                                                                                                                 │
│ Average cumulative reward:       -7.318414563148854                                                                                                                      │
│ Average rollout reward:          -6.866561708422864                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:15[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 27932, 27940, 27974, 28000]                                                                                                                                 │
│ Average cumulative reward:       -7.318414563148854                                                                                                                      │
│ Average rollout reward:          -6.866561708422864                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K28/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m35.4%[0m Elapsed: [33m0:00:43[0m Remaining: [36m0:01:15[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 28000 ===                                                                                                                                                  │
│ 28001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 27932, 27940, 27974, 28000]                                                                                                                                 │
│ Average cumulative reward:       -7.318414563148854                                                                                                                      │
│ Average rollout reward:          -6.866561708422864                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K29/79 [38;2;249;38;114m━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━━━━[0m [35m36.7%[0m Elapsed: [33m0:00:44[0m Remaining: [36m0:01:13[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 29000 ===                                                                                                                                                  │
│ 29001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 28706, 28710, 28772, 28774, 28808, 29000]                                                                                                                   │
│ Average cumulative reward:       -7.439451730327215                                                                                                                      │
│ Average rollout reward:          -6.974128075852636                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
30/79 ━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━ 38.0% Elapsed: 0:00:45 Remaining: 0:01:12   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 30000 ===                                                                                                                                                  │
│ 30001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 4105, 4106, 24880, 25384, 30000]                                                                                                                            │
│ Average cumulative reward:       -7.04227402513908                                                                                                                       │
│ Average rollout reward:          -6.523979744508681                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
31/79 ━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━ 39.2% Elapsed: 0:00:47 Remaining: 0:01:11   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 31000 ===                                                                                                                                                  │
│ 31001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 30292, 30296, 30478, 30486, 30516, 31000]                                                                                                                   │
│ Average cumulative reward:       -7.215367346553577                                                                                                                      │
│ Average rollout reward:          -6.714093893867985                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
32/79 ━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━ 40.5% Elapsed: 0:00:48 Remaining: 0:01:10   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 32000 ===                                                                                                                                                  │
│ 32001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 31928, 31936, 31992, 32000]                                                                                                                                 │
│ Average cumulative reward:       -7.79402077568611                                                                                                                       │
│ Average rollout reward:          -7.35172074300905                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
33/79 ━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━ 41.8% Elapsed: 0:00:50 Remaining: 0:01:09   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 33000 ===                                                                                                                                                  │
│ 33001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 32766, 32770, 32984, 33000]                                                                                                                                 │
│ Average cumulative reward:       -6.864191665491827                                                                                                                      │
│ Average rollout reward:          -6.428533442139267                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
34/79 ━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━ 43.0% Elapsed: 0:00:51 Remaining: 0:01:09   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16100, 16104, 31722, 32292, 33170, 34000]                                                                                                                   │
│ Average cumulative reward:       -6.989986250469872                                                                                                                      │
│ Average rollout reward:          -6.533090289035737                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K34/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━━━━[0m [35m43.0%[0m Elapsed: [33m0:00:52[0m Remaining: [36m0:01:09[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 34000 ===                                                                                                                                                  │
│ 34001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 16100, 16104, 31722, 32292, 33170, 34000]                                                                                                                   │
│ Average cumulative reward:       -6.989986250469872                                                                                                                      │
│ Average rollout reward:          -6.533090289035737                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
35/79 ━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━ 44.3% Elapsed: 0:00:53 Remaining: 0:01:08 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 35000 ===                                                                                                                                                  │
│ 35001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5067, 5071, 32738, 35000]                                                                                                                                   │
│ Average cumulative reward:       -6.864252518561674                                                                                                                      │
│ Average rollout reward:          -6.397939707299915                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
36/79 ━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━ 45.6% Elapsed: 0:00:55 Remaining: 0:01:07 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 36000 ===                                                                                                                                                  │
│ 36001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5067, 5068, 17766, 36000]                                                                                                                                   │
│ Average cumulative reward:       -7.403572493016144                                                                                                                      │
│ Average rollout reward:          -6.882593918434397                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
37/79 ━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━ 46.8% Elapsed: 0:00:56 Remaining: 0:01:05 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 37000 ===                                                                                                                                                  │
│ 37001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 465, 466, 1940, 23470, 37000]                                                                                                                               │
│ Average cumulative reward:       -7.290722449093204                                                                                                                      │
│ Average rollout reward:          -6.817379752691956                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
38/79 ━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━ 48.1% Elapsed: 0:00:58 Remaining: 0:01:04 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 38000 ===                                                                                                                                                  │
│ 38001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 4, 29, 3457, 3515, 38000]                                                                                                                                   │
│ Average cumulative reward:       -7.153086751434607                                                                                                                      │
│ Average rollout reward:          -6.688895087651514                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
39/79 ━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━ 49.4% Elapsed: 0:00:59 Remaining: 0:01:02 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 39000 ===                                                                                                                                                  │
│ 39001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 39000]                                                                                                                                                      │
│ Average cumulative reward:       -7.036331485954965                                                                                                                      │
│ Average rollout reward:          -6.553357720149814                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
40/79 ━━━━━━━━━━━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━ 50.6% Elapsed: 0:01:01 Remaining: 0:01:01 1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 102, 334, 29660, 40000]                                                                                                                             │
│ Average cumulative reward:       -7.000858930183985                                                                                                                      │
│ Average rollout reward:          -6.550247775075982                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:01[0m Remaining: [36m0:01:01[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 102, 334, 29660, 40000]                                                                                                                             │
│ Average cumulative reward:       -7.000858930183985                                                                                                                      │
│ Average rollout reward:          -6.550247775075982                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K40/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m50.6%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:01[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 40000 ===                                                                                                                                                  │
│ 40001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 102, 334, 29660, 40000]                                                                                                                             │
│ Average cumulative reward:       -7.000858930183985                                                                                                                      │
│ Average rollout reward:          -6.550247775075982                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:02[0m Remaining: [36m0:01:00[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5067, 5071, 38996, 41000]                                                                                                                                   │
│ Average cumulative reward:       -6.940664268508102                                                                                                                      │
│ Average rollout reward:          -6.436814627623728                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:00[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5067, 5071, 38996, 41000]                                                                                                                                   │
│ Average cumulative reward:       -6.940664268508102                                                                                                                      │
│ Average rollout reward:          -6.436814627623728                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K41/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━━[0m [35m51.9%[0m Elapsed: [33m0:01:03[0m Remaining: [36m0:01:00[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 41000 ===                                                                                                                                                  │
│ 41001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 5067, 5071, 38996, 41000]                                                                                                                                   │
│ Average cumulative reward:       -6.940664268508102                                                                                                                      │
│ Average rollout reward:          -6.436814627623728                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:00:58[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 40568, 40750, 42000]                                                                                                                                │
│ Average cumulative reward:       -7.102736419775619                                                                                                                      │
│ Average rollout reward:          -6.598780276312611                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:01:04[0m Remaining: [36m0:00:58[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 40568, 40750, 42000]                                                                                                                                │
│ Average cumulative reward:       -7.102736419775619                                                                                                                      │
│ Average rollout reward:          -6.598780276312611                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K42/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m53.2%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:00:58[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 42000 ===                                                                                                                                                  │
│ 42001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 56, 59, 40568, 40750, 42000]                                                                                                                                │
│ Average cumulative reward:       -7.102736419775619                                                                                                                      │
│ Average rollout reward:          -6.598780276312611                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:01:05[0m Remaining: [36m0:00:56[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42860, 42866, 42958, 43000]                                                                                                                                 │
│ Average cumulative reward:       -7.300180244682261                                                                                                                      │
│ Average rollout reward:          -6.818287596635134                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:01:06[0m Remaining: [36m0:00:56[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42860, 42866, 42958, 43000]                                                                                                                                 │
│ Average cumulative reward:       -7.300180244682261                                                                                                                      │
│ Average rollout reward:          -6.818287596635134                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K43/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━━[0m [35m54.4%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:00:56[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 43000 ===                                                                                                                                                  │
│ 43001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 42860, 42866, 42958, 43000]                                                                                                                                 │
│ Average cumulative reward:       -7.300180244682261                                                                                                                      │
│ Average rollout reward:          -6.818287596635134                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:07[0m Remaining: [36m0:00:55[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 43858, 43862, 43902, 44000]                                                                                                                                 │
│ Average cumulative reward:       -7.205028747498485                                                                                                                      │
│ Average rollout reward:          -6.728111802017247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:00:55[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 43858, 43862, 43902, 44000]                                                                                                                                 │
│ Average cumulative reward:       -7.205028747498485                                                                                                                      │
│ Average rollout reward:          -6.728111802017247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:08[0m Remaining: [36m0:00:55[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 43858, 43862, 43902, 44000]                                                                                                                                 │
│ Average cumulative reward:       -7.205028747498485                                                                                                                      │
│ Average rollout reward:          -6.728111802017247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K44/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m55.7%[0m Elapsed: [33m0:01:09[0m Remaining: [36m0:00:55[0m   1.57 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 44000 ===                                                                                                                                                  │
│ 44001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 43858, 43862, 43902, 44000]                                                                                                                                 │
│ Average cumulative reward:       -7.205028747498485                                                                                                                      │
│ Average rollout reward:          -6.728111802017247                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:09[0m Remaining: [36m0:00:53[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44872, 44876, 44898, 44920, 45000]                                                                                                                          │
│ Average cumulative reward:       -7.176448136374262                                                                                                                      │
│ Average rollout reward:          -6.6862971779484015                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 45000 ===                                                                                                                                                  │
│ 45001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 44872, 44876, 44898, 44920, 45000]                                                                                                                          │
│ Average cumulative reward:       -7.176448136374262                                                                                                                      │
│ Average rollout reward:          -6.6862971779484015                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K45/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━━[0m [35m57.0%[0m Elapsed: [33m0:01:10[0m Remaining: [36m0:00:53[0m   1.57 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:00:52[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 46000 ===                                                                                                                                                  │
│ 46001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 45898, 45900, 45940, 46000]                                                                                                                                 │
│ Average cumulative reward:       -7.092420362790838                                                                                                                      │
│ Average rollout reward:          -6.602473944665321                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K46/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m58.2%[0m Elapsed: [33m0:01:11[0m Remaining: [36m0:00:52[0m   1.56 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:12[0m Remaining: [36m0:00:50[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 47000 ===                                                                                                                                                  │
│ 47001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 46938, 46946, 46992, 47000]                                                                                                                                 │
│ Average cumulative reward:       -6.870047885475802                                                                                                                      │
│ Average rollout reward:          -6.393091402321068                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K47/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━━[0m [35m59.5%[0m Elapsed: [33m0:01:13[0m Remaining: [36m0:00:50[0m   1.55 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:00:49[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 48000 ===                                                                                                                                                  │
│ 48001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 2394, 2397, 42042, 44368, 48000]                                                                                                                            │
│ Average cumulative reward:       -7.39071248744465                                                                                                                       │
│ Average rollout reward:          -6.906377581008988                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K48/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m60.8%[0m Elapsed: [33m0:01:14[0m Remaining: [36m0:00:49[0m   1.55 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:15[0m Remaining: [36m0:00:47[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 49000 ===                                                                                                                                                  │
│ 49001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 669, 672, 47586, 47984, 48700, 49000]                                                                                                                       │
│ Average cumulative reward:       -7.559809705347467                                                                                                                      │
│ Average rollout reward:          -7.058854157223857                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K49/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━━[0m [35m62.0%[0m Elapsed: [33m0:01:16[0m Remaining: [36m0:00:47[0m   1.55 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:17[0m Remaining: [36m0:00:45[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 50000 ===                                                                                                                                                  │
│ 50001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 49062, 49066, 49094, 49120, 49236, 50000]                                                                                                                   │
│ Average cumulative reward:       -6.728584645554392                                                                                                                      │
│ Average rollout reward:          -6.268382733897887                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K50/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m63.3%[0m Elapsed: [33m0:01:17[0m Remaining: [36m0:00:45[0m   1.55 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:18[0m Remaining: [36m0:00:44[0m   1.54 s/iter
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K51/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━━[0m [35m64.6%[0m Elapsed: [33m0:01:19[0m Remaining: [36m0:00:44[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 51000 ===                                                                                                                                                  │
│ 51001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 46938, 46942, 47242, 47286, 51000]                                                                                                                          │
│ Average cumulative reward:       -7.379829337935101                                                                                                                      │
│ Average rollout reward:          -6.90827456766083                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
52/79 65.8% Elapsed: 0:01:20 Remaining: 0:00:42   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 186, 187, 192, 7839, 52000]                                                                                                                                 │
│ Average cumulative reward:       -7.286418974937174                                                                                                                      │
│ Average rollout reward:          -6.832499827195764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K52/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:20[0m Remaining: [36m0:00:42[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 186, 187, 192, 7839, 52000]                                                                                                                                 │
│ Average cumulative reward:       -7.286418974937174                                                                                                                      │
│ Average rollout reward:          -6.832499827195764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
│ [-1.84117742 -1.84117742 -1.84117742 -1.84117742 -1.66770993 -1.66770993                                                                                                 │
│  -1.26752368]                                                                                                                                                            │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K52/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━━[0m [35m65.8%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:00:42[0m   1.56 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 52000 ===                                                                                                                                                  │
│ 52001  nodes in tree                                                                                                                                                     │
│ Path: [0, 2, 186, 187, 192, 7839, 52000]                                                                                                                                 │
│ Average cumulative reward:       -7.286418974937174                                                                                                                      │
│ Average rollout reward:          -6.832499827195764                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -2.014644900044245                                                                                                                              │
│ Best path: [0, 2, 56, 59]                                                                                                                                                │
│ [-1.84117742 -1.84117742 -1.84117742 -1.84117742 -1.66770993 -1.66770993                                                                                                 │
│  -1.26752368]                                                                                                                                                            │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K53/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:21[0m Remaining: [36m0:00:41[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 52843, 52998, 53000]                                                                                                                                        │
│ Average cumulative reward:       -6.332996231869005                                                                                                                      │
│ Average rollout reward:          -6.053205832673554                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K53/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━━[0m [35m67.1%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:00:41[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 53000 ===                                                                                                                                                  │
│ 53001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 52843, 52998, 53000]                                                                                                                                        │
│ Average cumulative reward:       -6.332996231869005                                                                                                                      │
│ Average rollout reward:          -6.053205832673554                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:22[0m Remaining: [36m0:00:39[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53990, 54000]                                                                                                                                               │
│ Average cumulative reward:       -6.577108785800313                                                                                                                      │
│ Average rollout reward:          -6.341747965208787                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:23[0m Remaining: [36m0:00:39[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53990, 54000]                                                                                                                                               │
│ Average cumulative reward:       -6.577108785800313                                                                                                                      │
│ Average rollout reward:          -6.341747965208787                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K54/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━━[0m [35m68.4%[0m Elapsed: [33m0:01:23[0m Remaining: [36m0:00:39[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 54000 ===                                                                                                                                                  │
│ 54001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53990, 54000]                                                                                                                                               │
│ Average cumulative reward:       -6.577108785800313                                                                                                                      │
│ Average rollout reward:          -6.341747965208787                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:00:37[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24965, 54996, 55000]                                                                                                                                        │
│ Average cumulative reward:       -6.4740726270530775                                                                                                                     │
│ Average rollout reward:          -6.246971280777119                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:24[0m Remaining: [36m0:00:37[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24965, 54996, 55000]                                                                                                                                        │
│ Average cumulative reward:       -6.4740726270530775                                                                                                                     │
│ Average rollout reward:          -6.246971280777119                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K55/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━━[0m [35m69.6%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:00:37[0m   1.55 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 55000 ===                                                                                                                                                  │
│ 55001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 24965, 54996, 55000]                                                                                                                                        │
│ Average cumulative reward:       -6.4740726270530775                                                                                                                     │
│ Average rollout reward:          -6.246971280777119                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:25[0m Remaining: [36m0:00:35[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 55766, 55767, 55770, 55832, 56000]                                                                                                                          │
│ Average cumulative reward:       -6.367284311683795                                                                                                                      │
│ Average rollout reward:          -6.125840018718613                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K56/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━━[0m [35m70.9%[0m Elapsed: [33m0:01:26[0m Remaining: [36m0:00:35[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 56000 ===                                                                                                                                                  │
│ 56001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 55766, 55767, 55770, 55832, 56000]                                                                                                                          │
│ Average cumulative reward:       -6.367284311683795                                                                                                                      │
│ Average rollout reward:          -6.125840018718613                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:26[0m Remaining: [36m0:00:33[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 56986, 56992, 57000]                                                                                                                                        │
│ Average cumulative reward:       -6.936282712092995                                                                                                                      │
│ Average rollout reward:          -6.697493682673626                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:00:33[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 56986, 56992, 57000]                                                                                                                                        │
│ Average cumulative reward:       -6.936282712092995                                                                                                                      │
│ Average rollout reward:          -6.697493682673626                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K57/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━━[0m [35m72.2%[0m Elapsed: [33m0:01:27[0m Remaining: [36m0:00:33[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 57000 ===                                                                                                                                                  │
│ 57001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 56986, 56992, 57000]                                                                                                                                        │
│ Average cumulative reward:       -6.936282712092995                                                                                                                      │
│ Average rollout reward:          -6.697493682673626                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:00:32[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 57607, 57924, 57926, 57946, 58000]                                                                                                                          │
│ Average cumulative reward:       -6.71004456309895                                                                                                                       │
│ Average rollout reward:          -6.486862797696569                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:28[0m Remaining: [36m0:00:32[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 57607, 57924, 57926, 57946, 58000]                                                                                                                          │
│ Average cumulative reward:       -6.71004456309895                                                                                                                       │
│ Average rollout reward:          -6.486862797696569                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K58/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━━[0m [35m73.4%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:00:32[0m   1.54 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 58000 ===                                                                                                                                                  │
│ 58001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 57607, 57924, 57926, 57946, 58000]                                                                                                                          │
│ Average cumulative reward:       -6.71004456309895                                                                                                                       │
│ Average rollout reward:          -6.486862797696569                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K59/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━━[0m [35m74.7%[0m Elapsed: [33m0:01:29[0m Remaining: [36m0:00:30[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 59000 ===                                                                                                                                                  │
│ 59001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 53990, 54172, 54175, 59000]                                                                                                                                 │
│ Average cumulative reward:       -6.494582597390385                                                                                                                      │
│ Average rollout reward:          -6.250283749337994                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:00:28[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 15423, 15455, 15458, 60000]                                                                                                                                 │
│ Average cumulative reward:       -6.736536720569836                                                                                                                      │
│ Average rollout reward:          -6.4840606798967295                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K60/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━━[0m [35m75.9%[0m Elapsed: [33m0:01:31[0m Remaining: [36m0:00:28[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 60000 ===                                                                                                                                                  │
│ 60001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 15423, 15455, 15458, 60000]                                                                                                                                 │
│ Average cumulative reward:       -6.736536720569836                                                                                                                      │
│ Average rollout reward:          -6.4840606798967295                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:00:27[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 44873, 45545, 45549, 45595, 61000]                                                                                                                          │
│ Average cumulative reward:       -6.674054613885894                                                                                                                      │
│ Average rollout reward:          -6.382332892810244                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:32[0m Remaining: [36m0:00:27[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 44873, 45545, 45549, 45595, 61000]                                                                                                                          │
│ Average cumulative reward:       -6.674054613885894                                                                                                                      │
│ Average rollout reward:          -6.382332892810244                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K61/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━━[0m [35m77.2%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:00:27[0m   1.53 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 61000 ===                                                                                                                                                  │
│ 61001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 44873, 45545, 45549, 45595, 61000]                                                                                                                          │
│ Average cumulative reward:       -6.674054613885894                                                                                                                      │
│ Average rollout reward:          -6.382332892810244                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K62/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:01:33[0m Remaining: [36m0:00:25[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13936, 61985, 61988, 61994, 62000]                                                                                                                          │
│ Average cumulative reward:       -6.678151278523129                                                                                                                      │
│ Average rollout reward:          -6.416403989037458                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K62/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━━[0m [35m78.5%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:25[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 62000 ===                                                                                                                                                  │
│ 62001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 13936, 61985, 61988, 61994, 62000]                                                                                                                          │
│ Average cumulative reward:       -6.678151278523129                                                                                                                      │
│ Average rollout reward:          -6.416403989037458                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:01:34[0m Remaining: [36m0:00:23[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 62938, 62942, 63000]                                                                                                                                 │
│ Average cumulative reward:       -6.835794887855853                                                                                                                      │
│ Average rollout reward:          -6.579450914424577                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:23[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 62938, 62942, 63000]                                                                                                                                 │
│ Average cumulative reward:       -6.835794887855853                                                                                                                      │
│ Average rollout reward:          -6.579450914424577                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K63/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━━[0m [35m79.7%[0m Elapsed: [33m0:01:35[0m Remaining: [36m0:00:23[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 63000 ===                                                                                                                                                  │
│ 63001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 62938, 62942, 63000]                                                                                                                                 │
│ Average cumulative reward:       -6.835794887855853                                                                                                                      │
│ Average rollout reward:          -6.579450914424577                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:22[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 17807, 17853, 17857, 64000]                                                                                                                                 │
│ Average cumulative reward:       -6.397993108722132                                                                                                                      │
│ Average rollout reward:          -6.13918105284462                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:01:36[0m Remaining: [36m0:00:22[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 17807, 17853, 17857, 64000]                                                                                                                                 │
│ Average cumulative reward:       -6.397993108722132                                                                                                                      │
│ Average rollout reward:          -6.13918105284462                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K64/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━━[0m [35m81.0%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:22[0m   1.52 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 64000 ===                                                                                                                                                  │
│ 64001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 17807, 17853, 17857, 64000]                                                                                                                                 │
│ Average cumulative reward:       -6.397993108722132                                                                                                                      │
│ Average rollout reward:          -6.13918105284462                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:01:37[0m Remaining: [36m0:00:20[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 65000 ===                                                                                                                                                  │
│ 65001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 64940, 64943, 64957, 65000]                                                                                                                                 │
│ Average cumulative reward:       -6.645573036486405                                                                                                                      │
│ Average rollout reward:          -6.403755984496023                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K65/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━━[0m [35m82.3%[0m Elapsed: [33m0:01:38[0m Remaining: [36m0:00:20[0m   1.51 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 66000 ===                                                                                                                                                  │
│ 66001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 64940, 65029, 65031, 65554, 66000]                                                                                                                          │
│ Average cumulative reward:       -6.4324312101588905                                                                                                                     │
│ Average rollout reward:          -6.1808225069046                                                                                                                        │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K66/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━━[0m [35m83.5%[0m Elapsed: [33m0:01:39[0m Remaining: [36m0:00:19[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 67000 ===                                                                                                                                                  │
│ 67001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 20243, 66988, 66991, 66995, 67000]                                                                                                                          │
│ Average cumulative reward:       -6.427688606918836                                                                                                                      │
│ Average rollout reward:          -6.19551876911415                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K67/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━━[0m [35m84.8%[0m Elapsed: [33m0:01:40[0m Remaining: [36m0:00:17[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 68000 ===                                                                                                                                                  │
│ 68001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 31929, 66803, 67728, 68000]                                                                                                                                 │
│ Average cumulative reward:       -6.489863284910989                                                                                                                      │
│ Average rollout reward:          -6.266307157309447                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K68/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━━[0m [35m86.1%[0m Elapsed: [33m0:01:42[0m Remaining: [36m0:00:16[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 69000 ===                                                                                                                                                  │
│ 69001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 68581, 68736, 68738, 68884, 69000]                                                                                                                          │
│ Average cumulative reward:       -6.504022441346894                                                                                                                      │
│ Average rollout reward:          -6.263325269606103                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K69/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━━[0m [35m87.3%[0m Elapsed: [33m0:01:43[0m Remaining: [36m0:00:14[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 66825, 67667, 70000]                                                                                                                                 │
│ Average cumulative reward:       -6.840489447575662                                                                                                                      │
│ Average rollout reward:          -6.585385570588424                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:01:44[0m Remaining: [36m0:00:13[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 66825, 67667, 70000]                                                                                                                                 │
│ Average cumulative reward:       -6.840489447575662                                                                                                                      │
│ Average rollout reward:          -6.585385570588424                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K70/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━━[0m [35m88.6%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:13[0m   1.50 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 70000 ===                                                                                                                                                  │
│ 70001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 62849, 66825, 67667, 70000]                                                                                                                                 │
│ Average cumulative reward:       -6.840489447575662                                                                                                                      │
│ Average rollout reward:          -6.585385570588424                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K71/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━━[0m [35m89.9%[0m Elapsed: [33m0:01:45[0m Remaining: [36m0:00:11[0m   1.49 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 71000 ===                                                                                                                                                  │
│ 71001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 197, 70974, 70977, 71000]                                                                                                                                   │
│ Average cumulative reward:       -6.8683000498324995                                                                                                                     │
│ Average rollout reward:          -6.615437536518369                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K72/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━━[0m [35m91.1%[0m Elapsed: [33m0:01:46[0m Remaining: [36m0:00:10[0m   1.48 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 72000 ===                                                                                                                                                  │
│ 72001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 10185, 56745, 56749, 56795, 72000]                                                                                                                          │
│ Average cumulative reward:       -6.333259201595237                                                                                                                      │
│ Average rollout reward:          -6.058905243431936                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K73/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━━[0m [35m92.4%[0m Elapsed: [33m0:01:48[0m Remaining: [36m0:00:09[0m   1.48 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 73000 ===                                                                                                                                                  │
│ 73001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 72419, 72698, 72702, 72713, 73000]                                                                                                                          │
│ Average cumulative reward:       -6.553665251078043                                                                                                                      │
│ Average rollout reward:          -6.2918921375346875                                                                                                                     │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K74/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━━[0m [35m93.7%[0m Elapsed: [33m0:01:49[0m Remaining: [36m0:00:07[0m   1.48 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 74000 ===                                                                                                                                                  │
│ 74001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 11039, 73994, 73998, 74000]                                                                                                                                 │
│ Average cumulative reward:       -6.817208063937701                                                                                                                      │
│ Average rollout reward:          -6.542094874108149                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K75/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━━[0m [35m94.9%[0m Elapsed: [33m0:01:50[0m Remaining: [36m0:00:06[0m   1.48 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 75000 ===                                                                                                                                                  │
│ 75001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 12641, 23443, 23447, 23455, 75000]                                                                                                                          │
│ Average cumulative reward:       -6.786562934729891                                                                                                                      │
│ Average rollout reward:          -6.517905976299175                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K76/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;5;237m╺[0m[38;5;237m━[0m [35m96.2%[0m Elapsed: [33m0:01:52[0m Remaining: [36m0:00:05[0m   1.48 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 76000 ===                                                                                                                                                  │
│ 76001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 74011, 74290, 74292, 74302, 76000]                                                                                                                          │
│ Average cumulative reward:       -6.396124908610799                                                                                                                      │
│ Average rollout reward:          -6.097373302395706                                                                                                                      │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K[1A[2K77/79 [38;2;249;38;114m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[38;2;249;38;114m╸[0m[38;5;237m━[0m [35m97.5%[0m Elapsed: [33m0:01:53[0m Remaining: [36m0:00:03[0m   1.48 s/iter
79/79 100.0% Elapsed: 0:01:56 Remaining: 0:00:00 1.47 s/iter
╭────────────────────────────────────────────────────────────────────────────────── MCTS ──────────────────────────────────────────────────────────────────────────────────╮
│ === Iteration 78000 ===                                                                                                                                                  │
│ 78001  nodes in tree                                                                                                                                                     │
│ Path: [0, 3, 46939, 46941, 46965, 78000]                                                                                                                                 │
│ Average cumulative reward:       -6.706261372879231                                                                                                                      │
│ Average rollout reward:          -6.42970719189024                                                                                                                       │
│ Termination count: 0                                                                                                                                                     │
│ Best value of root node: -1.8411774162796817                                                                                                                             │
│ Best path: [0, 3, 8, 14553, 14789, 17051, 52205]                                                                                                                         │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 8 is not terminal. Continue.
Node 14553 is not terminal. Continue.
Node 14789 is not terminal. Continue.
Node 17051 is not terminal. Continue.
Node 52205 is not terminal. Continue.
Node 71708 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 20881 is not terminal. Continue.
Node 20927 is not terminal. Continue.
Node 20931 is not terminal. Continue.
Node 21055 is not terminal. Continue.
Node 21355 is not terminal. Continue.
No children found. Stop.
Node 0 is not terminal. Continue.
Node 3 is not terminal. Continue.
Node 8 is not terminal. Continue.
Node 14553 is not terminal. Continue.
Node 14789 is not terminal. Continue.
Node 17051 is not terminal. Continue.
Node 52205 is not terminal. Continue.
Node 71708 is not terminal. Continue.
No children found. Stop.
=== RESULT ===
By Visits: estimated reward: -2.974771281687625
sign_ns [4.8595634 4.9853897]
sign_newton [30.344688]
By Value: estimated reward: -3.148238765452188
sign_ns [3.0475628 3.2186093]
sign_newton [30.65772]
By Best Value: estimated reward: 0
sign_ns [4.8595634 4.9853897]
sign_newton [30.344688]
sign_newton [0.07307187 0.         0.         0.        ]
sign_ns [0.5, 0.23312340381083163]
sign_ns [0.5, 1.432847203513038]
sign_ns [0.5, 1.1843748542947055]
sign_ns [0.5, 1.0276827083502333]
sign_ns [0.5, 1.0005803328269756]
Best value of root node:
-1.8411774162796817
Best root policy:
sign_ns [4.8595634 4.9853897]
sign_newton [30.344688]
sign_newton [0.07307187 0.         0.         0.        ]
sign_ns [0.5, 0.23312340381083163]
sign_ns [0.5, 1.432847203513038]
sign_ns [0.5, 1.1843748542947055]
sign_ns [0.5, 1.0276827083502333]
sign_ns [0.5, 1.0005803328269756]
=== END ===
Finished making algorithm
