2025-09-11 19:54:45,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc25-walker2d/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 19:54:45,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc25-walker2d/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 19:54:45,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x145f31f01550>}
2025-09-11 19:54:45,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1111 [DEBUG]: using device: cuda
2025-09-11 19:54:45,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1133 [INFO]: Creating new trainer
2025-09-11 19:54:45,434 baseline-mbpac-noiseperc25-walker2d:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=384, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 19:54:45,435 baseline-mbpac-noiseperc25-walker2d:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=23, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 19:54:45,442 baseline-mbpac-noiseperc25-walker2d:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=384, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=17, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=17, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=384, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=6, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 384, batch_first=True)
)
2025-09-11 19:54:46,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1194 [DEBUG]: Starting training session...
2025-09-11 19:54:46,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 1/100
2025-09-11 20:05:01,337 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:05:01,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:05:25,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: -22.60447 ± 12.798
2025-09-11 20:05:25,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [-26.252012, -26.457384, -35.0517, -30.512772, -11.081642, 6.801522, -19.70288, -27.757076, -40.29007, -15.7406645]
2025-09-11 20:05:25,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [97.0, 88.0, 116.0, 79.0, 81.0, 18.0, 99.0, 96.0, 85.0, 105.0]
2025-09-11 20:05:25,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (-22.60) for latency ExtremeClogL1U23
2025-09-11 20:05:25,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 2/100 (estimated time remaining: 17 hours, 34 minutes, 57 seconds)
2025-09-11 20:17:36,769 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:17:36,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:18:36,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 139.79912 ± 130.199
2025-09-11 20:18:36,678 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [237.02036, 192.98305, 37.80963, 184.37228, 28.997385, 44.482155, 96.14986, 58.242638, 466.9668, 50.96708]
2025-09-11 20:18:36,678 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [197.0, 292.0, 133.0, 394.0, 114.0, 144.0, 245.0, 134.0, 397.0, 147.0]
2025-09-11 20:18:36,678 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (139.80) for latency ExtremeClogL1U23
2025-09-11 20:18:36,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 3/100 (estimated time remaining: 19 hours, 28 minutes, 1 second)
2025-09-11 20:29:44,561 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:29:44,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:30:43,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 124.92580 ± 139.029
2025-09-11 20:30:43,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [307.51935, 51.231503, 382.19345, 303.48325, 38.963093, 5.017789, 25.929663, -8.5807705, 45.268322, 98.23226]
2025-09-11 20:30:43,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [209.0, 102.0, 270.0, 624.0, 280.0, 136.0, 145.0, 54.0, 93.0, 227.0]
2025-09-11 20:30:43,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 4/100 (estimated time remaining: 19 hours, 22 minutes, 15 seconds)
2025-09-11 20:42:09,339 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:42:09,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:43:05,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 237.32907 ± 159.737
2025-09-11 20:43:05,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [492.5257, 242.97403, 214.39804, 341.43024, 336.86282, 36.65992, 376.53995, 12.403322, 7.339831, 312.1569]
2025-09-11 20:43:05,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [583.0, 161.0, 168.0, 215.0, 209.0, 204.0, 233.0, 24.0, 20.0, 194.0]
2025-09-11 20:43:05,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (237.33) for latency ExtremeClogL1U23
2025-09-11 20:43:05,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 5/100 (estimated time remaining: 19 hours, 19 minutes, 26 seconds)
2025-09-11 20:54:46,117 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:54:46,125 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:55:29,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 189.04941 ± 178.939
2025-09-11 20:55:29,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2.7120306, 478.68176, 4.734865, 250.02444, 298.6014, 461.4989, 255.70207, 135.52998, -0.5102979, 3.5189261]
2025-09-11 20:55:29,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [13.0, 482.0, 18.0, 186.0, 200.0, 319.0, 159.0, 167.0, 11.0, 16.0]
2025-09-11 20:55:29,841 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 6/100 (estimated time remaining: 19 hours, 13 minutes, 44 seconds)
2025-09-11 21:07:14,549 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:07:14,558 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:07:45,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 95.68739 ± 128.495
2025-09-11 21:07:45,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [11.773157, 5.3585353, 22.413767, 45.44143, 134.92006, 5.992065, 21.175478, 382.811, 296.2414, 30.747053]
2025-09-11 21:07:45,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [81.0, 22.0, 58.0, 134.0, 214.0, 20.0, 93.0, 220.0, 211.0, 60.0]
2025-09-11 21:07:45,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 7/100 (estimated time remaining: 19 hours, 31 minutes, 49 seconds)
2025-09-11 21:19:12,788 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:19:12,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:19:54,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 142.15869 ± 83.484
2025-09-11 21:19:54,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [198.5165, 105.57158, 122.87008, 119.146065, 319.11517, 195.5584, 4.7367306, 38.837227, 156.09177, 161.1433]
2025-09-11 21:19:54,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [106.0, 130.0, 158.0, 216.0, 257.0, 122.0, 14.0, 70.0, 242.0, 188.0]
2025-09-11 21:19:54,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 8/100 (estimated time remaining: 18 hours, 59 minutes, 58 seconds)
2025-09-11 21:31:29,942 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:31:29,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:32:00,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 146.33139 ± 128.517
2025-09-11 21:32:00,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [335.47607, 24.183853, 77.30636, 7.1976757, 305.8303, 251.76616, 2.5485964, 8.359725, 253.61919, 197.02597]
2025-09-11 21:32:00,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [270.0, 66.0, 90.0, 22.0, 163.0, 145.0, 17.0, 43.0, 167.0, 125.0]
2025-09-11 21:32:00,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 9/100 (estimated time remaining: 18 hours, 47 minutes, 47 seconds)
2025-09-11 21:43:44,215 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:43:44,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:44:34,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 262.02063 ± 177.671
2025-09-11 21:44:34,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [118.108635, 379.71695, 358.67572, 601.2821, 210.18172, 0.45644987, 396.75677, 3.1356049, 276.59747, 275.295]
2025-09-11 21:44:34,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [137.0, 237.0, 252.0, 386.0, 180.0, 16.0, 257.0, 15.0, 211.0, 153.0]
2025-09-11 21:44:34,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (262.02) for latency ExtremeClogL1U23
2025-09-11 21:44:34,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 10/100 (estimated time remaining: 18 hours, 39 minutes, 16 seconds)
2025-09-11 21:56:14,447 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:56:14,455 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:57:12,526 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 339.61270 ± 138.002
2025-09-11 21:57:12,526 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [479.64938, 288.4119, 310.53525, 572.5504, 352.85455, 356.10684, 369.35257, 325.24207, 5.6128144, 335.81094]
2025-09-11 21:57:12,526 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [319.0, 159.0, 174.0, 307.0, 253.0, 216.0, 237.0, 152.0, 25.0, 219.0]
2025-09-11 21:57:12,526 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (339.61) for latency ExtremeClogL1U23
2025-09-11 21:57:12,534 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 11/100 (estimated time remaining: 18 hours, 30 minutes, 48 seconds)
2025-09-11 22:08:50,912 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:08:50,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:09:38,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 219.68188 ± 113.155
2025-09-11 22:09:38,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [340.0373, 155.65054, 47.537907, 311.82465, 117.61655, 293.74127, 39.463093, 251.31366, 289.0367, 350.59708]
2025-09-11 22:09:38,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [195.0, 144.0, 65.0, 286.0, 131.0, 190.0, 49.0, 220.0, 168.0, 244.0]
2025-09-11 22:09:38,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 12/100 (estimated time remaining: 18 hours, 21 minutes, 21 seconds)
2025-09-11 22:21:21,496 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:21:21,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:21:58,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 220.20166 ± 147.245
2025-09-11 22:21:58,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [102.76123, 323.30753, 283.36246, 80.956665, 5.4113135, 5.1831465, 399.17453, 332.27722, 280.99448, 388.58813]
2025-09-11 22:21:58,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [127.0, 184.0, 137.0, 85.0, 21.0, 20.0, 225.0, 165.0, 176.0, 207.0]
2025-09-11 22:21:58,925 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 13/100 (estimated time remaining: 18 hours, 12 minutes, 38 seconds)
2025-09-11 22:33:52,384 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:33:52,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:34:39,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 298.64740 ± 111.960
2025-09-11 22:34:39,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [279.83707, 260.13574, 341.8402, 363.54456, 238.30373, 31.570526, 423.20023, 293.53046, 291.91116, 462.6007]
2025-09-11 22:34:39,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [138.0, 169.0, 137.0, 211.0, 147.0, 59.0, 284.0, 165.0, 152.0, 220.0]
2025-09-11 22:34:39,420 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 14/100 (estimated time remaining: 18 hours, 9 minutes, 59 seconds)
2025-09-11 22:46:04,055 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:46:04,062 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:46:53,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 242.47144 ± 152.575
2025-09-11 22:46:53,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [6.148981, 402.6507, 281.60242, 96.120026, 3.1050074, 367.70312, 231.2598, 227.02579, 439.58566, 369.51294]
2025-09-11 22:46:53,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [19.0, 205.0, 222.0, 142.0, 14.0, 295.0, 141.0, 129.0, 289.0, 340.0]
2025-09-11 22:46:53,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 15/100 (estimated time remaining: 17 hours, 51 minutes, 43 seconds)
2025-09-11 22:58:36,918 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:58:36,939 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:59:31,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 282.87842 ± 128.817
2025-09-11 22:59:31,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [277.9612, 296.30234, 95.48509, 337.4773, 276.62784, 43.92144, 484.4064, 454.43124, 266.11954, 296.05164]
2025-09-11 22:59:31,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [197.0, 145.0, 119.0, 234.0, 236.0, 53.0, 265.0, 346.0, 144.0, 215.0]
2025-09-11 22:59:31,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 16/100 (estimated time remaining: 17 hours, 39 minutes, 18 seconds)
2025-09-11 23:11:15,286 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:11:15,301 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:11:46,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 215.30576 ± 149.082
2025-09-11 23:11:46,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [4.5373397, 0.4509751, 292.57803, 286.22614, -2.0990217, 273.2244, 392.3213, 403.35565, 231.62865, 270.8342]
2025-09-11 23:11:46,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [16.0, 11.0, 142.0, 142.0, 22.0, 140.0, 190.0, 192.0, 140.0, 156.0]
2025-09-11 23:11:46,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 17/100 (estimated time remaining: 17 hours, 23 minutes, 59 seconds)
2025-09-11 23:22:58,817 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:22:58,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:23:48,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 347.82153 ± 102.241
2025-09-11 23:23:48,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [599.02734, 333.32083, 300.1896, 256.72623, 278.65747, 279.25052, 392.29074, 244.16678, 439.41446, 355.1714]
2025-09-11 23:23:48,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [295.0, 163.0, 228.0, 123.0, 147.0, 151.0, 195.0, 138.0, 232.0, 205.0]
2025-09-11 23:23:48,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (347.82) for latency ExtremeClogL1U23
2025-09-11 23:23:48,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 18/100 (estimated time remaining: 17 hours, 6 minutes, 21 seconds)
2025-09-11 23:35:18,136 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:35:18,145 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:36:17,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 325.79776 ± 96.919
2025-09-11 23:36:17,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [312.67294, 345.6975, 299.48447, 369.15707, 303.73428, 194.74553, 391.3467, 553.73944, 202.4941, 284.90555]
2025-09-11 23:36:17,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [237.0, 221.0, 200.0, 263.0, 224.0, 112.0, 283.0, 454.0, 109.0, 166.0]
2025-09-11 23:36:17,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 19/100 (estimated time remaining: 16 hours, 50 minutes, 48 seconds)
2025-09-11 23:48:35,004 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:48:35,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:49:16,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 271.49561 ± 37.157
2025-09-11 23:49:16,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [206.28102, 304.17795, 323.9659, 259.01572, 250.26642, 302.35184, 310.8874, 284.9089, 236.41232, 236.6884]
2025-09-11 23:49:16,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [118.0, 164.0, 191.0, 116.0, 114.0, 152.0, 146.0, 174.0, 103.0, 136.0]
2025-09-11 23:49:16,922 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 20/100 (estimated time remaining: 16 hours, 50 minutes, 42 seconds)
2025-09-12 00:02:00,874 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:02:00,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:02:56,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 339.33582 ± 72.969
2025-09-12 00:02:56,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [294.50143, 345.19223, 399.4284, 233.4987, 433.6634, 358.25314, 322.14545, 266.51492, 270.9746, 469.18582]
2025-09-12 00:02:56,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [148.0, 201.0, 252.0, 110.0, 234.0, 177.0, 169.0, 144.0, 132.0, 308.0]
2025-09-12 00:02:56,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 54 minutes, 41 seconds)
2025-09-12 00:15:37,976 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:15:37,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:16:29,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 284.76520 ± 103.520
2025-09-12 00:16:29,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [313.66284, 455.024, 340.15088, 28.180151, 293.6628, 279.55835, 342.57928, 215.8888, 279.48203, 299.4631]
2025-09-12 00:16:29,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [145.0, 400.0, 210.0, 43.0, 173.0, 131.0, 198.0, 128.0, 150.0, 167.0]
2025-09-12 00:16:29,697 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 22/100 (estimated time remaining: 17 hours, 2 minutes, 32 seconds)
2025-09-12 00:28:56,156 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:28:56,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:29:38,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 261.83698 ± 153.093
2025-09-12 00:29:38,499 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [290.55414, 477.77673, 234.17122, 331.78528, 322.75104, 207.78685, 266.65588, 473.0113, 1.2316864, 12.645667]
2025-09-12 00:29:38,499 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [178.0, 273.0, 117.0, 156.0, 186.0, 118.0, 140.0, 247.0, 11.0, 22.0]
2025-09-12 00:29:38,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 23/100 (estimated time remaining: 17 hours, 6 minutes, 57 seconds)
2025-09-12 00:42:11,952 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:42:11,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:42:56,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 246.98726 ± 129.090
2025-09-12 00:42:56,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [250.88739, 391.6036, 405.10248, 269.93137, 53.091587, 329.70728, 352.93326, 185.94588, 229.64343, 1.0263417]
2025-09-12 00:42:56,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [141.0, 257.0, 225.0, 185.0, 45.0, 163.0, 197.0, 115.0, 157.0, 16.0]
2025-09-12 00:42:56,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 24/100 (estimated time remaining: 17 hours, 6 minutes, 25 seconds)
2025-09-12 00:55:36,975 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:55:36,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:56:23,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 266.69403 ± 105.682
2025-09-12 00:56:23,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [286.51538, 237.60799, 251.65688, 261.28488, -4.6968384, 217.71696, 321.87454, 336.62482, 400.39655, 357.95895]
2025-09-12 00:56:23,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [159.0, 136.0, 138.0, 150.0, 20.0, 118.0, 203.0, 192.0, 242.0, 195.0]
2025-09-12 00:56:23,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 25/100 (estimated time remaining: 17 hours, 10 seconds)
2025-09-12 01:08:50,138 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:08:50,147 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:09:31,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 252.85185 ± 100.853
2025-09-12 01:09:31,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [5.877405, 264.73105, 270.4569, 294.8045, 350.13052, 279.09015, 363.40915, 139.94214, 310.89288, 249.18376]
2025-09-12 01:09:31,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [17.0, 157.0, 133.0, 136.0, 242.0, 162.0, 173.0, 110.0, 144.0, 125.0]
2025-09-12 01:09:31,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 26/100 (estimated time remaining: 16 hours, 38 minutes, 49 seconds)
2025-09-12 01:22:13,110 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:22:13,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:23:06,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 330.16827 ± 79.266
2025-09-12 01:23:06,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [252.02068, 422.34216, 241.63681, 386.99225, 264.11227, 360.6949, 280.0245, 489.4772, 267.50217, 336.87976]
2025-09-12 01:23:06,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [120.0, 219.0, 137.0, 221.0, 141.0, 203.0, 157.0, 248.0, 127.0, 188.0]
2025-09-12 01:23:06,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 27/100 (estimated time remaining: 16 hours, 25 minutes, 53 seconds)
2025-09-12 01:35:41,164 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:35:41,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:36:24,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 274.49286 ± 152.883
2025-09-12 01:36:24,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [5.285039, 23.955475, 557.2605, 260.08817, 309.21133, 272.4657, 304.68253, 283.49445, 366.66748, 361.81808]
2025-09-12 01:36:24,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [21.0, 36.0, 346.0, 142.0, 167.0, 134.0, 121.0, 134.0, 177.0, 168.0]
2025-09-12 01:36:24,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 28/100 (estimated time remaining: 16 hours, 14 minutes, 40 seconds)
2025-09-12 01:49:14,202 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:49:14,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:50:01,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 331.51575 ± 57.890
2025-09-12 01:50:01,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [360.45535, 350.7727, 346.0929, 300.3117, 326.93716, 363.73834, 434.26907, 248.38443, 223.24387, 360.95193]
2025-09-12 01:50:01,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [175.0, 136.0, 140.0, 166.0, 154.0, 171.0, 171.0, 110.0, 145.0, 171.0]
2025-09-12 01:50:01,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 29/100 (estimated time remaining: 16 hours, 5 minutes, 53 seconds)
2025-09-12 02:02:24,498 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:02:24,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:03:10,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 350.38995 ± 69.534
2025-09-12 02:03:10,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [292.11267, 340.6265, 377.91537, 292.75455, 407.52072, 301.28897, 331.6322, 267.55627, 377.2746, 515.2175]
2025-09-12 02:03:10,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [132.0, 137.0, 184.0, 115.0, 161.0, 134.0, 138.0, 130.0, 145.0, 260.0]
2025-09-12 02:03:10,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (350.39) for latency ExtremeClogL1U23
2025-09-12 02:03:10,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 30/100 (estimated time remaining: 15 hours, 48 minutes, 16 seconds)
2025-09-12 02:15:47,580 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:15:47,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:16:19,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 213.13533 ± 134.757
2025-09-12 02:16:19,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [296.2509, 259.32693, 313.00687, 335.84183, 301.57294, 0.43481594, 21.259058, 7.225115, 317.78363, 278.6512]
2025-09-12 02:16:19,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [139.0, 130.0, 167.0, 148.0, 139.0, 12.0, 50.0, 17.0, 151.0, 127.0]
2025-09-12 02:16:19,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 31/100 (estimated time remaining: 15 hours, 35 minutes, 11 seconds)
2025-09-12 02:28:56,990 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:28:56,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:29:36,458 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 285.94330 ± 107.047
2025-09-12 02:29:36,458 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [203.95984, 365.21384, -0.12788394, 318.3604, 295.39395, 382.78653, 367.14703, 282.4225, 320.7914, 323.48532]
2025-09-12 02:29:36,458 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [108.0, 162.0, 12.0, 153.0, 132.0, 165.0, 167.0, 149.0, 130.0, 132.0]
2025-09-12 02:29:36,481 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 32/100 (estimated time remaining: 15 hours, 17 minutes, 41 seconds)
2025-09-12 02:42:17,756 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:42:17,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:42:56,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 284.02063 ± 159.932
2025-09-12 02:42:56,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [337.13477, 355.501, 299.92453, 283.798, 3.6080582, 297.57062, 360.885, 1.7731575, 331.4486, 568.5628]
2025-09-12 02:42:56,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [156.0, 130.0, 135.0, 123.0, 30.0, 129.0, 147.0, 17.0, 168.0, 245.0]
2025-09-12 02:42:56,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 33/100 (estimated time remaining: 15 hours, 4 minutes, 50 seconds)
2025-09-12 02:55:26,007 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:55:26,009 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:56:03,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 300.23935 ± 47.227
2025-09-12 02:56:03,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [278.18005, 315.69653, 378.00867, 288.35394, 359.14288, 336.81787, 252.87308, 313.102, 261.79846, 218.41985]
2025-09-12 02:56:03,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [122.0, 151.0, 137.0, 119.0, 133.0, 144.0, 107.0, 128.0, 103.0, 99.0]
2025-09-12 02:56:03,367 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 34/100 (estimated time remaining: 14 hours, 44 minutes, 54 seconds)
2025-09-12 03:08:34,210 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:08:34,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:09:07,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 240.19572 ± 130.077
2025-09-12 03:09:07,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [307.48947, 435.9262, 307.97708, 252.29608, 223.23413, 0.7976387, 6.267357, 326.34564, 269.71442, 271.90924]
2025-09-12 03:09:07,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [115.0, 274.0, 122.0, 108.0, 96.0, 24.0, 20.0, 126.0, 110.0, 107.0]
2025-09-12 03:09:07,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 35/100 (estimated time remaining: 14 hours, 30 minutes, 25 seconds)
2025-09-12 03:21:44,686 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:21:44,689 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:22:27,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 344.76401 ± 105.133
2025-09-12 03:22:27,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [308.31784, 348.3291, 329.21246, 309.85764, 391.58133, 261.18634, 258.507, 635.4532, 341.6764, 263.51917]
2025-09-12 03:22:27,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [139.0, 158.0, 132.0, 117.0, 175.0, 123.0, 102.0, 231.0, 139.0, 109.0]
2025-09-12 03:22:27,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 36/100 (estimated time remaining: 14 hours, 19 minutes, 48 seconds)
2025-09-12 03:35:03,668 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:35:03,670 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:35:39,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 274.25626 ± 144.058
2025-09-12 03:35:39,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [347.2495, 295.8619, 265.09875, 15.578113, 349.38028, 278.45657, 4.1026464, 481.36966, 373.66592, 331.79904]
2025-09-12 03:35:39,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [163.0, 119.0, 105.0, 30.0, 150.0, 118.0, 15.0, 183.0, 153.0, 146.0]
2025-09-12 03:35:39,521 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 37/100 (estimated time remaining: 14 hours, 5 minutes, 26 seconds)
2025-09-12 03:48:22,444 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:48:22,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:48:59,169 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 288.22568 ± 182.380
2025-09-12 03:48:59,169 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [261.8674, 0.7218057, -0.8290528, 331.18405, 263.64468, 648.6769, 400.6032, 321.79053, 418.22632, 236.37094]
2025-09-12 03:48:59,169 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [114.0, 11.0, 19.0, 124.0, 105.0, 272.0, 184.0, 136.0, 161.0, 100.0]
2025-09-12 03:48:59,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 38/100 (estimated time remaining: 13 hours, 52 minutes, 16 seconds)
2025-09-12 04:01:39,990 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:01:39,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:02:14,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 279.67609 ± 112.683
2025-09-12 04:02:14,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [248.30486, 328.01483, 371.76163, 412.56662, 399.77545, 229.7615, 202.1818, 239.70677, 17.952015, 346.7353]
2025-09-12 04:02:14,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [104.0, 120.0, 140.0, 151.0, 152.0, 96.0, 90.0, 102.0, 31.0, 148.0]
2025-09-12 04:02:14,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 39/100 (estimated time remaining: 13 hours, 40 minutes, 37 seconds)
2025-09-12 04:14:55,429 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:14:55,431 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:15:27,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 246.38708 ± 88.225
2025-09-12 04:15:27,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [237.32787, 304.29376, 275.0278, 259.9077, 2.4571211, 211.16023, 265.4083, 293.33737, 271.23605, 343.7147]
2025-09-12 04:15:27,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [109.0, 124.0, 116.0, 101.0, 17.0, 100.0, 124.0, 121.0, 112.0, 133.0]
2025-09-12 04:15:27,138 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 40/100 (estimated time remaining: 13 hours, 29 minutes, 14 seconds)
2025-09-12 04:28:03,799 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:28:03,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:28:43,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 299.09702 ± 198.667
2025-09-12 04:28:43,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [245.88086, 661.2146, 222.47069, 5.5531797, 495.6317, 18.128798, 411.14224, 236.06516, 225.139, 469.74396]
2025-09-12 04:28:43,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [110.0, 266.0, 99.0, 20.0, 206.0, 27.0, 184.0, 109.0, 126.0, 185.0]
2025-09-12 04:28:43,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 41/100 (estimated time remaining: 13 hours, 15 minutes, 9 seconds)
2025-09-12 04:41:33,801 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:41:33,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:42:17,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 376.68036 ± 120.651
2025-09-12 04:42:17,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [385.451, 280.59348, 308.1981, 399.28632, 420.37195, 707.0652, 352.10184, 257.3659, 343.18478, 313.18518]
2025-09-12 04:42:17,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [143.0, 112.0, 138.0, 144.0, 155.0, 284.0, 148.0, 105.0, 131.0, 120.0]
2025-09-12 04:42:17,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (376.68) for latency ExtremeClogL1U23
2025-09-12 04:42:17,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 42/100 (estimated time remaining: 13 hours, 6 minutes, 17 seconds)
2025-09-12 04:54:40,320 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:54:40,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:55:20,558 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 333.57205 ± 71.293
2025-09-12 04:55:20,558 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [294.28165, 280.98697, 279.31204, 391.50818, 466.4721, 449.3648, 300.7336, 301.80002, 250.53961, 320.72134]
2025-09-12 04:55:20,558 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [131.0, 117.0, 119.0, 157.0, 183.0, 171.0, 112.0, 131.0, 107.0, 142.0]
2025-09-12 04:55:20,570 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 43/100 (estimated time remaining: 12 hours, 49 minutes, 43 seconds)
2025-09-12 05:08:07,889 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:08:07,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:08:47,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 319.46802 ± 135.758
2025-09-12 05:08:47,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [302.00262, 0.21792443, 323.7917, 316.44046, 534.381, 317.5994, 370.43323, 312.73065, 235.73894, 481.344]
2025-09-12 05:08:47,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [128.0, 13.0, 137.0, 177.0, 192.0, 118.0, 147.0, 128.0, 120.0, 166.0]
2025-09-12 05:08:47,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 44/100 (estimated time remaining: 12 hours, 38 minutes, 41 seconds)
2025-09-12 05:21:25,954 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:21:25,956 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:21:59,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 261.76877 ± 141.795
2025-09-12 05:21:59,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [0.34410766, 294.25595, 425.7084, 3.455028, 282.5011, 274.53976, 310.6585, 388.90622, 397.51715, 239.8014]
2025-09-12 05:21:59,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [14.0, 113.0, 174.0, 21.0, 105.0, 123.0, 122.0, 159.0, 163.0, 107.0]
2025-09-12 05:21:59,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 45/100 (estimated time remaining: 12 hours, 25 minutes, 12 seconds)
2025-09-12 05:34:37,819 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:34:37,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:35:09,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 244.22388 ± 134.548
2025-09-12 05:35:09,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [272.71408, 309.65134, 263.39984, 249.16835, 275.4222, 375.55206, 449.40753, 238.45811, 3.5526764, 4.9126506]
2025-09-12 05:35:09,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [116.0, 166.0, 108.0, 115.0, 108.0, 129.0, 170.0, 103.0, 14.0, 19.0]
2025-09-12 05:35:09,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 46/100 (estimated time remaining: 12 hours, 10 minutes, 46 seconds)
2025-09-12 05:47:47,089 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:47:47,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:48:27,754 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 349.27545 ± 88.665
2025-09-12 05:48:27,754 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [404.1396, 411.52902, 501.2849, 290.15268, 474.20242, 311.13153, 320.5671, 213.3743, 278.94983, 287.4231]
2025-09-12 05:48:27,754 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [153.0, 147.0, 198.0, 114.0, 184.0, 109.0, 120.0, 107.0, 109.0, 102.0]
2025-09-12 05:48:27,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 47/100 (estimated time remaining: 11 hours, 54 minutes, 38 seconds)
2025-09-12 06:01:11,469 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:01:11,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:01:42,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 260.88312 ± 201.364
2025-09-12 06:01:42,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [645.3031, 275.7175, 269.7866, 462.6969, 378.27957, 3.927813, 1.7520232, 287.04773, -0.7319433, 285.052]
2025-09-12 06:01:42,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [227.0, 100.0, 105.0, 163.0, 131.0, 16.0, 23.0, 122.0, 18.0, 114.0]
2025-09-12 06:01:42,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 48/100 (estimated time remaining: 11 hours, 43 minutes, 24 seconds)
2025-09-12 06:14:18,521 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:14:18,524 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:15:00,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 333.76859 ± 69.637
2025-09-12 06:15:00,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [315.5672, 274.50143, 283.44214, 330.24594, 290.29514, 428.43695, 296.79443, 254.11607, 477.8874, 386.399]
2025-09-12 06:15:00,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [129.0, 117.0, 114.0, 133.0, 147.0, 163.0, 130.0, 118.0, 177.0, 169.0]
2025-09-12 06:15:00,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 49/100 (estimated time remaining: 11 hours, 28 minutes, 38 seconds)
2025-09-12 06:27:48,909 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:27:48,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:28:33,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 371.43204 ± 68.179
2025-09-12 06:28:33,137 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [300.65988, 383.46832, 282.25482, 325.57776, 355.6644, 413.8406, 297.66193, 470.84726, 398.04504, 486.3002]
2025-09-12 06:28:33,137 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [117.0, 155.0, 112.0, 132.0, 137.0, 161.0, 125.0, 191.0, 157.0, 179.0]
2025-09-12 06:28:33,148 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 50/100 (estimated time remaining: 11 hours, 18 minutes, 57 seconds)
2025-09-12 06:41:06,142 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:41:06,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:41:49,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 377.62097 ± 177.421
2025-09-12 06:41:49,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [308.31863, 467.10825, 290.49054, 293.00543, 655.9886, 640.9384, 383.5523, 398.00748, 6.2566943, 332.54337]
2025-09-12 06:41:49,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [130.0, 165.0, 115.0, 111.0, 226.0, 238.0, 132.0, 162.0, 21.0, 145.0]
2025-09-12 06:41:49,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (377.62) for latency ExtremeClogL1U23
2025-09-12 06:41:49,243 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 51/100 (estimated time remaining: 11 hours, 6 minutes, 35 seconds)
2025-09-12 06:54:43,491 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:43,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:55:23,730 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 317.65884 ± 198.906
2025-09-12 06:55:23,730 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [202.05244, 276.5257, 488.88715, 510.43906, 555.1861, 2.0826359, 3.9795368, 330.61115, 253.71591, 553.10895]
2025-09-12 06:55:23,730 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [196.0, 102.0, 229.0, 173.0, 185.0, 12.0, 15.0, 116.0, 103.0, 200.0]
2025-09-12 06:55:23,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 55 minutes, 56 seconds)
2025-09-12 07:08:14,450 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:08:14,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:09:02,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 424.47232 ± 127.797
2025-09-12 07:09:02,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [500.37848, 281.56436, 503.51624, 489.40945, 278.05588, 466.85764, 329.08975, 709.3293, 344.7504, 341.77158]
2025-09-12 07:09:02,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [195.0, 110.0, 183.0, 161.0, 117.0, 174.0, 129.0, 275.0, 132.0, 128.0]
2025-09-12 07:09:02,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (424.47) for latency ExtremeClogL1U23
2025-09-12 07:09:02,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 53/100 (estimated time remaining: 10 hours, 46 minutes, 24 seconds)
2025-09-12 07:21:21,962 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:21:21,964 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:22:08,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 388.04465 ± 128.946
2025-09-12 07:22:08,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [249.57504, 344.5648, 523.46564, 310.86227, 397.8163, 293.45718, 369.553, 713.4077, 318.51364, 359.23087]
2025-09-12 07:22:08,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [108.0, 128.0, 232.0, 146.0, 130.0, 120.0, 157.0, 248.0, 117.0, 149.0]
2025-09-12 07:22:08,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 54/100 (estimated time remaining: 10 hours, 31 minutes, 7 seconds)
2025-09-12 07:35:03,102 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:35:03,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:35:48,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 415.43140 ± 158.983
2025-09-12 07:35:48,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2.8668191, 398.97372, 608.08655, 561.80756, 433.48148, 408.95358, 504.29922, 511.7324, 377.8162, 346.29657]
2025-09-12 07:35:48,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [19.0, 151.0, 204.0, 192.0, 166.0, 152.0, 179.0, 181.0, 145.0, 125.0]
2025-09-12 07:35:48,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 55/100 (estimated time remaining: 10 hours, 18 minutes, 42 seconds)
2025-09-12 07:48:13,832 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:48:13,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:49:00,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 411.90314 ± 174.787
2025-09-12 07:49:00,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [370.99738, 423.52313, 10.642666, 652.6713, 302.44424, 498.4237, 658.0445, 407.20786, 351.54413, 443.53268]
2025-09-12 07:49:00,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [137.0, 168.0, 45.0, 223.0, 121.0, 170.0, 226.0, 152.0, 136.0, 157.0]
2025-09-12 07:49:00,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 56/100 (estimated time remaining: 10 hours, 4 minutes, 37 seconds)
2025-09-12 08:01:50,708 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:01:50,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:02:42,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 467.43344 ± 117.435
2025-09-12 08:02:42,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [705.84576, 395.03763, 475.95645, 567.75305, 479.3258, 541.74457, 371.22604, 482.15475, 403.39105, 251.8996]
2025-09-12 08:02:42,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [279.0, 140.0, 167.0, 214.0, 166.0, 181.0, 135.0, 186.0, 147.0, 126.0]
2025-09-12 08:02:42,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (467.43) for latency ExtremeClogL1U23
2025-09-12 08:02:42,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 52 minutes, 21 seconds)
2025-09-12 08:15:34,951 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:15:34,961 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:16:32,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 514.79443 ± 163.088
2025-09-12 08:16:32,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [476.1285, 466.744, 375.64893, 929.0314, 573.27985, 451.3521, 669.6194, 382.82208, 435.25006, 388.0683]
2025-09-12 08:16:32,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [177.0, 165.0, 166.0, 335.0, 207.0, 156.0, 259.0, 146.0, 159.0, 132.0]
2025-09-12 08:16:32,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (514.79) for latency ExtremeClogL1U23
2025-09-12 08:16:32,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 40 minutes, 34 seconds)
2025-09-12 08:28:52,101 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:28:52,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:29:56,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 592.30103 ± 312.288
2025-09-12 08:29:56,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1.5432166, 524.14014, 906.08215, 906.24347, 573.9068, 356.28833, 579.34216, 525.5468, 1160.4374, 389.48004]
2025-09-12 08:29:56,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [16.0, 231.0, 284.0, 295.0, 215.0, 162.0, 197.0, 174.0, 422.0, 137.0]
2025-09-12 08:29:56,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (592.30) for latency ExtremeClogL1U23
2025-09-12 08:29:56,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 59/100 (estimated time remaining: 9 hours, 29 minutes, 28 seconds)
2025-09-12 08:42:39,125 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:42:39,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:43:53,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 677.13641 ± 145.872
2025-09-12 08:43:53,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [540.51465, 697.32886, 570.37756, 698.6905, 479.0865, 763.4216, 758.6316, 619.4919, 617.82983, 1025.9913]
2025-09-12 08:43:53,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [188.0, 268.0, 221.0, 211.0, 182.0, 271.0, 274.0, 214.0, 277.0, 375.0]
2025-09-12 08:43:53,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (677.14) for latency ExtremeClogL1U23
2025-09-12 08:43:53,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 60/100 (estimated time remaining: 9 hours, 18 minutes, 21 seconds)
2025-09-12 08:56:34,371 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:56:34,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:57:36,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 614.70740 ± 306.421
2025-09-12 08:57:36,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [755.79865, 1175.6486, 540.2255, 571.7547, 700.8109, 817.3401, 289.79767, 463.26498, 3.1716502, 829.26074]
2025-09-12 08:57:36,531 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [245.0, 351.0, 186.0, 186.0, 212.0, 277.0, 111.0, 161.0, 20.0, 301.0]
2025-09-12 08:57:36,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 61/100 (estimated time remaining: 9 hours, 8 minutes, 51 seconds)
2025-09-12 09:10:16,698 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:10:16,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:11:06,134 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 460.99277 ± 512.822
2025-09-12 09:11:06,134 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [516.1935, 11.53833, 324.80493, 1132.341, 1609.3921, -1.2077013, 5.5193167, 480.13217, 2.0491345, 529.16504]
2025-09-12 09:11:06,134 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [171.0, 23.0, 130.0, 362.0, 502.0, 25.0, 20.0, 196.0, 19.0, 184.0]
2025-09-12 09:11:06,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 62/100 (estimated time remaining: 8 hours, 53 minutes, 28 seconds)
2025-09-12 09:23:51,076 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:23:51,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:24:42,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 458.60312 ± 307.653
2025-09-12 09:24:42,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1095.4089, 349.34137, 2.3780334, 500.10373, 386.24973, 701.86346, -1.8666657, 408.21027, 487.31064, 657.03186]
2025-09-12 09:24:42,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [347.0, 129.0, 17.0, 178.0, 145.0, 238.0, 15.0, 241.0, 161.0, 217.0]
2025-09-12 09:24:42,243 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 63/100 (estimated time remaining: 8 hours, 38 minutes)
2025-09-12 09:37:27,929 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:37:27,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:38:56,689 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 845.96210 ± 602.569
2025-09-12 09:38:56,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1348.4219, 444.97485, 3.351445, 397.832, 1028.0983, 2236.893, 779.6136, 500.7527, 531.962, 1187.7219]
2025-09-12 09:38:56,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [466.0, 156.0, 19.0, 155.0, 370.0, 584.0, 340.0, 182.0, 198.0, 490.0]
2025-09-12 09:38:56,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (845.96) for latency ExtremeClogL1U23
2025-09-12 09:38:56,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 64/100 (estimated time remaining: 8 hours, 30 minutes, 39 seconds)
2025-09-12 09:51:39,401 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:51:39,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:52:24,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 382.78339 ± 332.086
2025-09-12 09:52:24,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [377.077, 379.56476, 6.293522, 4.9479747, 440.4529, 1166.0105, 407.36978, 610.9012, 0.48145792, 434.73477]
2025-09-12 09:52:24,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [137.0, 146.0, 18.0, 19.0, 196.0, 399.0, 140.0, 279.0, 23.0, 161.0]
2025-09-12 09:52:24,925 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 65/100 (estimated time remaining: 8 hours, 13 minutes, 20 seconds)
2025-09-12 10:04:59,663 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:04:59,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:06:19,305 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 814.58752 ± 299.335
2025-09-12 10:06:19,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [987.42163, 462.74747, 631.47516, 1021.8378, 1089.0264, 608.01526, 455.11942, 570.2641, 914.6561, 1405.3118]
2025-09-12 10:06:19,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [294.0, 159.0, 182.0, 356.0, 352.0, 175.0, 243.0, 169.0, 284.0, 410.0]
2025-09-12 10:06:19,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 66/100 (estimated time remaining: 8 hours, 59 seconds)
2025-09-12 10:19:08,675 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:19:08,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:20:34,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 919.74091 ± 614.194
2025-09-12 10:20:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [584.2604, 1812.5541, 9.523709, 2173.6567, 435.1858, 818.8456, 524.1707, 767.5525, 964.01013, 1107.649]
2025-09-12 10:20:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [198.0, 532.0, 22.0, 599.0, 150.0, 263.0, 160.0, 239.0, 324.0, 323.0]
2025-09-12 10:20:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (919.74) for latency ExtremeClogL1U23
2025-09-12 10:20:34,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 67/100 (estimated time remaining: 7 hours, 52 minutes, 23 seconds)
2025-09-12 10:33:22,784 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:33:22,786 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:34:49,687 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 952.07648 ± 385.003
2025-09-12 10:34:49,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1903.1036, 486.7416, 639.67566, 934.9144, 952.83044, 1254.3666, 1128.5223, 738.6442, 685.2832, 796.68286]
2025-09-12 10:34:49,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [578.0, 164.0, 207.0, 297.0, 298.0, 348.0, 307.0, 224.0, 226.0, 260.0]
2025-09-12 10:34:49,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (952.08) for latency ExtremeClogL1U23
2025-09-12 10:34:49,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 68/100 (estimated time remaining: 7 hours, 42 minutes, 49 seconds)
2025-09-12 10:47:58,156 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:47:58,165 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:49:11,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 757.51562 ± 504.015
2025-09-12 10:49:11,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [985.51227, 1040.5519, 1699.6345, 1165.8252, 346.18558, -1.0514116, 18.427008, 962.31915, 744.9646, 612.78687]
2025-09-12 10:49:11,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [299.0, 291.0, 539.0, 335.0, 130.0, 14.0, 34.0, 347.0, 231.0, 219.0]
2025-09-12 10:49:11,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 69/100 (estimated time remaining: 7 hours, 29 minutes, 33 seconds)
2025-09-12 11:01:37,976 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:01:37,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:03:20,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1040.10449 ± 666.104
2025-09-12 11:03:20,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [764.5006, 314.17194, 1600.257, 3.6607745, 1520.398, 1406.0997, 700.7485, 509.3252, 1263.2197, 2318.6636]
2025-09-12 11:03:20,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [241.0, 159.0, 508.0, 17.0, 467.0, 460.0, 216.0, 211.0, 392.0, 727.0]
2025-09-12 11:03:20,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (1040.10) for latency ExtremeClogL1U23
2025-09-12 11:03:20,453 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 70/100 (estimated time remaining: 7 hours, 19 minutes, 44 seconds)
2025-09-12 11:16:21,645 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:16:21,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:18:09,975 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1194.57727 ± 498.874
2025-09-12 11:18:09,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [930.5335, 1680.9653, 1334.7783, 2029.9592, 241.59142, 746.6176, 1498.1909, 957.5841, 1577.7687, 947.7836]
2025-09-12 11:18:09,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [283.0, 477.0, 404.0, 576.0, 131.0, 261.0, 417.0, 279.0, 474.0, 294.0]
2025-09-12 11:18:09,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (1194.58) for latency ExtremeClogL1U23
2025-09-12 11:18:09,989 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 11 minutes, 4 seconds)
2025-09-12 11:30:16,000 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:30:16,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:32:27,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1434.10352 ± 816.002
2025-09-12 11:32:27,265 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [767.4287, 689.0978, 2034.8488, 605.98285, 834.7101, 1755.4973, 1812.9658, 997.61426, 1455.4951, 3387.395]
2025-09-12 11:32:27,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [260.0, 192.0, 578.0, 194.0, 259.0, 557.0, 559.0, 322.0, 465.0, 1000.0]
2025-09-12 11:32:27,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (1434.10) for latency ExtremeClogL1U23
2025-09-12 11:32:27,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 56 minutes, 55 seconds)
2025-09-12 11:45:31,668 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:45:31,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:49:37,048 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2660.94482 ± 977.536
2025-09-12 11:49:37,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2405.8667, 3298.0747, 1757.7655, 3156.7473, 3167.7207, 3201.7712, 3672.011, 3354.8071, 2358.3237, 236.35811]
2025-09-12 11:49:37,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [737.0, 1000.0, 529.0, 1000.0, 947.0, 1000.0, 1000.0, 1000.0, 729.0, 100.0]
2025-09-12 11:49:37,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (2660.94) for latency ExtremeClogL1U23
2025-09-12 11:49:37,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 73/100 (estimated time remaining: 6 hours, 58 minutes, 49 seconds)
2025-09-12 12:02:35,958 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:02:35,963 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:05:08,407 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1487.38635 ± 995.343
2025-09-12 12:05:08,407 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1355.6567, 2798.7393, 1586.8231, 2965.3577, 438.6524, 1068.438, 2909.285, 340.77103, 440.60858, 969.53217]
2025-09-12 12:05:08,407 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [438.0, 1000.0, 495.0, 869.0, 200.0, 398.0, 1000.0, 151.0, 170.0, 343.0]
2025-09-12 12:05:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 50 minutes, 8 seconds)
2025-09-12 12:17:26,992 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:17:26,996 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:20:20,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1817.35095 ± 958.849
2025-09-12 12:20:20,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1792.7007, 1116.8076, 2884.301, 399.9861, 1681.4857, 2437.454, 1357.043, 3025.1077, 3056.4155, 422.20853]
2025-09-12 12:20:20,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [502.0, 336.0, 1000.0, 145.0, 608.0, 703.0, 382.0, 924.0, 1000.0, 145.0]
2025-09-12 12:20:20,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 40 minutes, 23 seconds)
2025-09-12 12:32:48,511 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:32:48,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:36:46,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2599.73389 ± 967.425
2025-09-12 12:36:46,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3317.3982, 3310.7925, 3288.1472, 1721.9562, 3313.111, 873.2365, 3157.3254, 2870.063, 3226.937, 918.3714]
2025-09-12 12:36:46,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 546.0, 1000.0, 274.0, 1000.0, 892.0, 1000.0, 276.0]
2025-09-12 12:36:46,868 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 33 minutes, 4 seconds)
2025-09-12 12:49:42,164 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:49:42,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:53:01,866 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2275.85229 ± 1226.844
2025-09-12 12:53:01,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [627.02277, 2300.1755, 2995.0273, 15.160423, 3522.868, 3484.9275, 2408.7764, 928.8635, 3285.0386, 3190.6633]
2025-09-12 12:53:01,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [197.0, 657.0, 835.0, 31.0, 1000.0, 1000.0, 634.0, 305.0, 1000.0, 1000.0]
2025-09-12 12:53:01,891 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 26 minutes, 46 seconds)
2025-09-12 13:06:08,566 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:06:08,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:08:29,535 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1510.21045 ± 1294.854
2025-09-12 13:08:29,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3048.1719, 756.33777, 631.79065, 3347.7827, 1106.776, 2582.1917, 3162.208, 459.72614, -0.31520727, 7.4355826]
2025-09-12 13:08:29,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [970.0, 244.0, 208.0, 994.0, 321.0, 783.0, 1000.0, 156.0, 17.0, 20.0]
2025-09-12 13:08:29,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 78/100 (estimated time remaining: 6 hours, 2 minutes, 49 seconds)
2025-09-12 13:21:05,782 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:21:05,786 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:23:21,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1386.23828 ± 999.482
2025-09-12 13:23:21,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [502.10437, 2359.482, 1753.3445, 917.3987, 2164.868, 848.1849, 0.9692002, 28.767145, 2851.7576, 2435.5059]
2025-09-12 13:23:21,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [187.0, 753.0, 557.0, 298.0, 701.0, 252.0, 14.0, 60.0, 910.0, 758.0]
2025-09-12 13:23:21,417 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 44 minutes, 9 seconds)
2025-09-12 13:36:05,065 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:36:05,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:39:11,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2110.76782 ± 905.192
2025-09-12 13:39:11,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1573.592, 3104.767, 1229.5729, 1625.1176, 646.4283, 2029.7863, 2089.7922, 3580.0522, 1876.8434, 3351.724]
2025-09-12 13:39:11,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [473.0, 919.0, 348.0, 474.0, 236.0, 579.0, 653.0, 1000.0, 558.0, 1000.0]
2025-09-12 13:39:11,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 31 minutes, 10 seconds)
2025-09-12 13:51:40,090 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:51:40,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:55:17,583 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2403.18799 ± 1149.920
2025-09-12 13:55:17,585 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3258.5815, 3257.7673, 461.085, 1720.2202, 198.41096, 3648.332, 2596.5146, 2766.5142, 3253.887, 2870.567]
2025-09-12 13:55:17,585 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 163.0, 506.0, 94.0, 1000.0, 700.0, 784.0, 1000.0, 883.0]
2025-09-12 13:55:17,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 14 minutes, 2 seconds)
2025-09-12 14:09:01,121 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:09:01,126 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:12:52,636 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2509.52417 ± 905.960
2025-09-12 14:12:52,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1281.8231, 3037.273, 1336.113, 1650.8547, 3345.857, 3268.1538, 1370.5411, 3341.6357, 3247.5347, 3215.4546]
2025-09-12 14:12:52,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [437.0, 885.0, 412.0, 550.0, 1000.0, 968.0, 452.0, 1000.0, 1000.0, 1000.0]
2025-09-12 14:12:52,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 82/100 (estimated time remaining: 5 hours, 3 minutes, 24 seconds)
2025-09-12 14:25:15,186 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:25:15,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:28:05,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1908.72168 ± 1459.385
2025-09-12 14:28:05,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3572.066, 2104.6414, -5.646149, 300.83298, 3450.4626, 1.5159833, 617.24054, 3399.7424, 2246.5974, 3399.763]
2025-09-12 14:28:05,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 577.0, 20.0, 135.0, 1000.0, 18.0, 227.0, 1000.0, 624.0, 999.0]
2025-09-12 14:28:05,325 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 46 minutes, 32 seconds)
2025-09-12 14:40:11,548 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:40:11,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:44:11,969 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2596.25317 ± 986.053
2025-09-12 14:44:11,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3118.5205, 3210.1592, 948.4512, 3349.96, 1648.0516, 3178.311, 3112.8928, 3276.9248, 3333.6006, 785.6611]
2025-09-12 14:44:11,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 280.0, 1000.0, 533.0, 944.0, 1000.0, 1000.0, 1000.0, 273.0]
2025-09-12 14:44:11,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 34 minutes, 51 seconds)
2025-09-12 14:57:35,301 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:57:35,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:00:42,599 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2250.05103 ± 1318.832
2025-09-12 15:00:42,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [638.33124, 605.798, 3311.7283, 355.42633, 3866.0996, 3157.6326, 1633.5609, 3782.1619, 3282.4998, 1867.2708]
2025-09-12 15:00:42,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [198.0, 189.0, 1000.0, 131.0, 1000.0, 824.0, 466.0, 1000.0, 896.0, 508.0]
2025-09-12 15:00:42,620 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 20 minutes, 51 seconds)
2025-09-12 15:12:52,751 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:12:52,758 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:15:55,258 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2042.55505 ± 1481.615
2025-09-12 15:15:55,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3379.4917, 3447.063, 187.83017, 1631.7172, 1451.3795, 3582.8008, 3415.3674, 3326.1772, 1.1112416, 2.612473]
2025-09-12 15:15:55,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 97.0, 509.0, 431.0, 1000.0, 1000.0, 1000.0, 20.0, 13.0]
2025-09-12 15:15:55,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 86/100 (estimated time remaining: 4 hours, 1 minute, 53 seconds)
2025-09-12 15:28:24,588 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:28:24,593 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:32:30,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2756.07080 ± 1238.931
2025-09-12 15:32:30,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3314.8958, 3497.4119, 3422.901, 3226.632, 0.8027141, 3511.2498, 3242.663, 3411.781, 597.7614, 3334.6091]
2025-09-12 15:32:30,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 970.0, 1000.0, 960.0, 19.0, 1000.0, 1000.0, 998.0, 191.0, 1000.0]
2025-09-12 15:32:30,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (2756.07) for latency ExtremeClogL1U23
2025-09-12 15:32:30,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 42 minutes, 56 seconds)
2025-09-12 15:46:14,740 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:46:14,747 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:50:38,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 3054.74854 ± 722.921
2025-09-12 15:50:38,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2061.371, 1387.5132, 3562.0842, 3339.2222, 3451.3567, 3415.8752, 3655.724, 2719.687, 3438.147, 3516.5046]
2025-09-12 15:50:38,868 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [597.0, 416.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 766.0, 1000.0, 1000.0]
2025-09-12 15:50:38,868 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (3054.75) for latency ExtremeClogL1U23
2025-09-12 15:50:38,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 34 minutes, 39 seconds)
2025-09-12 16:02:28,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:02:28,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:06:16,327 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2589.16577 ± 1331.535
2025-09-12 16:06:16,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1777.7898, 3294.8613, 0.15531178, 3664.203, 3317.9773, 3436.07, 237.80226, 3259.403, 3438.0396, 3465.3574]
2025-09-12 16:06:16,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [499.0, 1000.0, 12.0, 1000.0, 1000.0, 1000.0, 98.0, 1000.0, 1000.0, 1000.0]
2025-09-12 16:06:16,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 16 minutes, 58 seconds)
2025-09-12 16:19:32,445 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:19:32,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:21:43,318 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1561.57520 ± 1225.324
2025-09-12 16:21:43,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [248.8733, 2787.8684, 1925.388, 538.0947, 3942.9893, 1427.855, -2.1553993, 2820.858, 670.86176, 1255.1187]
2025-09-12 16:21:43,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [109.0, 709.0, 529.0, 190.0, 1000.0, 404.0, 23.0, 778.0, 199.0, 354.0]
2025-09-12 16:21:43,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 58 minutes, 13 seconds)
2025-09-12 16:34:39,376 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:34:39,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:38:12,812 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2449.74512 ± 1332.176
2025-09-12 16:38:12,813 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [1214.0402, 4.66033, 3491.6804, 3459.1042, 3361.6438, 3544.4304, 510.18414, 3467.7266, 1960.1555, 3483.8254]
2025-09-12 16:38:12,813 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [361.0, 18.0, 1000.0, 1000.0, 1000.0, 1000.0, 195.0, 1000.0, 568.0, 1000.0]
2025-09-12 16:38:12,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 44 minutes, 35 seconds)
2025-09-12 16:51:09,304 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:51:09,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:54:50,419 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2550.75879 ± 1186.249
2025-09-12 16:54:50,437 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3308.019, 2986.345, 3638.1694, 3550.6882, 6.923268, 1301.4426, 2600.998, 3529.009, 3293.5217, 1292.4722]
2025-09-12 16:54:50,437 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [942.0, 847.0, 1000.0, 1000.0, 28.0, 405.0, 724.0, 1000.0, 1000.0, 395.0]
2025-09-12 16:54:50,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 28 minutes, 12 seconds)
2025-09-12 17:07:27,329 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:07:27,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:09:53,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 1738.82642 ± 1581.560
2025-09-12 17:09:53,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3645.6533, 234.3373, 3722.2498, 3111.8142, 2157.7976, 927.6193, 2.6060276, -2.387756, 3584.3635, 4.2100987]
2025-09-12 17:09:53,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 85.0, 1000.0, 849.0, 629.0, 287.0, 15.0, 18.0, 1000.0, 15.0]
2025-09-12 17:09:54,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 6 minutes, 48 seconds)
2025-09-12 17:22:28,580 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:22:28,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:25:36,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2277.93799 ± 1154.736
2025-09-12 17:25:36,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3796.255, 572.022, 1751.9684, 2456.3672, 1081.8586, 3904.3582, 2612.4421, 2185.2512, 3551.9119, 866.94385]
2025-09-12 17:25:36,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 190.0, 462.0, 641.0, 314.0, 1000.0, 719.0, 577.0, 1000.0, 272.0]
2025-09-12 17:25:36,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 51 minutes, 4 seconds)
2025-09-12 17:37:47,651 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:37:47,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:41:47,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2812.11963 ± 1298.040
2025-09-12 17:41:47,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [-0.99310946, 3442.6172, 3691.7876, 3270.493, 3612.4714, 2932.7383, 3676.0068, 3410.7869, 3560.8706, 524.4157]
2025-09-12 17:41:47,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [18.0, 1000.0, 1000.0, 1000.0, 1000.0, 841.0, 1000.0, 1000.0, 1000.0, 190.0]
2025-09-12 17:41:47,960 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 36 minutes, 5 seconds)
2025-09-12 17:54:26,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:54:26,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:58:25,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2804.58545 ± 1200.206
2025-09-12 17:58:25,751 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2901.1787, 3443.1362, 1080.3407, 3649.173, 3463.6816, 3630.4287, 3724.7703, 3500.0933, 2646.6235, 6.4317236]
2025-09-12 17:58:25,751 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [815.0, 1000.0, 311.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 781.0, 20.0]
2025-09-12 17:58:25,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 20 minutes, 12 seconds)
2025-09-12 18:12:08,383 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:12:08,395 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:16:02,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2757.80322 ± 1216.960
2025-09-12 18:16:02,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3157.5793, 3608.1167, 3280.095, 8.573694, 1113.5055, 2028.7976, 3451.399, 3777.628, 3588.1328, 3564.2036]
2025-09-12 18:16:02,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [866.0, 1000.0, 1000.0, 24.0, 318.0, 579.0, 956.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:16:02,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 4 minutes, 57 seconds)
2025-09-12 18:27:49,032 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:27:49,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:32:15,732 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 3136.32178 ± 803.755
2025-09-12 18:32:15,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2834.0906, 3574.2488, 3434.063, 3384.2017, 3282.9604, 3415.8252, 3759.748, 3341.9238, 822.446, 3513.7102]
2025-09-12 18:32:15,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [801.0, 1000.0, 1000.0, 1000.0, 864.0, 1000.0, 1000.0, 931.0, 258.0, 1000.0]
2025-09-12 18:32:15,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (3136.32) for latency ExtremeClogL1U23
2025-09-12 18:32:15,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 98/100 (estimated time remaining: 49 minutes, 25 seconds)
2025-09-12 18:45:36,529 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:45:36,534 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:48:45,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2196.84814 ± 1506.872
2025-09-12 18:48:45,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3631.9316, 3373.9275, 2.856638, 5.4516325, 3622.6555, 2013.1586, 1848.6227, 3566.001, 3604.7153, 299.15927]
2025-09-12 18:48:45,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 17.0, 31.0, 1000.0, 580.0, 531.0, 1000.0, 983.0, 132.0]
2025-09-12 18:48:45,230 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 99/100 (estimated time remaining: 33 minutes, 15 seconds)
2025-09-12 19:01:13,794 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:01:13,797 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:05:35,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 3234.58838 ± 623.276
2025-09-12 19:05:35,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [2522.2144, 3720.1638, 3733.8835, 2262.9287, 2658.6855, 3733.655, 3711.4546, 2474.882, 3730.9207, 3797.0967]
2025-09-12 19:05:35,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [676.0, 1000.0, 1000.0, 645.0, 709.0, 1000.0, 1000.0, 638.0, 1000.0, 1000.0]
2025-09-12 19:05:35,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1226 [INFO]: New best (3234.59) for latency ExtremeClogL1U23
2025-09-12 19:05:35,546 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1199 [INFO]: Iteration 100/100 (estimated time remaining: 16 minutes, 45 seconds)
2025-09-12 19:17:47,811 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:17:47,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:21:28,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1221 [DEBUG]: Total Reward: 2568.66748 ± 1351.026
2025-09-12 19:21:28,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1222 [DEBUG]: All rewards: [3556.8484, 3602.2002, 54.720917, 3266.4863, 325.56586, 3787.8638, 3629.2498, 1760.1069, 2164.7073, 3538.926]
2025-09-12 19:21:28,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 46.0, 959.0, 125.0, 1000.0, 1000.0, 508.0, 613.0, 1000.0]
2025-09-12 19:21:28,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-walker2d):1251 [DEBUG]: Training session finished
