2025-09-11 19:43:47,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc15-walker2d/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 19:43:47,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc15-walker2d/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 19:43:47,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x151515284550>}
2025-09-11 19:43:47,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1111 [DEBUG]: using device: cuda
2025-09-11 19:43:47,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1133 [INFO]: Creating new trainer
2025-09-11 19:43:47,292 baseline-mbpac-noiseperc15-walker2d:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=384, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 19:43:47,292 baseline-mbpac-noiseperc15-walker2d:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=23, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 19:43:47,299 baseline-mbpac-noiseperc15-walker2d:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=384, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=17, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=17, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=384, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=6, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 384, batch_first=True)
)
2025-09-11 19:43:48,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1194 [DEBUG]: Starting training session...
2025-09-11 19:43:48,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 1/100
2025-09-11 19:53:54,571 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:53:54,573 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:55:54,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 421.46738 ± 329.972
2025-09-11 19:55:54,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [243.85356, 238.8467, 880.38464, 91.57953, 913.4563, 185.45497, 921.6854, 40.393383, 327.79144, 371.22824]
2025-09-11 19:55:54,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [147.0, 144.0, 1000.0, 252.0, 1000.0, 109.0, 837.0, 138.0, 214.0, 521.0]
2025-09-11 19:55:54,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (421.47) for latency ExtremeClogL1U23
2025-09-11 19:55:54,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 2/100 (estimated time remaining: 19 hours, 57 minutes, 40 seconds)
2025-09-11 20:07:20,123 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:07:20,134 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:07:50,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 35.69270 ± 68.018
2025-09-11 20:07:50,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [16.21782, -18.96604, -28.720228, 53.562496, 66.45215, 4.660353, 36.519344, 221.12788, 3.254751, 2.8184278]
2025-09-11 20:07:50,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [153.0, 95.0, 88.0, 86.0, 154.0, 87.0, 83.0, 134.0, 140.0, 100.0]
2025-09-11 20:07:50,739 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 3/100 (estimated time remaining: 19 hours, 38 minutes, 2 seconds)
2025-09-11 20:19:09,185 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:19:09,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:19:46,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 52.55369 ± 98.585
2025-09-11 20:19:46,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [194.60289, -36.9331, 11.88995, 5.315061, 44.67957, 7.854898, 17.675547, 53.483826, 278.4351, -51.46686]
2025-09-11 20:19:46,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [107.0, 131.0, 144.0, 72.0, 122.0, 146.0, 160.0, 161.0, 184.0, 150.0]
2025-09-11 20:19:46,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 4/100 (estimated time remaining: 19 hours, 22 minutes, 55 seconds)
2025-09-11 20:31:13,345 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:31:13,353 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:32:14,741 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 232.79147 ± 179.567
2025-09-11 20:32:14,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [540.2559, 233.45447, 105.973434, 20.3041, 370.46384, 499.90195, 34.314377, 271.47995, 46.43439, 205.33263]
2025-09-11 20:32:14,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [449.0, 150.0, 161.0, 78.0, 265.0, 544.0, 116.0, 149.0, 134.0, 172.0]
2025-09-11 20:32:14,760 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 5/100 (estimated time remaining: 19 hours, 22 minutes, 36 seconds)
2025-09-11 20:43:44,572 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:43:44,574 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:44:49,619 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 326.26059 ± 144.199
2025-09-11 20:44:49,620 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [352.95834, 408.476, 227.98976, 87.0659, 223.90419, 549.0838, 245.64319, 486.7421, 483.2182, 197.52444]
2025-09-11 20:44:49,620 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [190.0, 290.0, 181.0, 195.0, 144.0, 436.0, 178.0, 330.0, 362.0, 120.0]
2025-09-11 20:44:49,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 6/100 (estimated time remaining: 19 hours, 19 minutes, 26 seconds)
2025-09-11 20:56:06,105 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:56:06,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:56:40,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 207.37236 ± 108.772
2025-09-11 20:56:40,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [4.876104, 287.39597, 236.46384, 306.94534, 45.7437, 278.15057, 296.49075, 168.9186, 126.86852, 321.87033]
2025-09-11 20:56:40,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [17.0, 187.0, 141.0, 169.0, 56.0, 155.0, 176.0, 92.0, 84.0, 181.0]
2025-09-11 20:56:40,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 7/100 (estimated time remaining: 19 hours, 2 minutes, 26 seconds)
2025-09-11 21:08:10,284 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:08:10,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:09:01,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 315.50049 ± 207.109
2025-09-11 21:09:01,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [263.9532, 759.0983, 448.67218, 2.7057896, 253.45932, 516.2234, 268.61658, 57.765133, 285.6961, 298.81506]
2025-09-11 21:09:01,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [147.0, 496.0, 247.0, 23.0, 169.0, 246.0, 149.0, 79.0, 171.0, 154.0]
2025-09-11 21:09:01,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 8/100 (estimated time remaining: 18 hours, 58 minutes, 1 second)
2025-09-11 21:20:28,696 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:20:28,698 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:21:17,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 281.56552 ± 193.474
2025-09-11 21:21:17,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [497.882, 127.342834, -0.19007893, 327.54813, 292.90906, 560.94543, 441.09277, 82.68301, 437.7099, 47.732204]
2025-09-11 21:21:17,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [272.0, 102.0, 20.0, 232.0, 195.0, 321.0, 257.0, 129.0, 219.0, 65.0]
2025-09-11 21:21:17,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 9/100 (estimated time remaining: 18 hours, 51 minutes, 55 seconds)
2025-09-11 21:32:47,779 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:32:47,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:33:39,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 384.27866 ± 137.530
2025-09-11 21:33:39,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [338.35944, 582.5103, 549.0193, 345.6111, 395.57523, 259.95142, 235.52286, 165.84067, 403.62424, 566.77216]
2025-09-11 21:33:39,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [168.0, 282.0, 248.0, 184.0, 196.0, 149.0, 123.0, 108.0, 181.0, 236.0]
2025-09-11 21:33:39,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 10/100 (estimated time remaining: 18 hours, 37 minutes, 40 seconds)
2025-09-11 21:44:59,704 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:44:59,712 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:45:51,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 355.78766 ± 56.055
2025-09-11 21:45:51,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [321.1118, 373.1194, 277.0287, 453.10437, 256.55417, 355.31967, 409.1871, 393.38828, 366.7689, 352.29446]
2025-09-11 21:45:51,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [166.0, 238.0, 149.0, 209.0, 129.0, 179.0, 196.0, 181.0, 223.0, 193.0]
2025-09-11 21:45:51,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 11/100 (estimated time remaining: 18 hours, 18 minutes, 25 seconds)
2025-09-11 21:57:11,102 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:57:11,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:57:59,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 295.35800 ± 118.909
2025-09-11 21:57:59,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [275.21286, 318.86505, 320.1245, 328.8795, 56.88583, 101.86491, 343.36063, 452.10077, 334.50095, 421.78528]
2025-09-11 21:57:59,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [266.0, 171.0, 150.0, 182.0, 113.0, 106.0, 209.0, 182.0, 166.0, 250.0]
2025-09-11 21:57:59,455 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 12/100 (estimated time remaining: 18 hours, 11 minutes, 30 seconds)
2025-09-11 22:09:31,198 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:09:31,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:10:10,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 305.83221 ± 121.109
2025-09-11 22:10:10,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [330.80988, 3.6609564, 515.43365, 386.1077, 345.85898, 278.21112, 330.1291, 290.64972, 305.9892, 271.47174]
2025-09-11 22:10:10,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [151.0, 13.0, 204.0, 168.0, 164.0, 118.0, 158.0, 145.0, 221.0, 111.0]
2025-09-11 22:10:10,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 13/100 (estimated time remaining: 17 hours, 56 minutes, 5 seconds)
2025-09-11 22:21:37,664 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:21:37,667 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:22:30,135 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 416.85651 ± 108.782
2025-09-11 22:22:30,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [353.71152, 453.96353, 342.5264, 600.9553, 308.12952, 347.49008, 333.3396, 544.51154, 314.71402, 569.2236]
2025-09-11 22:22:30,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [186.0, 169.0, 161.0, 242.0, 142.0, 163.0, 170.0, 307.0, 147.0, 262.0]
2025-09-11 22:22:30,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 14/100 (estimated time remaining: 17 hours, 45 minutes, 7 seconds)
2025-09-11 22:33:53,115 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:33:53,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:34:38,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 297.39746 ± 188.466
2025-09-11 22:34:38,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [489.43192, 72.3368, 409.1555, 421.4663, 310.64578, 488.83856, 304.89124, 470.50137, 0.955927, 5.751249]
2025-09-11 22:34:38,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [326.0, 54.0, 170.0, 200.0, 191.0, 306.0, 127.0, 270.0, 14.0, 18.0]
2025-09-11 22:34:38,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 15/100 (estimated time remaining: 17 hours, 28 minutes, 57 seconds)
2025-09-11 22:46:12,859 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:46:12,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:47:22,510 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 388.54147 ± 244.174
2025-09-11 22:47:22,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [325.4933, 501.50467, 263.75113, 449.31488, 216.31516, 498.32108, 58.26738, 1009.1702, 255.3903, 307.88678]
2025-09-11 22:47:22,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [157.0, 242.0, 145.0, 200.0, 116.0, 408.0, 73.0, 897.0, 139.0, 201.0]
2025-09-11 22:47:22,525 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 16/100 (estimated time remaining: 17 hours, 25 minutes, 54 seconds)
2025-09-11 22:58:47,824 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:58:47,827 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:59:18,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 235.87283 ± 207.478
2025-09-11 22:59:18,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [106.77289, 4.6938825, 4.1651683, 418.56778, 43.846455, 6.274632, 384.8658, 501.27548, 490.62347, 397.64252]
2025-09-11 22:59:18,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [76.0, 17.0, 16.0, 214.0, 44.0, 22.0, 153.0, 209.0, 216.0, 175.0]
2025-09-11 22:59:18,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 17/100 (estimated time remaining: 17 hours, 10 minutes, 10 seconds)
2025-09-11 23:10:38,327 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:10:38,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:11:48,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 428.89081 ± 232.644
2025-09-11 23:11:48,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [966.0013, 390.09094, 420.42206, 311.42725, 367.57355, 29.680313, 263.3928, 397.6945, 519.7823, 622.84326]
2025-09-11 23:11:48,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [656.0, 216.0, 279.0, 217.0, 232.0, 73.0, 140.0, 232.0, 280.0, 268.0]
2025-09-11 23:11:48,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (428.89) for latency ExtremeClogL1U23
2025-09-11 23:11:48,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 18/100 (estimated time remaining: 17 hours, 3 minutes, 8 seconds)
2025-09-11 23:23:17,859 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:23:17,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:24:26,163 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 515.57471 ± 236.654
2025-09-11 23:24:26,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [358.15762, 249.07083, 411.716, 612.80725, 940.5686, 423.6848, 550.794, 942.84845, 285.5604, 380.53882]
2025-09-11 23:24:26,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [164.0, 111.0, 190.0, 233.0, 527.0, 234.0, 208.0, 568.0, 155.0, 131.0]
2025-09-11 23:24:26,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (515.57) for latency ExtremeClogL1U23
2025-09-11 23:24:26,195 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 19/100 (estimated time remaining: 16 hours, 55 minutes, 43 seconds)
2025-09-11 23:36:08,985 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:36:09,003 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:37:02,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 441.11768 ± 202.711
2025-09-11 23:37:02,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [382.9882, 539.6063, 265.0618, 314.83252, 417.0567, 732.24915, 28.42484, 446.10257, 726.5927, 558.26196]
2025-09-11 23:37:02,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [176.0, 229.0, 124.0, 163.0, 195.0, 342.0, 36.0, 188.0, 315.0, 232.0]
2025-09-11 23:37:02,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 20/100 (estimated time remaining: 16 hours, 50 minutes, 54 seconds)
2025-09-11 23:48:09,291 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:48:09,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:48:51,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 327.24054 ± 96.612
2025-09-11 23:48:51,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [223.3956, 461.5352, 323.61417, 455.83087, 456.062, 330.6626, 276.02078, 167.81227, 302.12656, 275.3451]
2025-09-11 23:48:51,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [116.0, 203.0, 153.0, 226.0, 194.0, 151.0, 144.0, 104.0, 130.0, 124.0]
2025-09-11 23:48:51,152 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 23 minutes, 38 seconds)
2025-09-12 00:00:21,666 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:00:21,669 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:01:27,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 527.47083 ± 274.975
2025-09-12 00:01:27,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [414.92935, 1146.048, 641.2096, 361.9896, 414.35287, 573.3303, 392.42892, 641.0248, 666.9132, 22.481377]
2025-09-12 00:01:27,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [174.0, 594.0, 245.0, 177.0, 157.0, 282.0, 200.0, 287.0, 294.0, 35.0]
2025-09-12 00:01:27,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (527.47) for latency ExtremeClogL1U23
2025-09-12 00:01:27,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 22/100 (estimated time remaining: 16 hours, 22 minutes, 1 second)
2025-09-12 00:12:53,035 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:12:53,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:13:43,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 424.54037 ± 176.759
2025-09-12 00:13:43,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [367.07193, 438.6698, 478.69553, 283.36267, 565.6249, 696.5141, 10.932355, 560.7796, 457.5866, 386.16635]
2025-09-12 00:13:43,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [160.0, 192.0, 201.0, 113.0, 233.0, 322.0, 19.0, 214.0, 212.0, 184.0]
2025-09-12 00:13:43,251 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 23/100 (estimated time remaining: 16 hours, 5 minutes, 52 seconds)
2025-09-12 00:25:24,577 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:25:24,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:26:16,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 440.07001 ± 150.735
2025-09-12 00:26:16,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [666.675, 384.90393, 217.09944, 442.11267, 605.6179, 436.05374, 457.40433, 627.9922, 211.70747, 351.13367]
2025-09-12 00:26:16,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [235.0, 194.0, 108.0, 215.0, 231.0, 186.0, 193.0, 266.0, 102.0, 170.0]
2025-09-12 00:26:16,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 24/100 (estimated time remaining: 15 hours, 52 minutes, 11 seconds)
2025-09-12 00:37:37,770 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:37:37,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:39:25,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 910.65979 ± 424.428
2025-09-12 00:39:25,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [945.9844, 946.24554, 563.2733, 1043.3456, 1570.328, 1255.178, 1513.4698, 561.66534, 373.94046, 333.16653]
2025-09-12 00:39:25,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [429.0, 416.0, 299.0, 402.0, 666.0, 553.0, 562.0, 304.0, 201.0, 137.0]
2025-09-12 00:39:25,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (910.66) for latency ExtremeClogL1U23
2025-09-12 00:39:25,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 25/100 (estimated time remaining: 15 hours, 48 minutes, 14 seconds)
2025-09-12 00:50:53,012 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:50:53,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:51:45,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 442.07388 ± 97.314
2025-09-12 00:51:45,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [425.7617, 319.98026, 669.771, 395.8858, 420.0033, 414.47836, 341.3945, 495.9473, 538.8851, 398.6316]
2025-09-12 00:51:45,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [199.0, 143.0, 235.0, 181.0, 184.0, 185.0, 174.0, 215.0, 241.0, 200.0]
2025-09-12 00:51:45,899 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 26/100 (estimated time remaining: 15 hours, 43 minutes, 41 seconds)
2025-09-12 01:03:11,130 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:03:11,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:04:06,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 440.52881 ± 288.443
2025-09-12 01:04:06,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [325.47443, 803.411, 570.41943, 908.828, 374.71158, 697.7038, 230.43166, 54.576435, 7.799737, 431.93225]
2025-09-12 01:04:06,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [122.0, 311.0, 307.0, 371.0, 219.0, 325.0, 126.0, 80.0, 30.0, 158.0]
2025-09-12 01:04:06,180 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 27/100 (estimated time remaining: 15 hours, 27 minutes, 3 seconds)
2025-09-12 01:15:27,674 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:15:27,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:16:57,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 779.33606 ± 579.675
2025-09-12 01:16:57,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [789.77716, 570.12897, 492.40128, 912.85016, 2381.67, 436.7844, 103.14429, 546.1683, 680.7666, 879.669]
2025-09-12 01:16:57,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [314.0, 278.0, 196.0, 356.0, 902.0, 217.0, 166.0, 217.0, 321.0, 372.0]
2025-09-12 01:16:57,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 28/100 (estimated time remaining: 15 hours, 23 minutes, 16 seconds)
2025-09-12 01:28:38,237 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:28:38,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:30:02,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 757.32953 ± 328.572
2025-09-12 01:30:02,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [396.06033, 847.9588, 934.1402, 776.5524, 627.0028, 839.53253, 912.7346, 972.7534, 1259.4421, 7.117794]
2025-09-12 01:30:02,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [237.0, 320.0, 371.0, 306.0, 284.0, 341.0, 409.0, 379.0, 489.0, 19.0]
2025-09-12 01:30:02,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 29/100 (estimated time remaining: 15 hours, 18 minutes, 27 seconds)
2025-09-12 01:41:04,003 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:41:04,009 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:42:52,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1034.38184 ± 603.986
2025-09-12 01:42:52,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [314.12424, 975.00464, 1586.9661, 316.95792, 1911.7672, 1931.8932, 1058.4181, 478.09595, 451.80096, 1318.7914]
2025-09-12 01:42:52,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [145.0, 380.0, 580.0, 145.0, 678.0, 740.0, 436.0, 232.0, 200.0, 485.0]
2025-09-12 01:42:52,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (1034.38) for latency ExtremeClogL1U23
2025-09-12 01:42:52,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 30/100 (estimated time remaining: 15 hours, 57 seconds)
2025-09-12 01:54:22,368 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:54:22,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:56:51,797 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1492.44751 ± 728.814
2025-09-12 01:56:51,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [832.77106, 627.1211, 1690.9672, 2204.3535, 1067.0298, 1298.1221, 2787.2031, 946.1417, 938.54694, 2532.2197]
2025-09-12 01:56:51,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [296.0, 217.0, 686.0, 782.0, 394.0, 443.0, 971.0, 352.0, 391.0, 1000.0]
2025-09-12 01:56:51,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (1492.45) for latency ExtremeClogL1U23
2025-09-12 01:56:51,810 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 31/100 (estimated time remaining: 15 hours, 11 minutes, 22 seconds)
2025-09-12 02:09:04,043 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:09:04,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:11:46,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1629.32019 ± 740.901
2025-09-12 02:11:46,417 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [492.2051, 1593.8462, 1171.5116, 351.39752, 2616.2615, 1728.8558, 2333.3022, 2019.9061, 1515.5668, 2470.3486]
2025-09-12 02:11:46,417 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [256.0, 566.0, 444.0, 149.0, 1000.0, 654.0, 773.0, 687.0, 460.0, 906.0]
2025-09-12 02:11:46,417 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (1629.32) for latency ExtremeClogL1U23
2025-09-12 02:11:46,422 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 32/100 (estimated time remaining: 15 hours, 33 minutes, 51 seconds)
2025-09-12 02:23:04,283 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:23:04,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:25:48,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1880.34961 ± 833.827
2025-09-12 02:25:48,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3179.555, 2573.1558, 1575.0498, 1291.3345, 2528.893, 2493.267, 1192.846, 599.2404, 861.0105, 2509.1436]
2025-09-12 02:25:48,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [972.0, 867.0, 499.0, 446.0, 728.0, 804.0, 366.0, 250.0, 280.0, 809.0]
2025-09-12 02:25:48,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (1880.35) for latency ExtremeClogL1U23
2025-09-12 02:25:48,378 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 33/100 (estimated time remaining: 15 hours, 36 minutes, 19 seconds)
2025-09-12 02:36:40,060 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:36:40,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:40:19,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2597.56494 ± 793.101
2025-09-12 02:40:19,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [1019.90466, 3650.2366, 1777.9962, 1916.75, 3323.4595, 2990.507, 3158.8523, 2149.1528, 2843.7776, 3145.0117]
2025-09-12 02:40:19,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [378.0, 1000.0, 574.0, 591.0, 950.0, 952.0, 1000.0, 681.0, 1000.0, 1000.0]
2025-09-12 02:40:19,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (2597.56) for latency ExtremeClogL1U23
2025-09-12 02:40:19,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 34/100 (estimated time remaining: 15 hours, 41 minutes, 38 seconds)
2025-09-12 02:52:10,194 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:52:10,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:54:55,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1852.84595 ± 1082.132
2025-09-12 02:54:55,226 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [438.01096, 3140.6484, 3221.2444, 915.08685, 382.3301, 2715.41, 2348.8904, 1284.8899, 1104.9874, 2976.961]
2025-09-12 02:54:55,226 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [155.0, 1000.0, 1000.0, 337.0, 169.0, 805.0, 852.0, 448.0, 383.0, 928.0]
2025-09-12 02:54:55,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 35/100 (estimated time remaining: 15 hours, 50 minutes, 59 seconds)
2025-09-12 03:05:43,789 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:05:43,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:07:42,591 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1292.37305 ± 1320.161
2025-09-12 03:07:42,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [398.329, 3327.664, 220.46626, 4.658867, 3046.4932, 758.6081, 6.033677, 1951.982, 184.78415, 3024.7107]
2025-09-12 03:07:42,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [163.0, 993.0, 111.0, 15.0, 1000.0, 288.0, 20.0, 623.0, 116.0, 1000.0]
2025-09-12 03:07:42,599 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 36/100 (estimated time remaining: 15 hours, 21 minutes)
2025-09-12 03:19:55,592 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:19:55,597 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:23:02,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2216.97510 ± 1080.932
2025-09-12 03:23:02,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3108.9001, 1365.2694, 3361.7004, 799.94244, 3391.646, 2666.5054, 2365.027, 3279.068, 312.91324, 1518.7783]
2025-09-12 03:23:02,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 469.0, 1000.0, 326.0, 1000.0, 784.0, 679.0, 1000.0, 141.0, 510.0]
2025-09-12 03:23:02,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 37/100 (estimated time remaining: 15 hours, 12 minutes, 19 seconds)
2025-09-12 03:33:38,250 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:33:38,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:36:36,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2088.01978 ± 1495.391
2025-09-12 03:36:36,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3345.0398, 201.03914, 3105.7961, 3369.146, 3180.2195, 200.30449, 3449.1567, 3354.1968, 657.1186, 18.180159]
2025-09-12 03:36:36,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 94.0, 1000.0, 1000.0, 1000.0, 96.0, 1000.0, 1000.0, 236.0, 27.0]
2025-09-12 03:36:36,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 38/100 (estimated time remaining: 14 hours, 52 minutes, 9 seconds)
2025-09-12 03:48:24,812 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:48:24,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:52:11,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2712.98682 ± 892.540
2025-09-12 03:52:11,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3250.4783, 518.7123, 3381.3052, 3344.7979, 3176.7874, 1920.8933, 2016.8417, 3194.8086, 3325.5244, 2999.7205]
2025-09-12 03:52:11,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 248.0, 1000.0, 1000.0, 1000.0, 663.0, 608.0, 1000.0, 964.0, 891.0]
2025-09-12 03:52:11,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (2712.99) for latency ExtremeClogL1U23
2025-09-12 03:52:11,739 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 39/100 (estimated time remaining: 14 hours, 51 minutes, 14 seconds)
2025-09-12 04:03:23,817 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:03:23,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:05:37,245 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1578.75122 ± 1132.148
2025-09-12 04:05:37,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3182.1343, 1297.8657, 173.99059, 1578.4062, 207.72955, 1336.2842, 3538.4326, 1156.1523, 603.64154, 2712.8745]
2025-09-12 04:05:37,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 439.0, 85.0, 523.0, 102.0, 443.0, 1000.0, 353.0, 217.0, 769.0]
2025-09-12 04:05:37,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 40/100 (estimated time remaining: 14 hours, 22 minutes, 32 seconds)
2025-09-12 04:17:36,667 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:17:36,671 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:19:53,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1611.90161 ± 1478.805
2025-09-12 04:19:53,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3576.8562, 879.09875, 2021.6542, 11.734063, 9.483282, 2645.1958, 3445.6602, 8.089207, 3361.8115, 159.4335]
2025-09-12 04:19:53,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 319.0, 629.0, 32.0, 22.0, 832.0, 1000.0, 34.0, 1000.0, 91.0]
2025-09-12 04:19:53,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 41/100 (estimated time remaining: 14 hours, 26 minutes, 8 seconds)
2025-09-12 04:30:57,094 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:30:57,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:33:54,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2153.06958 ± 702.701
2025-09-12 04:33:54,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2457.961, 2834.466, 1842.3702, 2708.9329, 2599.7668, 1328.4584, 1731.671, 1141.1776, 3371.93, 1513.9615]
2025-09-12 04:33:54,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [751.0, 873.0, 555.0, 804.0, 743.0, 425.0, 562.0, 359.0, 1000.0, 505.0]
2025-09-12 04:33:54,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 42/100 (estimated time remaining: 13 hours, 56 minutes, 8 seconds)
2025-09-12 04:45:21,158 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:45:21,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:48:31,459 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2241.11230 ± 1175.650
2025-09-12 04:48:31,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2846.2285, 2975.352, 3112.9636, 178.37181, 3331.277, 831.5483, 3270.7742, 3160.2651, 2139.708, 564.63434]
2025-09-12 04:48:31,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [823.0, 965.0, 1000.0, 102.0, 1000.0, 286.0, 943.0, 926.0, 713.0, 234.0]
2025-09-12 04:48:31,469 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 43/100 (estimated time remaining: 13 hours, 54 minutes, 10 seconds)
2025-09-12 04:59:44,179 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:59:44,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:02:13,655 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1796.81213 ± 1357.201
2025-09-12 05:02:13,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2530.444, 4.6638346, 3283.206, 712.2892, 3318.2434, 744.3186, 1036.7743, 3528.937, 2814.6455, -5.4023337]
2025-09-12 05:02:13,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [717.0, 17.0, 1000.0, 229.0, 1000.0, 264.0, 334.0, 1000.0, 872.0, 20.0]
2025-09-12 05:02:13,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 44/100 (estimated time remaining: 13 hours, 18 minutes, 22 seconds)
2025-09-12 05:13:47,378 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:13:47,384 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:16:49,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2252.00244 ± 1339.076
2025-09-12 05:16:49,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3406.97, 3388.457, 3543.269, 1784.1055, 15.134771, 3542.169, 2656.4104, 194.25925, 940.81824, 3048.4343]
2025-09-12 05:16:49,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [951.0, 1000.0, 1000.0, 574.0, 40.0, 1000.0, 786.0, 94.0, 308.0, 863.0]
2025-09-12 05:16:49,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 45/100 (estimated time remaining: 13 hours, 17 minutes, 26 seconds)
2025-09-12 05:28:28,873 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:28:28,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:31:27,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2302.57080 ± 1400.454
2025-09-12 05:31:27,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3535.3203, 3550.3542, 3709.4177, 1517.2438, 3711.5957, 326.28278, 237.9969, 2014.2661, 3569.3596, 853.8708]
2025-09-12 05:31:27,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 993.0, 459.0, 1000.0, 126.0, 114.0, 603.0, 1000.0, 296.0]
2025-09-12 05:31:27,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 46/100 (estimated time remaining: 13 hours, 7 minutes, 18 seconds)
2025-09-12 05:43:12,829 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:43:12,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:46:55,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2776.11670 ± 1180.255
2025-09-12 05:46:55,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3456.8687, 3494.6492, 3621.9045, 3340.0413, 3360.715, 3244.1348, 387.60986, 2906.7832, 499.77176, 3448.6912]
2025-09-12 05:46:55,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 150.0, 849.0, 200.0, 1000.0]
2025-09-12 05:46:55,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (2776.12) for latency ExtremeClogL1U23
2025-09-12 05:46:55,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 47/100 (estimated time remaining: 13 hours, 8 minutes, 32 seconds)
2025-09-12 05:57:28,536 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:57:28,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:00:48,789 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2493.20312 ± 1420.989
2025-09-12 06:00:48,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3423.899, 175.83865, 3416.8271, 3598.9749, -0.27993074, 3448.204, 943.2639, 2823.8044, 3582.4487, 3519.0505]
2025-09-12 06:00:48,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 102.0, 1000.0, 1000.0, 23.0, 995.0, 391.0, 833.0, 1000.0, 1000.0]
2025-09-12 06:00:48,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 48/100 (estimated time remaining: 12 hours, 46 minutes, 15 seconds)
2025-09-12 06:13:05,815 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:13:05,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:15:26,915 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1782.35022 ± 1458.194
2025-09-12 06:15:26,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3692.8481, 203.82324, 3167.2036, 2675.3423, 72.332565, 2025.03, 2490.1938, 46.811264, 3443.0256, 6.893229]
2025-09-12 06:15:26,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 97.0, 887.0, 751.0, 50.0, 616.0, 707.0, 44.0, 1000.0, 17.0]
2025-09-12 06:15:26,962 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 49/100 (estimated time remaining: 12 hours, 41 minutes, 30 seconds)
2025-09-12 06:26:10,106 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:26:10,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:29:11,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2238.63428 ± 1453.447
2025-09-12 06:29:11,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [1767.4579, 7.300048, 3138.0522, 3450.885, 257.19598, 170.49028, 3421.2822, 3535.9346, 3202.8413, 3434.9043]
2025-09-12 06:29:11,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [546.0, 21.0, 888.0, 964.0, 111.0, 76.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 06:29:11,178 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 50/100 (estimated time remaining: 12 hours, 18 minutes, 7 seconds)
2025-09-12 06:40:33,333 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:40:33,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:43:16,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 1959.14771 ± 1532.982
2025-09-12 06:43:16,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2187.3662, 3365.6545, 3398.9224, 3463.458, 3335.4102, 4.7268686, 3309.9084, 191.36694, 10.143698, 324.51917]
2025-09-12 06:43:16,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [639.0, 1000.0, 1000.0, 1000.0, 1000.0, 28.0, 1000.0, 92.0, 20.0, 153.0]
2025-09-12 06:43:16,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 51/100 (estimated time remaining: 11 hours, 58 minutes, 3 seconds)
2025-09-12 06:54:25,944 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:25,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:58:19,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3079.88647 ± 762.288
2025-09-12 06:58:19,578 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3469.6587, 2950.314, 3388.0725, 3666.062, 2134.168, 3453.6287, 1185.845, 3506.002, 3456.9985, 3588.1165]
2025-09-12 06:58:19,578 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 817.0, 1000.0, 1000.0, 604.0, 1000.0, 376.0, 969.0, 1000.0, 1000.0]
2025-09-12 06:58:19,578 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3079.89) for latency ExtremeClogL1U23
2025-09-12 06:58:19,584 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 52/100 (estimated time remaining: 11 hours, 39 minutes, 46 seconds)
2025-09-12 07:10:20,547 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:10:20,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:13:25,934 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2378.40576 ± 1467.643
2025-09-12 07:13:25,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3503.0579, 518.0977, 3608.5938, 3483.5752, 3431.465, 3566.1672, 162.79436, 1799.032, 3553.9473, 157.32903]
2025-09-12 07:13:25,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 182.0, 1000.0, 1000.0, 1000.0, 1000.0, 93.0, 566.0, 1000.0, 65.0]
2025-09-12 07:13:25,943 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 53/100 (estimated time remaining: 11 hours, 37 minutes, 8 seconds)
2025-09-12 07:24:49,575 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:24:49,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:28:08,226 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2621.62427 ± 1315.205
2025-09-12 07:28:08,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3540.2405, 254.7802, 3593.2725, 2152.8584, 3402.3298, 3604.4744, 3086.667, 3581.7273, 2992.9458, 6.944798]
2025-09-12 07:28:08,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 108.0, 1000.0, 616.0, 1000.0, 1000.0, 849.0, 1000.0, 807.0, 17.0]
2025-09-12 07:28:08,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 54/100 (estimated time remaining: 11 hours, 23 minutes, 15 seconds)
2025-09-12 07:39:39,195 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:39:39,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:43:40,517 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3178.15576 ± 733.557
2025-09-12 07:43:40,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3546.2505, 1358.8619, 3686.835, 3663.7693, 3344.2346, 3706.2178, 3164.8037, 2240.4255, 3520.6855, 3549.474]
2025-09-12 07:43:40,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 410.0, 1000.0, 1000.0, 1000.0, 1000.0, 864.0, 644.0, 1000.0, 1000.0]
2025-09-12 07:43:40,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3178.16) for latency ExtremeClogL1U23
2025-09-12 07:43:40,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 55/100 (estimated time remaining: 11 hours, 25 minutes, 18 seconds)
2025-09-12 07:54:23,533 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:54:23,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:58:51,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3468.96631 ± 153.327
2025-09-12 07:58:51,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3500.8745, 3711.2468, 3395.0876, 3450.4968, 3429.4097, 3402.0547, 3524.794, 3105.3787, 3560.4832, 3609.8403]
2025-09-12 07:58:51,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 826.0, 1000.0, 1000.0]
2025-09-12 07:58:51,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3468.97) for latency ExtremeClogL1U23
2025-09-12 07:58:51,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 56/100 (estimated time remaining: 11 hours, 20 minutes, 22 seconds)
2025-09-12 08:10:13,117 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:10:13,129 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:13:28,345 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2473.64209 ± 1315.688
2025-09-12 08:13:28,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2229.8562, 3231.5442, 3428.5386, 3507.678, 3503.8408, 1370.3181, 3577.7957, 172.33266, 253.86014, 3460.6558]
2025-09-12 08:13:28,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [664.0, 1000.0, 1000.0, 1000.0, 1000.0, 418.0, 1000.0, 79.0, 125.0, 1000.0]
2025-09-12 08:13:28,370 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 1 minute, 17 seconds)
2025-09-12 08:24:40,330 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:24:40,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:29:10,921 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3569.18091 ± 80.387
2025-09-12 08:29:10,923 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3612.0286, 3548.8958, 3683.5579, 3602.827, 3480.0203, 3449.176, 3637.7688, 3614.1917, 3441.492, 3621.8533]
2025-09-12 08:29:10,923 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 08:29:10,923 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3569.18) for latency ExtremeClogL1U23
2025-09-12 08:29:10,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 58/100 (estimated time remaining: 10 hours, 51 minutes, 26 seconds)
2025-09-12 08:41:11,989 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:41:11,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:45:35,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3412.75073 ± 471.220
2025-09-12 08:45:35,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3512.9395, 3548.1487, 3491.962, 3475.8071, 3662.0168, 3542.1912, 3629.6353, 3574.7422, 3677.223, 2012.8398]
2025-09-12 08:45:35,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 614.0]
2025-09-12 08:45:35,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 59/100 (estimated time remaining: 10 hours, 50 minutes, 35 seconds)
2025-09-12 08:56:51,171 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:56:51,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:59:38,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2291.19727 ± 1745.388
2025-09-12 08:59:38,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3829.0813, 300.81863, 3672.3335, 3704.5579, 48.59982, 3780.5227, 252.072, 25.209555, 3613.6758, 3685.1023]
2025-09-12 08:59:38,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 131.0, 1000.0, 1000.0, 45.0, 1000.0, 107.0, 41.0, 1000.0, 1000.0]
2025-09-12 08:59:38,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 60/100 (estimated time remaining: 10 hours, 22 minutes, 56 seconds)
2025-09-12 09:10:30,135 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:10:30,140 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:13:04,606 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2050.38013 ± 1725.753
2025-09-12 09:13:04,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [248.3845, 3900.4993, 6.727099, 3691.4019, 3693.2673, 3771.008, 3507.625, 18.755089, 5.957266, 1660.176]
2025-09-12 09:13:04,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [107.0, 1000.0, 19.0, 1000.0, 1000.0, 1000.0, 1000.0, 52.0, 38.0, 474.0]
2025-09-12 09:13:04,648 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 61/100 (estimated time remaining: 9 hours, 53 minutes, 42 seconds)
2025-09-12 09:24:39,406 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:24:39,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:27:47,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2536.43433 ± 1485.485
2025-09-12 09:27:47,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3648.5579, 1386.2723, 3757.2983, 2.591111, 3669.3477, 3820.5142, 1652.5698, 3713.5408, 3526.14, 187.51137]
2025-09-12 09:27:47,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 392.0, 1000.0, 15.0, 1000.0, 1000.0, 483.0, 1000.0, 1000.0, 100.0]
2025-09-12 09:27:47,756 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 62/100 (estimated time remaining: 9 hours, 39 minutes, 43 seconds)
2025-09-12 09:38:56,669 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:38:56,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:42:39,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3000.91968 ± 1120.462
2025-09-12 09:42:39,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3672.5369, 3632.5713, 3773.3564, 433.8564, 3749.6313, 3622.522, 3712.5867, 1578.2322, 3633.2822, 2200.6228]
2025-09-12 09:42:39,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 172.0, 1000.0, 1000.0, 1000.0, 451.0, 1000.0, 644.0]
2025-09-12 09:42:39,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 63/100 (estimated time remaining: 9 hours, 18 minutes, 22 seconds)
2025-09-12 09:54:40,086 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:54:40,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:57:53,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2704.09204 ± 1456.975
2025-09-12 09:57:53,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3878.116, 229.98578, 1608.8823, 431.17404, 3895.3364, 3802.0054, 3910.3706, 3871.0464, 1726.8939, 3687.1072]
2025-09-12 09:57:53,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 100.0, 460.0, 155.0, 1000.0, 1000.0, 1000.0, 1000.0, 480.0, 933.0]
2025-09-12 09:57:53,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 64/100 (estimated time remaining: 8 hours, 55 minutes)
2025-09-12 10:09:23,523 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:09:23,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:13:23,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3317.33398 ± 1050.184
2025-09-12 10:13:23,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [539.0944, 3862.2844, 3988.1724, 2219.17, 3790.9707, 3867.07, 3927.9995, 3829.223, 3752.6643, 3396.6907]
2025-09-12 10:13:23,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [195.0, 1000.0, 1000.0, 592.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 941.0]
2025-09-12 10:13:23,053 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 65/100 (estimated time remaining: 8 hours, 50 minutes, 55 seconds)
2025-09-12 10:24:11,788 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:24:11,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:27:02,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2356.41797 ± 1615.611
2025-09-12 10:27:02,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2789.623, 3644.4915, 3801.605, 9.858036, 2090.2166, 3710.2761, 34.4686, 6.7945666, 3848.9482, 3627.8977]
2025-09-12 10:27:02,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [789.0, 1000.0, 1000.0, 21.0, 558.0, 1000.0, 31.0, 19.0, 1000.0, 1000.0]
2025-09-12 10:27:02,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 66/100 (estimated time remaining: 8 hours, 37 minutes, 43 seconds)
2025-09-12 10:37:56,755 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:37:56,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:41:57,488 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3302.13818 ± 1000.117
2025-09-12 10:41:57,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3748.9524, 3744.4658, 3618.4048, 3051.7668, 3729.041, 3630.455, 363.58347, 3742.341, 3782.4172, 3609.955]
2025-09-12 10:41:57,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 832.0, 1000.0, 1000.0, 126.0, 1000.0, 1000.0, 1000.0]
2025-09-12 10:41:57,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 67/100 (estimated time remaining: 8 hours, 24 minutes, 18 seconds)
2025-09-12 10:53:46,430 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:53:46,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:57:42,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3276.38550 ± 1115.381
2025-09-12 10:57:42,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3743.1548, 3890.2473, 3006.1682, 3729.5627, 3456.0403, 3571.7234, 3823.6182, 3742.7761, 3790.2373, 10.327228]
2025-09-12 10:57:42,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 773.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 22.0]
2025-09-12 10:57:42,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 15 minutes, 20 seconds)
2025-09-12 11:09:34,064 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:09:34,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:13:27,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3264.07764 ± 1223.708
2025-09-12 11:13:27,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3797.8257, 3830.8887, 1585.1279, 3985.8357, 3851.1714, 3730.1892, 3649.6626, 222.48274, 4023.398, 3964.1929]
2025-09-12 11:13:27,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 463.0, 1000.0, 1000.0, 1000.0, 1000.0, 111.0, 1000.0, 1000.0]
2025-09-12 11:13:27,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 3 minutes, 37 seconds)
2025-09-12 11:24:09,715 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:24:09,719 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:28:17,513 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3337.17236 ± 767.930
2025-09-12 11:28:17,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3507.2297, 3686.7546, 3742.8328, 1983.5472, 1653.5187, 3683.7346, 3874.895, 3745.412, 3784.45, 3709.3467]
2025-09-12 11:28:17,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 537.0, 466.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 11:28:17,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 70/100 (estimated time remaining: 7 hours, 44 minutes, 25 seconds)
2025-09-12 11:39:54,700 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:39:54,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:44:09,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3622.24487 ± 640.670
2025-09-12 11:44:09,351 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3911.9111, 1822.663, 3968.1511, 3796.7434, 3805.652, 3965.635, 3937.7393, 3861.0217, 3176.6055, 3976.3276]
2025-09-12 11:44:09,351 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 532.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 833.0, 1000.0]
2025-09-12 11:44:09,351 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3622.24) for latency ExtremeClogL1U23
2025-09-12 11:44:09,374 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 42 minutes, 42 seconds)
2025-09-12 11:55:42,513 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:55:42,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:59:52,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3410.73486 ± 936.248
2025-09-12 11:59:52,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3861.7754, 4093.7478, 3799.2969, 2139.392, 3984.5032, 1083.2162, 3745.12, 3795.4802, 3715.3591, 3889.4565]
2025-09-12 11:59:52,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 606.0, 1000.0, 358.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 11:59:52,105 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 72/100 (estimated time remaining: 7 hours, 31 minutes, 52 seconds)
2025-09-12 12:11:48,561 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:11:48,565 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:15:45,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3353.76953 ± 1153.757
2025-09-12 12:15:45,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3882.047, 81.42446, 3786.3726, 3973.844, 2637.141, 3940.1465, 3791.798, 3716.9595, 4023.3347, 3704.6267]
2025-09-12 12:15:45,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 65.0, 1000.0, 1000.0, 729.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 12:15:45,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 17 minutes, 5 seconds)
2025-09-12 12:26:13,278 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:26:13,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:30:10,345 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3309.60278 ± 1104.266
2025-09-12 12:30:10,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3824.131, 3821.0176, 2602.1558, 3606.9392, 3698.2146, 3729.1868, 182.17471, 3888.0261, 3829.5056, 3914.6772]
2025-09-12 12:30:10,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 713.0, 1000.0, 1000.0, 1000.0, 84.0, 1000.0, 1000.0, 1000.0]
2025-09-12 12:30:10,365 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 54 minutes, 17 seconds)
2025-09-12 12:42:07,009 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:42:07,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:46:14,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3537.01953 ± 700.552
2025-09-12 12:46:14,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3930.576, 2196.9866, 3952.7156, 3809.079, 3187.4768, 4098.8223, 3997.0068, 2247.9995, 3884.9688, 4064.5645]
2025-09-12 12:46:14,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 598.0, 1000.0, 1000.0, 863.0, 1000.0, 1000.0, 617.0, 1000.0, 1000.0]
2025-09-12 12:46:14,868 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 45 minutes, 22 seconds)
2025-09-12 12:56:36,613 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:56:36,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:59:17,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2314.95166 ± 1535.100
2025-09-12 12:59:17,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [8.3274355, 7.465607, 1534.0226, 4003.1816, 1037.5529, 2775.1975, 3879.1255, 4012.966, 2082.4597, 3809.216]
2025-09-12 12:59:17,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [22.0, 22.0, 421.0, 1000.0, 320.0, 716.0, 1000.0, 1000.0, 560.0, 1000.0]
2025-09-12 12:59:17,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 15 minutes, 39 seconds)
2025-09-12 13:10:59,671 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:10:59,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:14:52,437 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3301.46094 ± 1116.960
2025-09-12 13:14:52,438 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3893.3691, 3705.1538, 3711.5457, 225.20605, 3812.7046, 3878.1714, 3826.4368, 3867.3582, 3758.7349, 2335.9292]
2025-09-12 13:14:52,438 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 100.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 638.0]
2025-09-12 13:14:52,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 1 second)
2025-09-12 13:26:29,373 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:26:29,377 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:30:56,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3927.90112 ± 92.904
2025-09-12 13:30:56,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3813.3342, 3978.3357, 3983.7458, 4024.2642, 3903.5356, 3803.0251, 3987.2402, 3797.9993, 4074.3223, 3913.2085]
2025-09-12 13:30:56,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:30:56,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (3927.90) for latency ExtremeClogL1U23
2025-09-12 13:30:56,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 45 minutes, 50 seconds)
2025-09-12 13:41:25,897 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:41:25,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:45:26,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3378.02856 ± 917.016
2025-09-12 13:45:26,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3953.7002, 3892.0972, 971.0798, 2372.7842, 3986.5254, 3789.6326, 3819.0186, 3665.1047, 3546.5764, 3783.7666]
2025-09-12 13:45:26,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 323.0, 699.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:45:26,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 31 minutes, 11 seconds)
2025-09-12 13:56:44,386 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:56:44,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:00:11,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2898.44800 ± 1503.445
2025-09-12 14:00:11,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3892.8396, 3840.0168, 1321.8318, 266.1123, 3811.2751, 3894.287, 3844.683, 3982.7695, 329.10645, 3801.5576]
2025-09-12 14:00:11,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 388.0, 135.0, 1000.0, 1000.0, 1000.0, 1000.0, 117.0, 1000.0]
2025-09-12 14:00:11,021 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 10 minutes, 31 seconds)
2025-09-12 14:12:09,248 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:12:09,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:16:05,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3300.14771 ± 1056.935
2025-09-12 14:16:05,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [2965.7393, 3906.1719, 3720.3926, 316.6024, 3853.0703, 3908.6804, 3732.9434, 3824.5925, 2906.7893, 3866.4922]
2025-09-12 14:16:05,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [807.0, 1000.0, 1000.0, 128.0, 1000.0, 1000.0, 1000.0, 1000.0, 766.0, 1000.0]
2025-09-12 14:16:05,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 7 minutes, 14 seconds)
2025-09-12 14:27:03,834 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:27:03,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:31:31,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3781.96216 ± 93.435
2025-09-12 14:31:31,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3699.415, 3738.4578, 3781.3052, 3907.5552, 3597.526, 3709.1672, 3818.5334, 3808.038, 3914.0073, 3845.617]
2025-09-12 14:31:31,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 14:31:31,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 51 minutes, 16 seconds)
2025-09-12 14:42:40,755 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:42:40,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:45:45,865 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2481.62158 ± 1547.632
2025-09-12 14:45:45,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3918.9553, 3748.3218, 501.22653, 824.5443, 3684.1113, 3592.2854, 1131.848, 3640.2847, 3766.5452, 8.094221]
2025-09-12 14:45:45,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 172.0, 266.0, 1000.0, 1000.0, 326.0, 1000.0, 1000.0, 22.0]
2025-09-12 14:45:45,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 29 minutes, 22 seconds)
2025-09-12 14:56:55,500 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:56:55,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:59:54,573 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2504.15405 ± 1681.538
2025-09-12 14:59:54,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [1550.1178, 3886.488, 4031.5574, 3764.0752, 211.45255, 3715.5176, 3892.1892, 3742.6782, 31.51298, 215.95175]
2025-09-12 14:59:54,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [425.0, 1000.0, 1000.0, 1000.0, 89.0, 1000.0, 1000.0, 1000.0, 42.0, 153.0]
2025-09-12 14:59:54,612 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 13 minutes, 10 seconds)
2025-09-12 15:11:22,350 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:11:22,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:15:48,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3898.76489 ± 52.905
2025-09-12 15:15:48,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3819.0337, 3894.9258, 3921.6787, 3788.0552, 3908.22, 3936.1873, 3903.6663, 3976.3386, 3905.2532, 3934.2883]
2025-09-12 15:15:48,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:15:48,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 2 minutes, 1 second)
2025-09-12 15:26:39,452 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:26:39,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:30:17,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3104.68994 ± 1472.498
2025-09-12 15:30:17,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3750.3398, 3703.689, 3826.9304, 320.81586, 3991.937, 3880.1125, 3952.9753, 3910.8235, 17.238392, 3692.0374]
2025-09-12 15:30:17,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 127.0, 1000.0, 1000.0, 1000.0, 1000.0, 31.0, 1000.0]
2025-09-12 15:30:17,686 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 42 minutes, 35 seconds)
2025-09-12 15:41:46,833 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:41:46,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:46:17,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3901.49365 ± 83.198
2025-09-12 15:46:17,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3965.0815, 3901.7908, 3950.0872, 3968.2305, 3911.739, 3743.2698, 3891.2834, 3935.6204, 3997.2693, 3750.565]
2025-09-12 15:46:17,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:46:17,961 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 29 minutes, 21 seconds)
2025-09-12 15:57:24,852 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:57:24,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:01:56,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3835.15039 ± 95.274
2025-09-12 16:01:56,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3694.7886, 3844.3914, 3686.6526, 3780.4026, 3787.863, 3939.204, 3850.6238, 3850.1267, 3928.1648, 3989.2834]
2025-09-12 16:01:56,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 16:01:56,054 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 18 minutes, 2 seconds)
2025-09-12 16:13:46,474 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:13:46,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:17:32,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3302.48950 ± 1088.753
2025-09-12 16:17:32,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [4068.509, 3830.3525, 3694.918, 3969.982, 2862.5493, 3843.1816, 4047.344, 1621.7671, 4181.762, 904.53076]
2025-09-12 16:17:32,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 735.0, 1000.0, 1000.0, 452.0, 1000.0, 257.0]
2025-09-12 16:17:32,875 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 6 minutes, 19 seconds)
2025-09-12 16:28:45,740 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:28:45,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:32:42,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3435.44263 ± 736.232
2025-09-12 16:32:42,921 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3152.744, 3920.5906, 3972.7087, 3808.1733, 4078.2034, 3786.0786, 2316.08, 3609.8342, 3900.0447, 1809.9655]
2025-09-12 16:32:42,921 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [818.0, 1000.0, 1000.0, 1000.0, 1000.0, 974.0, 600.0, 1000.0, 1000.0, 511.0]
2025-09-12 16:32:42,939 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 49 minutes, 10 seconds)
2025-09-12 16:43:57,362 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:43:57,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:47:55,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3490.28271 ± 1122.374
2025-09-12 16:47:55,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3941.884, 3998.0532, 3023.7812, 3933.4546, 3937.6536, 4010.7507, 3824.056, 233.6761, 3914.9375, 4084.579]
2025-09-12 16:47:55,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 784.0, 1000.0, 1000.0, 1000.0, 1000.0, 105.0, 1000.0, 1000.0]
2025-09-12 16:47:55,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 35 minutes, 15 seconds)
2025-09-12 16:59:07,484 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:59:07,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:02:41,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3022.00830 ± 1475.268
2025-09-12 17:02:41,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [12.274486, 3821.6487, 3878.402, 3915.732, 2837.7283, 265.4829, 3794.4175, 3903.5898, 3861.965, 3928.843]
2025-09-12 17:02:41,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [23.0, 1000.0, 1000.0, 1000.0, 734.0, 95.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:02:41,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 17 minutes, 30 seconds)
2025-09-12 17:13:26,796 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:13:26,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:16:50,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 2964.74829 ± 1522.660
2025-09-12 17:16:50,775 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3933.9744, 3944.8538, 756.5492, 3957.944, 3978.1414, 1277.7617, 6.9781675, 3963.4968, 3943.8564, 3883.9282]
2025-09-12 17:16:50,775 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 227.0, 1000.0, 1000.0, 387.0, 18.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:16:50,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 59 minutes, 51 seconds)
2025-09-12 17:28:07,130 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:28:07,140 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:32:37,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3837.33081 ± 124.920
2025-09-12 17:32:37,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3744.9053, 3861.9539, 3992.8674, 3941.6082, 3573.248, 3844.129, 3681.2234, 3919.7708, 3881.983, 3931.6174]
2025-09-12 17:32:37,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [974.0, 1000.0, 1000.0, 1000.0, 950.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:32:37,825 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 45 minutes, 6 seconds)
2025-09-12 17:44:17,885 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:44:17,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:48:24,761 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3575.40698 ± 1084.446
2025-09-12 17:48:24,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [325.5967, 3998.6204, 3812.8152, 3929.1316, 3951.7031, 3947.8918, 3950.9001, 3988.7996, 3887.6863, 3960.925]
2025-09-12 17:48:24,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [114.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:48:24,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 30 minutes, 50 seconds)
2025-09-12 17:59:12,989 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:59:12,998 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:03:18,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3622.25903 ± 1135.957
2025-09-12 18:03:18,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3936.5305, 3891.9858, 3894.3145, 4132.1885, 4084.946, 224.14375, 4103.555, 3901.2283, 4039.2712, 4014.4246]
2025-09-12 18:03:18,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 963.0, 1000.0, 1000.0, 131.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:03:18,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 15 minutes, 23 seconds)
2025-09-12 18:14:19,513 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:14:19,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:18:47,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 4045.24146 ± 51.375
2025-09-12 18:18:47,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [4126.316, 4022.723, 4011.279, 4051.8625, 4101.8115, 4061.821, 3950.7068, 4059.0635, 4084.5703, 3982.2595]
2025-09-12 18:18:47,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:18:47,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1226 [INFO]: New best (4045.24) for latency ExtremeClogL1U23
2025-09-12 18:18:47,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 52 seconds)
2025-09-12 18:30:29,973 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:30:29,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:34:07,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3204.68042 ± 1602.246
2025-09-12 18:34:07,660 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3875.48, 8.53489, 4144.8467, 3957.3572, 4105.5474, 4052.2559, 4019.8489, 3831.286, 4049.3972, 2.2513006]
2025-09-12 18:34:07,660 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 19.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 987.0, 1000.0, 19.0]
2025-09-12 18:34:07,669 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 98/100 (estimated time remaining: 46 minutes, 22 seconds)
2025-09-12 18:45:17,770 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:45:17,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:49:07,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3446.73096 ± 1153.144
2025-09-12 18:49:07,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [4003.3643, 4065.7522, 1116.6329, 4004.915, 4099.804, 3975.429, 4014.6765, 1171.478, 4127.8154, 3887.4414]
2025-09-12 18:49:07,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 307.0, 1000.0, 1000.0, 1000.0, 1000.0, 318.0, 1000.0, 1000.0]
2025-09-12 18:49:07,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 99/100 (estimated time remaining: 30 minutes, 35 seconds)
2025-09-12 18:59:35,961 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:59:35,965 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:03:23,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3364.75513 ± 1303.524
2025-09-12 19:03:23,058 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3996.9854, 1343.8124, 4132.867, 4016.9287, 3862.2026, 3966.0383, 3959.7378, 4016.3984, 4084.1614, 268.41846]
2025-09-12 19:03:23,058 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [1000.0, 353.0, 1000.0, 1000.0, 1000.0, 966.0, 1000.0, 1000.0, 1000.0, 102.0]
2025-09-12 19:03:23,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1199 [INFO]: Iteration 100/100 (estimated time remaining: 14 minutes, 59 seconds)
2025-09-12 19:15:08,337 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:15:08,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:19:24,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1221 [DEBUG]: Total Reward: 3921.01514 ± 380.648
2025-09-12 19:19:24,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1222 [DEBUG]: All rewards: [3199.512, 4005.9187, 4024.6003, 4176.676, 4116.778, 4211.889, 4139.6597, 3139.256, 4114.9575, 4080.9075]
2025-09-12 19:19:24,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1223 [DEBUG]: All trajectory lengths: [793.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 769.0, 1000.0, 1000.0]
2025-09-12 19:19:24,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-walker2d):1251 [DEBUG]: Training session finished
