2025-09-11 19:28:09,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc10-humanoid/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 19:28:09,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc10-humanoid/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 19:28:09,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x149766eb42d0>}
2025-09-11 19:28:09,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1111 [DEBUG]: using device: cuda
2025-09-11 19:28:09,240 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1133 [INFO]: Creating new trainer
2025-09-11 19:28:09,259 baseline-mbpac-noiseperc10-humanoid:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (tanh_refit): NNTanhRefit(
    scale: tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000,
             0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000]]), shift: tensor([[-0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000]])
  )
)
2025-09-11 19:28:09,259 baseline-mbpac-noiseperc10-humanoid:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=393, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 19:28:09,270 baseline-mbpac-noiseperc10-humanoid:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=376, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 19:28:10,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1194 [DEBUG]: Starting training session...
2025-09-11 19:28:10,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 1/100
2025-09-11 19:41:02,098 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:41:02,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:41:21,453 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 354.78958 ± 95.374
2025-09-11 19:41:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [524.37335, 408.27817, 404.12283, 372.12662, 243.69731, 483.11005, 318.8335, 289.36523, 275.37283, 228.61606]
2025-09-11 19:41:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [103.0, 78.0, 78.0, 71.0, 47.0, 104.0, 59.0, 64.0, 53.0, 44.0]
2025-09-11 19:41:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (354.79) for latency ExtremeClogL1U23
2025-09-11 19:41:21,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 2/100 (estimated time remaining: 21 hours, 44 minutes, 50 seconds)
2025-09-11 19:55:46,879 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:55:46,880 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:56:08,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 396.24393 ± 82.083
2025-09-11 19:56:08,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [500.22934, 408.2616, 442.18253, 523.34125, 423.67654, 263.84375, 306.16602, 414.1174, 391.20908, 289.4119]
2025-09-11 19:56:08,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [105.0, 76.0, 84.0, 102.0, 80.0, 56.0, 60.0, 80.0, 86.0, 63.0]
2025-09-11 19:56:08,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (396.24) for latency ExtremeClogL1U23
2025-09-11 19:56:08,887 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 3/100 (estimated time remaining: 22 hours, 50 minutes, 31 seconds)
2025-09-11 20:10:16,235 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:10:16,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:10:34,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 337.82346 ± 111.791
2025-09-11 20:10:34,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [500.27786, 351.13553, 270.94415, 416.59134, 426.20688, 419.1433, 397.9762, 284.77762, 140.55054, 170.631]
2025-09-11 20:10:34,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [94.0, 68.0, 58.0, 79.0, 82.0, 91.0, 76.0, 53.0, 27.0, 33.0]
2025-09-11 20:10:34,636 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 4/100 (estimated time remaining: 22 hours, 50 minutes, 54 seconds)
2025-09-11 20:24:53,643 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:24:53,645 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:25:16,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 430.27026 ± 124.345
2025-09-11 20:25:16,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [472.53055, 305.1091, 478.78177, 488.30905, 293.23776, 357.84256, 489.00845, 409.0632, 719.5003, 289.32]
2025-09-11 20:25:16,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [91.0, 59.0, 89.0, 97.0, 55.0, 66.0, 92.0, 91.0, 148.0, 54.0]
2025-09-11 20:25:16,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (430.27) for latency ExtremeClogL1U23
2025-09-11 20:25:16,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 5/100 (estimated time remaining: 22 hours, 50 minutes, 30 seconds)
2025-09-11 20:39:44,468 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:39:44,469 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:40:12,629 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 503.77475 ± 105.934
2025-09-11 20:40:12,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [697.8077, 503.70828, 343.61646, 530.4659, 637.272, 540.4303, 520.99255, 347.5661, 466.42474, 449.46332]
2025-09-11 20:40:12,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [134.0, 95.0, 69.0, 107.0, 125.0, 107.0, 115.0, 75.0, 97.0, 84.0]
2025-09-11 20:40:12,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (503.77) for latency ExtremeClogL1U23
2025-09-11 20:40:12,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 6/100 (estimated time remaining: 22 hours, 48 minutes, 36 seconds)
2025-09-11 20:54:39,569 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:54:39,591 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:55:08,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 503.02496 ± 117.747
2025-09-11 20:55:08,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [427.94055, 326.8941, 691.75635, 418.5348, 647.62634, 630.3184, 493.58966, 399.86774, 421.67334, 572.04846]
2025-09-11 20:55:08,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [95.0, 75.0, 132.0, 79.0, 136.0, 136.0, 95.0, 89.0, 94.0, 109.0]
2025-09-11 20:55:08,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 7/100 (estimated time remaining: 23 hours, 7 minutes, 8 seconds)
2025-09-11 21:09:30,023 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:09:30,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:09:55,231 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 477.82422 ± 91.844
2025-09-11 21:09:55,231 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [366.7615, 422.87582, 457.69077, 499.03983, 636.7107, 301.7892, 483.2044, 534.72125, 564.8128, 510.63583]
2025-09-11 21:09:55,231 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [77.0, 77.0, 84.0, 93.0, 122.0, 61.0, 89.0, 101.0, 108.0, 95.0]
2025-09-11 21:09:55,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 8/100 (estimated time remaining: 22 hours, 52 minutes, 10 seconds)
2025-09-11 21:24:16,966 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:24:16,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:24:38,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 364.58350 ± 69.995
2025-09-11 21:24:38,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [384.67746, 322.28436, 262.2844, 390.7427, 348.55725, 308.5942, 332.93655, 349.11862, 412.6098, 534.0296]
2025-09-11 21:24:38,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [83.0, 64.0, 53.0, 86.0, 76.0, 67.0, 71.0, 75.0, 88.0, 117.0]
2025-09-11 21:24:38,453 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 9/100 (estimated time remaining: 22 hours, 42 minutes, 46 seconds)
2025-09-11 21:39:05,691 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:39:05,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:39:28,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 448.14062 ± 85.158
2025-09-11 21:39:28,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [362.6658, 537.32574, 349.04004, 439.9265, 459.23184, 356.1256, 625.1481, 381.89844, 471.96457, 498.07947]
2025-09-11 21:39:28,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [66.0, 100.0, 66.0, 81.0, 85.0, 66.0, 121.0, 69.0, 87.0, 93.0]
2025-09-11 21:39:28,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 10/100 (estimated time remaining: 22 hours, 30 minutes, 26 seconds)
2025-09-11 21:53:49,077 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:53:49,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:54:12,353 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 435.60287 ± 98.901
2025-09-11 21:54:12,353 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [311.72592, 330.69128, 404.85156, 587.24335, 494.26443, 467.4087, 272.34064, 452.8871, 554.4865, 480.12924]
2025-09-11 21:54:12,353 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [69.0, 64.0, 76.0, 118.0, 92.0, 88.0, 55.0, 84.0, 101.0, 89.0]
2025-09-11 21:54:12,356 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 11/100 (estimated time remaining: 22 hours, 11 minutes, 54 seconds)
2025-09-11 22:08:37,170 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:08:37,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:08:58,389 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 398.62152 ± 106.276
2025-09-11 22:08:58,389 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [161.21854, 515.42065, 402.65668, 309.23975, 417.35022, 438.3028, 319.9433, 490.38416, 397.43036, 534.2689]
2025-09-11 22:08:58,389 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [31.0, 95.0, 75.0, 67.0, 79.0, 81.0, 63.0, 91.0, 73.0, 115.0]
2025-09-11 22:08:58,418 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 12/100 (estimated time remaining: 21 hours, 54 minutes, 11 seconds)
2025-09-11 22:23:23,227 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:23:23,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:23:46,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 429.13583 ± 39.342
2025-09-11 22:23:46,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [446.8825, 509.77667, 382.59546, 429.8227, 435.6657, 468.03796, 444.8318, 406.24515, 378.85944, 388.6411]
2025-09-11 22:23:46,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [96.0, 96.0, 73.0, 83.0, 79.0, 87.0, 89.0, 75.0, 72.0, 74.0]
2025-09-11 22:23:46,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 13/100 (estimated time remaining: 21 hours, 39 minutes, 46 seconds)
2025-09-11 22:38:13,762 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:38:13,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:38:37,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 443.97089 ± 130.808
2025-09-11 22:38:37,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [557.3938, 380.36484, 494.97018, 366.77747, 401.19092, 521.6032, 493.039, 113.454155, 580.6728, 530.2427]
2025-09-11 22:38:37,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [118.0, 80.0, 92.0, 72.0, 76.0, 98.0, 92.0, 22.0, 108.0, 106.0]
2025-09-11 22:38:37,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 14/100 (estimated time remaining: 21 hours, 27 minutes, 20 seconds)
2025-09-11 22:52:59,671 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:52:59,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:53:22,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 435.32031 ± 63.052
2025-09-11 22:53:22,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [497.88123, 372.5312, 579.5864, 392.267, 453.23776, 461.49728, 376.38507, 370.5845, 411.70145, 437.53104]
2025-09-11 22:53:22,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [91.0, 68.0, 108.0, 72.0, 84.0, 86.0, 69.0, 69.0, 76.0, 81.0]
2025-09-11 22:53:22,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 15/100 (estimated time remaining: 21 hours, 10 minutes, 51 seconds)
2025-09-11 23:07:48,560 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:07:48,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:08:14,141 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 496.60480 ± 99.558
2025-09-11 23:08:14,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [373.29034, 723.21576, 436.29025, 543.5605, 426.6054, 442.68268, 541.69366, 591.06537, 478.33133, 409.3123]
2025-09-11 23:08:14,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [68.0, 135.0, 82.0, 101.0, 84.0, 82.0, 104.0, 109.0, 87.0, 75.0]
2025-09-11 23:08:14,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 16/100 (estimated time remaining: 20 hours, 58 minutes, 30 seconds)
2025-09-11 23:22:33,249 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:22:33,251 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:23:00,320 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 517.22180 ± 72.716
2025-09-11 23:23:00,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [639.7331, 605.38556, 432.89456, 524.2025, 541.3347, 454.07007, 423.1592, 533.7837, 576.2874, 441.36725]
2025-09-11 23:23:00,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [121.0, 114.0, 80.0, 101.0, 103.0, 84.0, 78.0, 113.0, 108.0, 82.0]
2025-09-11 23:23:00,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (517.22) for latency ExtremeClogL1U23
2025-09-11 23:23:00,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 17/100 (estimated time remaining: 20 hours, 43 minutes, 44 seconds)
2025-09-11 23:37:22,850 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:37:22,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:37:51,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 516.49231 ± 76.035
2025-09-11 23:37:51,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [522.91986, 438.60944, 526.5601, 490.50705, 414.93445, 522.5311, 634.4417, 467.90076, 667.76074, 478.75827]
2025-09-11 23:37:51,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [110.0, 80.0, 111.0, 91.0, 76.0, 110.0, 134.0, 85.0, 143.0, 90.0]
2025-09-11 23:37:51,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 18/100 (estimated time remaining: 20 hours, 29 minutes, 51 seconds)
2025-09-11 23:52:18,542 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:52:18,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:52:43,332 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 464.32513 ± 109.035
2025-09-11 23:52:43,332 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [429.02066, 776.6532, 390.45905, 460.51212, 495.7212, 423.96887, 442.76553, 373.67578, 436.83008, 413.64496]
2025-09-11 23:52:43,332 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [88.0, 146.0, 79.0, 87.0, 91.0, 78.0, 95.0, 68.0, 80.0, 76.0]
2025-09-11 23:52:43,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 19/100 (estimated time remaining: 20 hours, 15 minutes, 10 seconds)
2025-09-12 00:07:08,059 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:07:08,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:07:32,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 457.50494 ± 97.968
2025-09-12 00:07:32,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [391.99084, 589.94495, 487.45413, 470.0465, 296.02744, 608.4668, 472.9338, 347.13153, 532.32965, 378.72348]
2025-09-12 00:07:32,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [84.0, 112.0, 90.0, 88.0, 54.0, 128.0, 89.0, 68.0, 101.0, 72.0]
2025-09-12 00:07:32,531 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 20/100 (estimated time remaining: 20 hours, 1 minute, 35 seconds)
2025-09-12 00:22:02,888 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:22:02,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:22:29,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 493.80438 ± 69.916
2025-09-12 00:22:29,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [460.1682, 407.0167, 439.11435, 449.76227, 513.9212, 421.07816, 625.9558, 479.12027, 573.2298, 568.67725]
2025-09-12 00:22:29,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [86.0, 75.0, 80.0, 83.0, 106.0, 77.0, 133.0, 102.0, 105.0, 105.0]
2025-09-12 00:22:29,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 21/100 (estimated time remaining: 19 hours, 47 minutes, 58 seconds)
2025-09-12 00:36:54,837 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:36:54,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:37:16,065 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 406.05478 ± 151.460
2025-09-12 00:37:16,065 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [449.38495, 583.18414, 447.49118, 513.5934, 401.05627, 451.7391, 553.5487, 140.48717, 108.65386, 411.4093]
2025-09-12 00:37:16,065 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [85.0, 113.0, 83.0, 99.0, 74.0, 83.0, 104.0, 27.0, 21.0, 77.0]
2025-09-12 00:37:16,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 22/100 (estimated time remaining: 19 hours, 33 minutes, 21 seconds)
2025-09-12 00:51:36,864 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:51:36,866 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:52:08,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 564.00488 ± 151.308
2025-09-12 00:52:08,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [635.543, 575.0957, 427.7893, 435.84125, 609.05273, 539.4552, 931.6332, 340.41232, 592.5415, 552.68445]
2025-09-12 00:52:08,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [123.0, 109.0, 85.0, 83.0, 127.0, 117.0, 197.0, 62.0, 110.0, 102.0]
2025-09-12 00:52:08,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (564.00) for latency ExtremeClogL1U23
2025-09-12 00:52:08,030 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 23/100 (estimated time remaining: 19 hours, 18 minutes, 40 seconds)
2025-09-12 01:06:50,761 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:06:50,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:07:17,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 497.04681 ± 93.244
2025-09-12 01:07:17,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [613.3054, 451.3749, 417.73016, 400.97345, 709.16223, 554.7985, 472.20898, 462.0753, 451.9679, 436.87173]
2025-09-12 01:07:17,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [113.0, 86.0, 77.0, 77.0, 134.0, 121.0, 98.0, 84.0, 89.0, 80.0]
2025-09-12 01:07:17,475 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 24/100 (estimated time remaining: 19 hours, 8 minutes, 21 seconds)
2025-09-12 01:21:31,459 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:21:31,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:21:59,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 510.09155 ± 155.359
2025-09-12 01:21:59,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [615.52856, 351.86496, 472.76483, 134.20284, 671.9481, 557.11926, 628.2852, 528.6496, 649.51764, 491.0348]
2025-09-12 01:21:59,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [120.0, 77.0, 87.0, 26.0, 127.0, 103.0, 125.0, 100.0, 127.0, 105.0]
2025-09-12 01:21:59,124 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 25/100 (estimated time remaining: 18 hours, 51 minutes, 32 seconds)
2025-09-12 01:36:26,120 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:36:26,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:36:58,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 626.70410 ± 167.907
2025-09-12 01:36:58,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [537.3166, 540.7371, 1084.6527, 728.104, 562.4002, 482.45798, 648.2722, 513.9885, 535.1447, 633.96704]
2025-09-12 01:36:58,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [99.0, 103.0, 211.0, 137.0, 103.0, 92.0, 120.0, 96.0, 100.0, 119.0]
2025-09-12 01:36:58,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (626.70) for latency ExtremeClogL1U23
2025-09-12 01:36:58,758 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 26/100 (estimated time remaining: 18 hours, 37 minutes, 25 seconds)
2025-09-12 01:51:29,452 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:51:29,459 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:51:53,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 456.06161 ± 320.851
2025-09-12 01:51:53,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [656.1443, 130.37918, 119.497765, 454.75134, 370.26923, 113.64743, 129.06645, 945.05096, 949.76794, 692.0414]
2025-09-12 01:51:53,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [122.0, 25.0, 23.0, 83.0, 72.0, 22.0, 25.0, 187.0, 187.0, 139.0]
2025-09-12 01:51:53,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 27/100 (estimated time remaining: 18 hours, 24 minutes, 25 seconds)
2025-09-12 02:06:19,531 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:06:19,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:06:46,386 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 510.82489 ± 193.669
2025-09-12 02:06:46,386 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [709.0494, 761.4117, 455.2004, 550.5978, 316.818, 406.9058, 783.66583, 146.6074, 412.09985, 565.8926]
2025-09-12 02:06:46,386 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [148.0, 145.0, 83.0, 104.0, 58.0, 74.0, 145.0, 28.0, 76.0, 106.0]
2025-09-12 02:06:46,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 28/100 (estimated time remaining: 18 hours, 9 minutes, 44 seconds)
2025-09-12 02:21:15,890 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:21:15,892 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:21:49,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 598.05231 ± 227.668
2025-09-12 02:21:49,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [612.24396, 928.2857, 727.5065, 517.5591, 142.52548, 472.20163, 624.8738, 493.26242, 490.80127, 971.2632]
2025-09-12 02:21:49,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [122.0, 194.0, 143.0, 111.0, 28.0, 99.0, 121.0, 104.0, 90.0, 201.0]
2025-09-12 02:21:49,434 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 29/100 (estimated time remaining: 17 hours, 53 minutes, 16 seconds)
2025-09-12 02:36:18,732 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:36:18,747 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:36:46,084 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 527.84570 ± 131.257
2025-09-12 02:36:46,085 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [507.73898, 582.57806, 561.9549, 197.18652, 601.34283, 500.51923, 486.8863, 619.9407, 729.58514, 490.7247]
2025-09-12 02:36:46,085 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [93.0, 107.0, 102.0, 38.0, 112.0, 93.0, 90.0, 115.0, 145.0, 94.0]
2025-09-12 02:36:46,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 30/100 (estimated time remaining: 17 hours, 41 minutes, 54 seconds)
2025-09-12 02:51:05,903 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:51:05,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:51:37,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 622.74597 ± 112.644
2025-09-12 02:51:37,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [674.40686, 803.72986, 552.8359, 567.2725, 544.2198, 667.38885, 673.2435, 644.623, 374.24463, 725.49445]
2025-09-12 02:51:37,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [125.0, 153.0, 100.0, 106.0, 101.0, 125.0, 125.0, 121.0, 69.0, 135.0]
2025-09-12 02:51:37,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 31/100 (estimated time remaining: 17 hours, 24 minutes, 59 seconds)
2025-09-12 03:06:06,844 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:06:06,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:06:41,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 637.55682 ± 152.591
2025-09-12 03:06:41,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [604.54755, 517.4557, 579.0741, 903.3374, 379.82205, 588.84454, 797.4556, 529.7752, 830.5109, 644.74554]
2025-09-12 03:06:41,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [130.0, 96.0, 107.0, 176.0, 83.0, 114.0, 163.0, 100.0, 155.0, 119.0]
2025-09-12 03:06:41,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (637.56) for latency ExtremeClogL1U23
2025-09-12 03:06:41,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 32/100 (estimated time remaining: 17 hours, 12 minutes, 10 seconds)
2025-09-12 03:21:14,553 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:21:14,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:21:52,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 733.51373 ± 151.865
2025-09-12 03:21:52,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [435.823, 942.95374, 733.8597, 711.04407, 861.45905, 706.4122, 675.14075, 587.29315, 712.46985, 968.68176]
2025-09-12 03:21:52,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [86.0, 178.0, 136.0, 138.0, 164.0, 133.0, 125.0, 110.0, 133.0, 186.0]
2025-09-12 03:21:52,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (733.51) for latency ExtremeClogL1U23
2025-09-12 03:21:52,962 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 33/100 (estimated time remaining: 17 hours, 1 minute, 29 seconds)
2025-09-12 03:36:18,322 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:36:18,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:36:51,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 643.99500 ± 162.800
2025-09-12 03:36:51,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [511.36035, 704.5276, 877.6221, 815.497, 676.0, 440.63184, 866.1478, 625.72, 439.88544, 482.55814]
2025-09-12 03:36:51,818 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [96.0, 134.0, 163.0, 159.0, 122.0, 81.0, 174.0, 118.0, 79.0, 92.0]
2025-09-12 03:36:51,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 34/100 (estimated time remaining: 16 hours, 45 minutes, 32 seconds)
2025-09-12 03:51:28,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:51:28,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:52:07,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 737.97260 ± 210.893
2025-09-12 03:52:07,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [675.7072, 564.74744, 482.24496, 430.14954, 721.41815, 663.30054, 1106.3965, 832.5202, 1018.3504, 884.8912]
2025-09-12 03:52:07,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [129.0, 110.0, 105.0, 84.0, 136.0, 124.0, 218.0, 163.0, 199.0, 169.0]
2025-09-12 03:52:07,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (737.97) for latency ExtremeClogL1U23
2025-09-12 03:52:08,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 35/100 (estimated time remaining: 16 hours, 34 minutes, 49 seconds)
2025-09-12 04:06:25,012 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:06:25,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:06:59,141 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 655.49573 ± 221.908
2025-09-12 04:06:59,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [949.61865, 595.84827, 576.24316, 773.2225, 492.16443, 859.28845, 125.307144, 776.95715, 620.633, 785.67487]
2025-09-12 04:06:59,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [180.0, 111.0, 104.0, 147.0, 105.0, 165.0, 24.0, 151.0, 121.0, 145.0]
2025-09-12 04:06:59,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 36/100 (estimated time remaining: 16 hours, 19 minutes, 44 seconds)
2025-09-12 04:21:34,298 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:21:34,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:22:23,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 907.81738 ± 330.877
2025-09-12 04:22:23,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1734.3677, 809.6163, 823.3894, 600.59534, 702.4104, 692.42975, 795.0383, 1323.2393, 835.23364, 761.85394]
2025-09-12 04:22:23,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [342.0, 168.0, 167.0, 125.0, 131.0, 134.0, 167.0, 254.0, 165.0, 144.0]
2025-09-12 04:22:23,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (907.82) for latency ExtremeClogL1U23
2025-09-12 04:22:23,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 37/100 (estimated time remaining: 16 hours, 9 minutes, 5 seconds)
2025-09-12 04:36:44,535 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:36:44,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:37:30,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 824.44641 ± 234.032
2025-09-12 04:37:30,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [808.79004, 442.24265, 641.2323, 789.5081, 635.69073, 1079.2427, 767.62524, 1208.4338, 1148.6282, 723.06995]
2025-09-12 04:37:30,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [169.0, 92.0, 121.0, 162.0, 122.0, 204.0, 152.0, 251.0, 237.0, 145.0]
2025-09-12 04:37:30,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 38/100 (estimated time remaining: 15 hours, 52 minutes, 51 seconds)
2025-09-12 04:52:10,750 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:52:10,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:52:49,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 704.92352 ± 357.172
2025-09-12 04:52:49,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [558.64404, 611.24994, 587.97003, 397.35666, 874.4469, 1398.3608, 697.8082, 1238.6405, 142.50911, 542.2488]
2025-09-12 04:52:49,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [119.0, 130.0, 108.0, 84.0, 163.0, 269.0, 134.0, 253.0, 27.0, 118.0]
2025-09-12 04:52:49,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 39/100 (estimated time remaining: 15 hours, 41 minutes, 56 seconds)
2025-09-12 05:07:09,832 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:07:09,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:07:54,476 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 806.73938 ± 237.126
2025-09-12 05:07:54,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [695.58167, 927.77466, 716.03796, 505.14246, 966.1163, 1036.6718, 763.09204, 1287.1946, 700.3187, 469.46393]
2025-09-12 05:07:54,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [140.0, 177.0, 143.0, 111.0, 200.0, 198.0, 155.0, 249.0, 132.0, 100.0]
2025-09-12 05:07:54,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 40/100 (estimated time remaining: 15 hours, 24 minutes, 27 seconds)
2025-09-12 05:22:30,435 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:22:30,438 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:23:24,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1012.13879 ± 326.814
2025-09-12 05:23:24,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1465.2836, 865.71545, 701.22894, 1584.4113, 429.1876, 833.0559, 1119.0255, 1152.4368, 905.8077, 1065.234]
2025-09-12 05:23:24,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [277.0, 162.0, 132.0, 306.0, 88.0, 157.0, 215.0, 226.0, 174.0, 203.0]
2025-09-12 05:23:24,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1012.14) for latency ExtremeClogL1U23
2025-09-12 05:23:24,574 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 41/100 (estimated time remaining: 15 hours, 17 minutes, 4 seconds)
2025-09-12 05:37:50,888 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:37:50,891 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:38:38,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 903.47687 ± 327.902
2025-09-12 05:38:38,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1129.1058, 130.10143, 1176.0153, 902.19336, 1151.8019, 985.4742, 709.1175, 713.06006, 790.8155, 1347.0836]
2025-09-12 05:38:38,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [218.0, 25.0, 220.0, 167.0, 220.0, 188.0, 134.0, 134.0, 150.0, 259.0]
2025-09-12 05:38:38,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 42/100 (estimated time remaining: 14 hours, 59 minutes, 44 seconds)
2025-09-12 05:53:08,733 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:53:08,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:54:00,897 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 998.92072 ± 310.650
2025-09-12 05:54:00,897 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [687.4488, 1151.2596, 1138.2587, 464.87576, 1629.6311, 989.63654, 1259.3007, 731.2617, 991.74854, 945.78485]
2025-09-12 05:54:00,898 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [132.0, 213.0, 217.0, 90.0, 304.0, 182.0, 235.0, 140.0, 187.0, 178.0]
2025-09-12 05:54:00,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 43/100 (estimated time remaining: 14 hours, 47 minutes, 30 seconds)
2025-09-12 06:08:39,178 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:08:39,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:09:17,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 742.54114 ± 285.767
2025-09-12 06:09:17,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1057.1261, 796.3776, 781.4632, 1071.6156, 631.6179, 877.1538, 461.1898, 156.46198, 1066.5082, 525.8976]
2025-09-12 06:09:17,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [194.0, 150.0, 142.0, 217.0, 129.0, 165.0, 87.0, 30.0, 193.0, 94.0]
2025-09-12 06:09:17,876 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 44/100 (estimated time remaining: 14 hours, 31 minutes, 46 seconds)
2025-09-12 06:23:52,453 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:23:52,455 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:24:54,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1162.48901 ± 513.031
2025-09-12 06:24:54,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [805.25494, 1655.1117, 830.6515, 880.0956, 1245.6954, 1421.428, 2075.915, 135.31174, 1091.395, 1484.0308]
2025-09-12 06:24:54,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [156.0, 321.0, 154.0, 171.0, 229.0, 269.0, 406.0, 26.0, 205.0, 289.0]
2025-09-12 06:24:54,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1162.49) for latency ExtremeClogL1U23
2025-09-12 06:24:54,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 45/100 (estimated time remaining: 14 hours, 22 minutes, 23 seconds)
2025-09-12 06:39:43,447 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:39:43,448 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:40:34,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 945.11066 ± 268.901
2025-09-12 06:40:34,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [704.8824, 864.2519, 948.9117, 878.6554, 1519.3035, 828.22595, 1226.0466, 704.918, 1177.686, 598.2253]
2025-09-12 06:40:34,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [127.0, 166.0, 175.0, 164.0, 312.0, 173.0, 231.0, 147.0, 220.0, 114.0]
2025-09-12 06:40:34,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 46/100 (estimated time remaining: 14 hours, 8 minutes, 46 seconds)
2025-09-12 06:54:45,189 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:45,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:55:35,328 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 955.47400 ± 262.562
2025-09-12 06:55:35,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1158.357, 879.7553, 872.4387, 734.9558, 1020.4362, 1478.6079, 1102.6177, 852.3713, 1021.6616, 433.53885]
2025-09-12 06:55:35,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [215.0, 177.0, 160.0, 141.0, 189.0, 279.0, 217.0, 163.0, 201.0, 80.0]
2025-09-12 06:55:35,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 47/100 (estimated time remaining: 13 hours, 50 minutes, 59 seconds)
2025-09-12 07:10:16,150 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:10:16,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:11:14,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1108.14136 ± 383.433
2025-09-12 07:11:14,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1423.8872, 790.7073, 1627.7336, 425.5243, 1662.3081, 984.74835, 800.36523, 945.7367, 1396.6804, 1023.72144]
2025-09-12 07:11:14,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [271.0, 149.0, 307.0, 77.0, 318.0, 188.0, 147.0, 178.0, 276.0, 199.0]
2025-09-12 07:11:14,687 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 48/100 (estimated time remaining: 13 hours, 38 minutes, 38 seconds)
2025-09-12 07:25:39,366 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:25:39,368 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:26:36,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1045.49561 ± 570.667
2025-09-12 07:26:36,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1238.643, 552.8131, 757.0495, 1297.3037, 327.73764, 1226.6832, 2423.3862, 691.709, 1321.0613, 618.569]
2025-09-12 07:26:36,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [233.0, 110.0, 143.0, 251.0, 63.0, 243.0, 474.0, 134.0, 283.0, 131.0]
2025-09-12 07:26:36,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 49/100 (estimated time remaining: 13 hours, 24 minutes)
2025-09-12 07:41:08,595 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:41:08,601 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:41:56,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 899.61896 ± 238.893
2025-09-12 07:41:56,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [566.36255, 1005.4712, 1032.8652, 1259.2678, 644.7439, 1202.4805, 732.8901, 668.43805, 768.86224, 1114.8076]
2025-09-12 07:41:56,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [121.0, 187.0, 189.0, 241.0, 118.0, 224.0, 150.0, 126.0, 146.0, 211.0]
2025-09-12 07:41:56,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 50/100 (estimated time remaining: 13 hours, 5 minutes, 42 seconds)
2025-09-12 07:56:24,465 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:56:24,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:57:34,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1267.08423 ± 736.276
2025-09-12 07:57:34,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2995.9424, 2082.9092, 1262.5361, 645.02515, 1719.4751, 791.2691, 796.9602, 790.1178, 953.0126, 633.5937]
2025-09-12 07:57:34,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [604.0, 411.0, 252.0, 134.0, 345.0, 169.0, 152.0, 160.0, 201.0, 117.0]
2025-09-12 07:57:34,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1267.08) for latency ExtremeClogL1U23
2025-09-12 07:57:34,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 51/100 (estimated time remaining: 12 hours, 50 minutes, 3 seconds)
2025-09-12 08:12:01,149 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:12:01,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:12:57,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1066.37732 ± 366.451
2025-09-12 08:12:57,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1175.4224, 1107.3065, 728.21545, 1056.587, 1309.1434, 727.69617, 969.6643, 1914.0537, 514.6986, 1160.986]
2025-09-12 08:12:57,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [230.0, 210.0, 152.0, 199.0, 241.0, 135.0, 179.0, 365.0, 106.0, 232.0]
2025-09-12 08:12:57,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 52/100 (estimated time remaining: 12 hours, 38 minutes, 11 seconds)
2025-09-12 08:27:20,251 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:27:20,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:28:42,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1574.50720 ± 605.693
2025-09-12 08:28:42,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1781.107, 2474.9229, 1592.7188, 1791.856, 1986.9767, 1339.1975, 2079.273, 169.67072, 1511.4475, 1017.9025]
2025-09-12 08:28:42,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [352.0, 459.0, 307.0, 340.0, 386.0, 255.0, 407.0, 33.0, 289.0, 192.0]
2025-09-12 08:28:42,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1574.51) for latency ExtremeClogL1U23
2025-09-12 08:28:42,866 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 53/100 (estimated time remaining: 12 hours, 23 minutes, 42 seconds)
2025-09-12 08:43:15,684 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:43:15,693 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:45:12,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2158.41479 ± 1162.210
2025-09-12 08:45:12,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3851.9307, 626.29584, 1317.3666, 938.5588, 914.6965, 3227.8306, 1647.0271, 3172.4746, 2279.9114, 3608.055]
2025-09-12 08:45:12,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [780.0, 115.0, 244.0, 194.0, 178.0, 653.0, 332.0, 619.0, 441.0, 704.0]
2025-09-12 08:45:12,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2158.41) for latency ExtremeClogL1U23
2025-09-12 08:45:12,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 54/100 (estimated time remaining: 12 hours, 18 minutes, 49 seconds)
2025-09-12 08:59:24,727 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:59:24,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:00:35,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1384.30701 ± 636.000
2025-09-12 09:00:35,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1060.3762, 776.6687, 2282.987, 2020.3912, 1106.188, 635.4175, 771.68665, 1764.7423, 2414.159, 1010.45447]
2025-09-12 09:00:35,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [205.0, 156.0, 421.0, 383.0, 193.0, 118.0, 142.0, 327.0, 463.0, 191.0]
2025-09-12 09:00:35,021 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 55/100 (estimated time remaining: 12 hours, 3 minutes, 32 seconds)
2025-09-12 09:14:46,026 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:14:46,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:17:06,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2684.93359 ± 1528.799
2025-09-12 09:17:06,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2084.1687, 2104.9668, 2215.2883, 5112.0146, 3552.2507, 2608.3696, 956.47235, 2950.0913, 108.744934, 5156.9717]
2025-09-12 09:17:06,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [405.0, 403.0, 417.0, 1000.0, 694.0, 503.0, 193.0, 575.0, 21.0, 1000.0]
2025-09-12 09:17:06,731 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2684.93) for latency ExtremeClogL1U23
2025-09-12 09:17:06,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 56/100 (estimated time remaining: 11 hours, 55 minutes, 48 seconds)
2025-09-12 09:31:20,997 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:31:20,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:32:46,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1674.92151 ± 765.823
2025-09-12 09:32:46,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2377.7256, 435.481, 701.68286, 1735.051, 1834.3433, 1291.4407, 2780.923, 958.60754, 2042.5714, 2591.3882]
2025-09-12 09:32:46,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [451.0, 86.0, 133.0, 330.0, 346.0, 248.0, 521.0, 181.0, 383.0, 484.0]
2025-09-12 09:32:46,732 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 42 minutes, 26 seconds)
2025-09-12 09:47:22,702 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:47:22,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:48:32,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1343.06677 ± 955.892
2025-09-12 09:48:32,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2427.1045, 494.93243, 1786.5018, 750.9506, 3098.44, 569.7034, 140.40619, 495.50412, 2263.8162, 1403.309]
2025-09-12 09:48:32,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [465.0, 90.0, 343.0, 147.0, 603.0, 110.0, 27.0, 91.0, 432.0, 269.0]
2025-09-12 09:48:32,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 58/100 (estimated time remaining: 11 hours, 26 minutes, 26 seconds)
2025-09-12 10:02:39,366 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:02:39,368 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:03:58,459 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1444.01685 ± 1280.553
2025-09-12 10:03:58,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1209.0292, 5055.606, 631.5321, 952.7543, 576.56635, 1806.2324, 368.12143, 1163.8245, 980.6316, 1695.8715]
2025-09-12 10:03:58,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [255.0, 1000.0, 132.0, 196.0, 122.0, 365.0, 69.0, 239.0, 202.0, 338.0]
2025-09-12 10:03:58,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 59/100 (estimated time remaining: 11 hours, 1 minute, 39 seconds)
2025-09-12 10:18:03,409 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:18:03,411 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:19:15,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1405.21533 ± 1187.679
2025-09-12 10:19:15,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [644.80786, 1518.8102, 1239.2683, 2078.9795, 1012.5179, 109.12434, 571.7292, 4572.05, 735.4233, 1569.4431]
2025-09-12 10:19:15,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [125.0, 287.0, 256.0, 390.0, 191.0, 21.0, 110.0, 877.0, 140.0, 298.0]
2025-09-12 10:19:15,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 60/100 (estimated time remaining: 10 hours, 45 minutes, 10 seconds)
2025-09-12 10:33:36,005 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:33:36,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:35:18,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1930.22949 ± 1398.679
2025-09-12 10:35:18,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1409.6062, 2958.3838, 2080.0195, 474.42926, 403.89407, 607.4872, 945.8896, 2010.8337, 4789.4766, 3622.2756]
2025-09-12 10:35:18,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [271.0, 568.0, 405.0, 100.0, 82.0, 111.0, 178.0, 386.0, 930.0, 691.0]
2025-09-12 10:35:18,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 25 minutes, 30 seconds)
2025-09-12 10:49:37,654 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:49:37,657 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:51:21,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2063.30078 ± 647.708
2025-09-12 10:51:21,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2054.067, 2754.8933, 1401.2936, 2095.7363, 3183.2893, 1050.0342, 2538.9717, 1205.1494, 2241.6152, 2107.9573]
2025-09-12 10:51:21,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [376.0, 523.0, 252.0, 376.0, 593.0, 189.0, 472.0, 233.0, 426.0, 385.0]
2025-09-12 10:51:21,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 62/100 (estimated time remaining: 10 hours, 12 minutes, 57 seconds)
2025-09-12 11:05:45,067 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:05:45,071 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:07:49,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2368.63159 ± 1058.468
2025-09-12 11:07:49,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3008.461, 1896.0101, 2638.899, 1889.5964, 1260.7893, 4820.983, 2383.3745, 3002.628, 1992.8813, 792.69403]
2025-09-12 11:07:49,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [578.0, 354.0, 509.0, 362.0, 246.0, 930.0, 457.0, 573.0, 388.0, 166.0]
2025-09-12 11:07:49,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 63/100 (estimated time remaining: 10 hours, 2 minutes, 36 seconds)
2025-09-12 11:22:16,667 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:22:16,671 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:23:29,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1381.92639 ± 833.654
2025-09-12 11:23:29,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [822.228, 2006.3385, 1434.0294, 1928.1595, 108.34628, 1638.2039, 580.2934, 1592.4653, 3104.4182, 604.781]
2025-09-12 11:23:29,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [154.0, 395.0, 292.0, 386.0, 21.0, 317.0, 122.0, 309.0, 605.0, 129.0]
2025-09-12 11:23:29,860 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 48 minutes, 28 seconds)
2025-09-12 11:38:20,546 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:38:20,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:40:33,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2521.89404 ± 1657.176
2025-09-12 11:40:33,265 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3847.237, 2251.7761, 711.6164, 5178.132, 3606.7358, 1078.6088, 1441.9191, 4773.0547, 134.54596, 2195.3137]
2025-09-12 11:40:33,265 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [737.0, 448.0, 145.0, 1000.0, 690.0, 210.0, 281.0, 911.0, 26.0, 417.0]
2025-09-12 11:40:33,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 45 minutes, 18 seconds)
2025-09-12 11:54:01,763 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:54:01,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:55:17,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1354.87292 ± 600.675
2025-09-12 11:55:17,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1912.4668, 794.9387, 941.98956, 1116.4691, 1040.3326, 477.37012, 1837.9823, 2523.3203, 1084.566, 1819.2942]
2025-09-12 11:55:17,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [405.0, 165.0, 196.0, 239.0, 219.0, 104.0, 363.0, 516.0, 228.0, 376.0]
2025-09-12 11:55:18,005 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 66/100 (estimated time remaining: 9 hours, 19 minutes, 59 seconds)
2025-09-12 12:10:27,931 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:10:27,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:12:19,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2187.53149 ± 1357.165
2025-09-12 12:12:19,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1031.1818, 777.4528, 3813.5054, 2162.0503, 2568.9856, 793.4901, 2154.3665, 5306.9214, 1547.0349, 1720.3256]
2025-09-12 12:12:19,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [212.0, 144.0, 743.0, 402.0, 487.0, 143.0, 413.0, 993.0, 300.0, 329.0]
2025-09-12 12:12:19,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 67/100 (estimated time remaining: 9 hours, 10 minutes, 35 seconds)
2025-09-12 12:25:44,863 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:25:44,865 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:28:10,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2738.66138 ± 1477.444
2025-09-12 12:28:10,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1825.1527, 4345.2617, 1142.217, 2355.3086, 1186.0837, 3669.2998, 2786.9885, 4364.1626, 646.04553, 5066.091]
2025-09-12 12:28:10,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [350.0, 845.0, 229.0, 464.0, 213.0, 714.0, 560.0, 855.0, 124.0, 1000.0]
2025-09-12 12:28:10,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2738.66) for latency ExtremeClogL1U23
2025-09-12 12:28:10,369 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 50 minutes, 17 seconds)
2025-09-12 12:42:44,815 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:42:44,818 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:44:14,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1704.96948 ± 1555.278
2025-09-12 12:44:14,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1119.6077, 593.69275, 505.7638, 114.44878, 773.003, 2748.776, 3532.6697, 1772.9241, 678.58575, 5210.2227]
2025-09-12 12:44:14,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [221.0, 117.0, 97.0, 22.0, 145.0, 522.0, 679.0, 342.0, 129.0, 1000.0]
2025-09-12 12:44:14,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 36 minutes, 44 seconds)
2025-09-12 12:58:49,077 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:58:49,082 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:00:34,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2059.79785 ± 1320.626
2025-09-12 13:00:34,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1600.7496, 1212.1549, 2495.913, 5254.388, 607.7124, 2725.5972, 1100.6688, 3165.142, 1264.2535, 1171.3977]
2025-09-12 13:00:34,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [296.0, 224.0, 473.0, 1000.0, 126.0, 523.0, 210.0, 600.0, 240.0, 213.0]
2025-09-12 13:00:34,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 70/100 (estimated time remaining: 8 hours, 16 minutes, 4 seconds)
2025-09-12 13:14:48,145 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:14:48,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:17:24,274 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2956.47266 ± 1870.923
2025-09-12 13:17:24,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [857.66907, 5185.704, 1566.518, 5145.5645, 5187.7227, 1038.4427, 5138.9766, 1526.723, 2809.894, 1107.5117]
2025-09-12 13:17:24,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [180.0, 1000.0, 297.0, 1000.0, 1000.0, 215.0, 1000.0, 305.0, 552.0, 219.0]
2025-09-12 13:17:24,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2956.47) for latency ExtremeClogL1U23
2025-09-12 13:17:24,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 71/100 (estimated time remaining: 8 hours, 12 minutes, 37 seconds)
2025-09-12 13:31:22,943 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:31:22,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:32:29,760 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1247.97095 ± 1182.167
2025-09-12 13:32:29,761 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [526.3293, 1102.4247, 1103.6555, 859.73016, 1330.6392, 4711.403, 896.3012, 511.22458, 848.2421, 589.75977]
2025-09-12 13:32:29,761 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [107.0, 224.0, 212.0, 162.0, 266.0, 902.0, 183.0, 107.0, 175.0, 119.0]
2025-09-12 13:32:29,788 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 72/100 (estimated time remaining: 7 hours, 44 minutes, 57 seconds)
2025-09-12 13:48:01,238 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:48:01,243 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:49:57,273 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2205.78345 ± 1801.428
2025-09-12 13:49:57,274 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [686.12494, 1591.4639, 5029.6167, 677.71454, 411.4884, 633.5927, 3356.7612, 5162.649, 895.56006, 3612.8633]
2025-09-12 13:49:57,274 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [139.0, 310.0, 970.0, 137.0, 74.0, 134.0, 652.0, 1000.0, 183.0, 704.0]
2025-09-12 13:49:57,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 37 minutes, 58 seconds)
2025-09-12 14:03:00,541 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:03:00,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:05:35,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2858.72290 ± 1649.385
2025-09-12 14:05:35,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [910.39105, 2194.8489, 1004.5842, 3059.087, 5108.6816, 4251.962, 5102.685, 1397.0773, 1150.752, 4407.1597]
2025-09-12 14:05:35,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [174.0, 447.0, 207.0, 594.0, 1000.0, 835.0, 1000.0, 272.0, 234.0, 899.0]
2025-09-12 14:05:35,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 74/100 (estimated time remaining: 7 hours, 19 minutes, 16 seconds)
2025-09-12 14:20:06,613 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:20:06,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:22:09,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2333.33789 ± 1341.065
2025-09-12 14:22:09,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3639.3435, 482.91177, 1158.1824, 2275.9504, 2434.1362, 5058.175, 3212.8638, 1554.1694, 760.115, 2757.5305]
2025-09-12 14:22:09,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [691.0, 106.0, 232.0, 434.0, 472.0, 1000.0, 610.0, 319.0, 148.0, 554.0]
2025-09-12 14:22:09,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 75/100 (estimated time remaining: 7 hours, 4 minutes, 16 seconds)
2025-09-12 14:36:44,394 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:36:44,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:39:21,138 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2915.73779 ± 1858.694
2025-09-12 14:39:21,140 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5039.972, 1637.7483, 5121.8384, 5138.766, 1569.33, 452.21732, 2477.8557, 2016.345, 649.1043, 5054.2036]
2025-09-12 14:39:21,140 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 314.0, 1000.0, 1000.0, 311.0, 85.0, 491.0, 378.0, 143.0, 1000.0]
2025-09-12 14:39:21,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 49 minutes, 44 seconds)
2025-09-12 14:53:43,558 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:53:43,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:56:39,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3232.13550 ± 1973.374
2025-09-12 14:56:39,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [698.44904, 4411.0645, 4617.7935, 120.04005, 5081.4985, 4553.1655, 5058.3433, 5020.8057, 684.5031, 2075.691]
2025-09-12 14:56:39,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [143.0, 881.0, 923.0, 23.0, 1000.0, 889.0, 1000.0, 1000.0, 139.0, 405.0]
2025-09-12 14:56:39,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (3232.14) for latency ExtremeClogL1U23
2025-09-12 14:56:39,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 43 minutes, 57 seconds)
2025-09-12 15:10:43,775 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:10:43,778 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:13:05,719 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2676.43774 ± 2050.895
2025-09-12 15:13:05,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1882.0712, 4989.576, 2314.9995, 5043.111, 1214.399, 572.1682, 5091.849, 400.15088, 5154.129, 101.92304]
2025-09-12 15:13:05,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [360.0, 1000.0, 458.0, 1000.0, 247.0, 116.0, 1000.0, 72.0, 1000.0, 20.0]
2025-09-12 15:13:05,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 78/100 (estimated time remaining: 6 hours, 22 minutes, 26 seconds)
2025-09-12 15:27:47,290 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:27:47,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:29:27,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1940.30237 ± 1371.671
2025-09-12 15:29:27,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3417.1987, 5243.9023, 1498.3491, 827.82886, 1075.0037, 2645.5085, 982.8686, 825.46985, 1883.6552, 1003.2398]
2025-09-12 15:29:27,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [660.0, 1000.0, 283.0, 170.0, 197.0, 498.0, 192.0, 158.0, 356.0, 213.0]
2025-09-12 15:29:27,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 79/100 (estimated time remaining: 6 hours, 9 minutes, 1 second)
2025-09-12 15:43:28,201 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:43:28,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:46:29,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3460.83252 ± 1865.565
2025-09-12 15:46:29,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [997.2567, 5177.8003, 3631.798, 1228.7509, 2288.9287, 5152.0654, 4928.2925, 5303.171, 661.9185, 5238.3423]
2025-09-12 15:46:29,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [206.0, 1000.0, 692.0, 249.0, 440.0, 1000.0, 930.0, 1000.0, 139.0, 1000.0]
2025-09-12 15:46:29,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (3460.83) for latency ExtremeClogL1U23
2025-09-12 15:46:29,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 54 minutes, 12 seconds)
2025-09-12 16:00:40,294 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:00:40,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:03:04,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2709.90796 ± 1287.311
2025-09-12 16:03:04,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3015.0205, 2397.5386, 2926.1125, 4033.5254, 524.12115, 5195.504, 2547.8723, 912.55035, 3198.646, 2348.1868]
2025-09-12 16:03:04,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [597.0, 456.0, 579.0, 781.0, 111.0, 1000.0, 509.0, 187.0, 617.0, 460.0]
2025-09-12 16:03:04,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 34 minutes, 54 seconds)
2025-09-12 16:18:03,164 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:18:03,168 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:19:44,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1968.15430 ± 1693.583
2025-09-12 16:19:44,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5114.4824, 1757.2925, 1042.0023, 740.26135, 2401.5461, 767.9309, 1001.9729, 1150.1849, 5258.79, 447.0779]
2025-09-12 16:19:44,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [950.0, 335.0, 194.0, 141.0, 457.0, 155.0, 209.0, 220.0, 1000.0, 84.0]
2025-09-12 16:19:44,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 82/100 (estimated time remaining: 5 hours, 15 minutes, 42 seconds)
2025-09-12 16:33:54,101 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:33:54,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:35:52,724 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2235.70825 ± 1778.662
2025-09-12 16:35:52,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2128.714, 753.1838, 2363.2703, 655.37744, 5120.55, 118.40072, 125.08396, 3368.389, 2621.0208, 5103.091]
2025-09-12 16:35:52,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [420.0, 160.0, 463.0, 122.0, 1000.0, 23.0, 24.0, 648.0, 517.0, 1000.0]
2025-09-12 16:35:52,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 58 minutes, 1 second)
2025-09-12 16:50:00,943 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:50:00,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:51:08,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1319.14954 ± 1121.084
2025-09-12 16:51:08,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [693.03577, 546.29443, 994.35815, 1493.6088, 661.9747, 1011.0383, 4533.119, 572.856, 1462.1406, 1223.0696]
2025-09-12 16:51:08,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [128.0, 102.0, 188.0, 299.0, 117.0, 205.0, 850.0, 109.0, 278.0, 231.0]
2025-09-12 16:51:08,969 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 37 minutes, 45 seconds)
2025-09-12 17:04:39,977 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:04:39,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:05:48,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1358.33350 ± 1550.357
2025-09-12 17:05:48,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1060.8849, 2381.6765, 748.25995, 2292.6877, 5362.344, 140.10838, 130.0448, 377.81277, 960.11365, 129.40196]
2025-09-12 17:05:48,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [201.0, 442.0, 141.0, 421.0, 1000.0, 27.0, 25.0, 72.0, 181.0, 25.0]
2025-09-12 17:05:48,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 13 minutes, 49 seconds)
2025-09-12 17:21:12,491 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:21:12,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:22:42,956 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1757.67676 ± 1511.879
2025-09-12 17:22:42,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5356.7646, 114.02689, 331.80167, 1257.9354, 941.3611, 3099.1865, 454.04245, 2577.13, 1922.7754, 1521.7445]
2025-09-12 17:22:42,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 22.0, 64.0, 238.0, 176.0, 593.0, 82.0, 490.0, 366.0, 284.0]
2025-09-12 17:22:42,966 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 58 minutes, 54 seconds)
2025-09-12 17:36:21,954 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:36:21,962 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:38:48,053 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2819.54053 ± 1759.097
2025-09-12 17:38:48,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1151.9062, 1305.3866, 5257.003, 896.4771, 3647.642, 5261.0156, 2390.3342, 5241.279, 1813.1058, 1231.2559]
2025-09-12 17:38:48,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [241.0, 248.0, 1000.0, 174.0, 707.0, 1000.0, 462.0, 1000.0, 344.0, 237.0]
2025-09-12 17:38:48,062 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 41 minutes, 23 seconds)
2025-09-12 17:52:47,365 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:52:47,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:56:01,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3751.02539 ± 1552.101
2025-09-12 17:56:01,635 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5235.139, 1550.65, 2050.5554, 5244.1953, 5137.8687, 5251.042, 5255.2925, 2431.6213, 3525.6094, 1828.282]
2025-09-12 17:56:01,635 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 293.0, 382.0, 1000.0, 1000.0, 1000.0, 1000.0, 470.0, 671.0, 357.0]
2025-09-12 17:56:01,635 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (3751.03) for latency ExtremeClogL1U23
2025-09-12 17:56:01,644 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 28 minutes, 23 seconds)
2025-09-12 18:10:06,878 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:10:06,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:12:50,951 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3207.27319 ± 1820.994
2025-09-12 18:12:50,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1478.3108, 5220.487, 4761.9614, 1156.2981, 5290.1597, 1959.3712, 4100.5464, 2233.5588, 516.7831, 5355.2534]
2025-09-12 18:12:50,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [279.0, 1000.0, 905.0, 244.0, 1000.0, 374.0, 778.0, 438.0, 109.0, 1000.0]
2025-09-12 18:12:50,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 16 minutes, 4 seconds)
2025-09-12 18:27:14,313 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:27:14,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:28:59,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1976.07397 ± 1574.502
2025-09-12 18:28:59,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1542.3472, 849.6539, 2801.369, 3199.1187, 721.959, 741.7492, 554.60065, 3902.3037, 356.21484, 5091.4253]
2025-09-12 18:28:59,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [311.0, 162.0, 563.0, 631.0, 142.0, 141.0, 115.0, 744.0, 74.0, 1000.0]
2025-09-12 18:28:59,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 90/100 (estimated time remaining: 3 hours, 2 minutes, 59 seconds)
2025-09-12 18:43:17,299 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:43:17,301 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:45:46,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2908.67188 ± 1744.356
2025-09-12 18:45:46,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [592.6187, 414.16833, 1515.9297, 3008.9233, 5057.7285, 3703.9707, 5165.422, 5254.83, 2177.8643, 2195.262]
2025-09-12 18:45:46,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [106.0, 78.0, 295.0, 572.0, 929.0, 716.0, 992.0, 1000.0, 413.0, 416.0]
2025-09-12 18:45:46,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 46 minutes, 7 seconds)
2025-09-12 18:59:51,570 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:59:51,574 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:02:00,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2539.26270 ± 1735.661
2025-09-12 19:02:00,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1446.0519, 3066.853, 148.41907, 3967.1768, 5110.23, 1310.4574, 587.2254, 1286.79, 3326.8052, 5142.6196]
2025-09-12 19:02:00,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [274.0, 573.0, 29.0, 729.0, 954.0, 249.0, 112.0, 253.0, 623.0, 958.0]
2025-09-12 19:02:00,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 29 minutes, 46 seconds)
2025-09-12 19:16:08,076 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:16:08,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:18:31,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2751.45508 ± 1597.695
2025-09-12 19:18:31,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2521.248, 3215.5952, 5154.912, 2661.3704, 4172.626, 1051.8646, 5219.7183, 1985.5835, 780.33813, 751.2941]
2025-09-12 19:18:31,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [483.0, 629.0, 1000.0, 512.0, 805.0, 196.0, 1000.0, 382.0, 146.0, 149.0]
2025-09-12 19:18:31,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 11 minutes, 59 seconds)
2025-09-12 19:33:09,496 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:33:09,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:35:11,880 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2422.04712 ± 1805.221
2025-09-12 19:35:11,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [843.441, 1841.5499, 5453.987, 1849.3363, 506.24826, 3586.1536, 388.18317, 5428.0063, 1185.7479, 3137.8167]
2025-09-12 19:35:11,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [163.0, 355.0, 1000.0, 341.0, 105.0, 671.0, 73.0, 1000.0, 230.0, 585.0]
2025-09-12 19:35:11,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 55 minutes, 17 seconds)
2025-09-12 19:49:36,154 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:49:36,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:51:52,936 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2667.62085 ± 1557.834
2025-09-12 19:51:52,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2655.7312, 5318.7754, 1480.4382, 2201.915, 1363.5825, 1399.139, 1944.5924, 5235.484, 933.01715, 4143.5327]
2025-09-12 19:51:52,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [517.0, 1000.0, 284.0, 410.0, 251.0, 261.0, 367.0, 1000.0, 174.0, 767.0]
2025-09-12 19:51:52,955 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 39 minutes, 27 seconds)
2025-09-12 20:05:33,299 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:05:33,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:07:44,880 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2565.80859 ± 1955.747
2025-09-12 20:07:44,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1159.5806, 936.6765, 5256.7153, 5305.627, 659.31006, 5277.11, 3718.7612, 1276.488, 718.19476, 1349.623]
2025-09-12 20:07:44,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [236.0, 200.0, 1000.0, 1000.0, 136.0, 1000.0, 718.0, 247.0, 135.0, 278.0]
2025-09-12 20:07:44,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 21 minutes, 58 seconds)
2025-09-12 20:22:46,036 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:22:46,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:24:48,655 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2435.32031 ± 1928.289
2025-09-12 20:24:48,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [267.23227, 320.1079, 4158.5464, 3674.4358, 5443.4136, 1414.6531, 1173.5533, 4976.793, 145.41736, 2779.0508]
2025-09-12 20:24:48,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [54.0, 62.0, 766.0, 674.0, 1000.0, 267.0, 214.0, 910.0, 28.0, 509.0]
2025-09-12 20:24:48,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 6 minutes, 14 seconds)
2025-09-12 20:39:27,618 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:39:27,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:42:39,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3644.64795 ± 1372.904
2025-09-12 20:42:39,218 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [4637.4976, 2348.198, 837.1524, 5167.198, 5216.861, 2975.104, 5250.3413, 3528.9878, 3513.5154, 2971.6238]
2025-09-12 20:42:39,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [891.0, 456.0, 176.0, 1000.0, 1000.0, 551.0, 1000.0, 671.0, 660.0, 565.0]
2025-09-12 20:42:39,240 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 98/100 (estimated time remaining: 50 minutes, 28 seconds)
2025-09-12 20:56:46,266 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:56:46,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:59:39,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3271.25635 ± 2060.915
2025-09-12 20:59:39,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1265.5393, 4127.444, 494.7007, 119.34791, 5037.3813, 5051.8345, 1327.8104, 5174.2373, 5117.4233, 4996.844]
2025-09-12 20:59:39,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [241.0, 790.0, 105.0, 23.0, 1000.0, 1000.0, 274.0, 1000.0, 1000.0, 1000.0]
2025-09-12 20:59:39,929 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 99/100 (estimated time remaining: 33 minutes, 47 seconds)
2025-09-12 21:13:25,353 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 21:13:25,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 21:16:00,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3058.62085 ± 1682.968
2025-09-12 21:16:00,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [999.53864, 2162.1382, 1650.8054, 2355.1133, 4564.051, 5372.2754, 3740.3853, 4067.042, 428.63583, 5246.222]
2025-09-12 21:16:00,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [195.0, 409.0, 310.0, 440.0, 858.0, 1000.0, 696.0, 768.0, 90.0, 1000.0]
2025-09-12 21:16:00,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 100/100 (estimated time remaining: 16 minutes, 49 seconds)
2025-09-12 21:30:20,963 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 21:30:20,965 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 21:32:57,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2969.24805 ± 1969.033
2025-09-12 21:32:57,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [758.34106, 1048.8344, 5230.438, 2743.4517, 5196.534, 5210.7656, 274.6015, 5269.5293, 2497.658, 1462.3256]
2025-09-12 21:32:57,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [150.0, 220.0, 1000.0, 544.0, 1000.0, 1000.0, 53.0, 1000.0, 466.0, 276.0]
2025-09-12 21:32:57,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1251 [DEBUG]: Training session finished
