2025-09-11 19:28:37,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc15-humanoid/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 19:28:37,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc15-humanoid/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 19:28:37,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x1498556e4dd0>}
2025-09-11 19:28:37,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1111 [DEBUG]: using device: cuda
2025-09-11 19:28:37,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1133 [INFO]: Creating new trainer
2025-09-11 19:28:37,420 baseline-mbpac-noiseperc15-humanoid:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (tanh_refit): NNTanhRefit(
    scale: tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000,
             0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000]]), shift: tensor([[-0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000]])
  )
)
2025-09-11 19:28:37,420 baseline-mbpac-noiseperc15-humanoid:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=393, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 19:28:37,431 baseline-mbpac-noiseperc15-humanoid:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=376, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 19:28:38,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1194 [DEBUG]: Starting training session...
2025-09-11 19:28:38,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 1/100
2025-09-11 19:41:16,615 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:41:16,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:41:32,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 299.51425 ± 48.969
2025-09-11 19:41:32,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [295.71918, 431.42664, 271.6037, 272.18768, 326.15543, 244.12822, 277.51285, 271.91724, 301.57797, 302.9138]
2025-09-11 19:41:32,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [55.0, 88.0, 52.0, 51.0, 63.0, 46.0, 53.0, 53.0, 58.0, 58.0]
2025-09-11 19:41:32,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (299.51) for latency ExtremeClogL1U23
2025-09-11 19:41:32,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 2/100 (estimated time remaining: 21 hours, 17 minutes, 2 seconds)
2025-09-11 19:55:38,037 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:55:38,053 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:55:57,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 339.10553 ± 97.869
2025-09-11 19:55:57,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [368.40924, 418.324, 463.99817, 101.95668, 439.7062, 365.3773, 279.7655, 296.8119, 313.20822, 343.4979]
2025-09-11 19:55:57,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [75.0, 79.0, 89.0, 20.0, 89.0, 77.0, 60.0, 65.0, 70.0, 68.0]
2025-09-11 19:55:57,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (339.11) for latency ExtremeClogL1U23
2025-09-11 19:55:57,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 3/100 (estimated time remaining: 22 hours, 18 minutes, 16 seconds)
2025-09-11 20:10:12,634 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:10:12,636 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:10:37,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 463.34308 ± 125.582
2025-09-11 20:10:37,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [385.9906, 419.1018, 680.7372, 437.16562, 656.63715, 350.86807, 315.61557, 380.14172, 603.1477, 404.02536]
2025-09-11 20:10:37,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [71.0, 82.0, 130.0, 90.0, 137.0, 65.0, 59.0, 72.0, 113.0, 76.0]
2025-09-11 20:10:37,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (463.34) for latency ExtremeClogL1U23
2025-09-11 20:10:37,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 4/100 (estimated time remaining: 22 hours, 37 minutes, 19 seconds)
2025-09-11 20:24:44,942 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:24:44,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:25:04,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 378.01068 ± 51.258
2025-09-11 20:25:04,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [400.1323, 320.16824, 336.4289, 420.21136, 307.15552, 432.91843, 414.48492, 461.86298, 335.25708, 351.487]
2025-09-11 20:25:04,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [74.0, 59.0, 61.0, 76.0, 59.0, 80.0, 77.0, 84.0, 62.0, 65.0]
2025-09-11 20:25:04,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 5/100 (estimated time remaining: 22 hours, 34 minutes, 17 seconds)
2025-09-11 20:39:10,886 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:39:10,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:39:34,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 448.08057 ± 176.087
2025-09-11 20:39:34,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [365.8494, 337.5013, 392.20227, 406.3223, 421.3926, 634.97687, 410.7974, 810.61194, 128.38692, 572.76465]
2025-09-11 20:39:34,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [71.0, 63.0, 72.0, 76.0, 78.0, 127.0, 87.0, 157.0, 25.0, 105.0]
2025-09-11 20:39:34,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 6/100 (estimated time remaining: 22 hours, 27 minutes, 40 seconds)
2025-09-11 20:53:50,060 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:53:50,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:54:13,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 449.05478 ± 60.504
2025-09-11 20:54:13,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [399.56693, 507.4428, 343.00574, 459.2812, 464.55518, 446.50876, 507.87384, 376.18896, 436.01526, 550.1093]
2025-09-11 20:54:13,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [76.0, 96.0, 77.0, 90.0, 102.0, 83.0, 92.0, 70.0, 80.0, 103.0]
2025-09-11 20:54:13,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 7/100 (estimated time remaining: 22 hours, 46 minutes, 24 seconds)
2025-09-11 21:08:09,340 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:08:09,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:08:33,593 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 471.12808 ± 142.766
2025-09-11 21:08:33,593 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [437.60126, 505.36136, 383.32858, 474.31723, 393.6232, 875.86597, 416.50647, 354.35297, 484.05325, 386.27057]
2025-09-11 21:08:33,593 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [80.0, 95.0, 69.0, 88.0, 72.0, 166.0, 80.0, 64.0, 103.0, 70.0]
2025-09-11 21:08:33,593 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (471.13) for latency ExtremeClogL1U23
2025-09-11 21:08:33,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 8/100 (estimated time remaining: 22 hours, 30 minutes, 25 seconds)
2025-09-11 21:22:34,177 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:22:34,179 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:22:56,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 409.37869 ± 75.796
2025-09-11 21:22:56,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [326.368, 362.50626, 443.37097, 360.43845, 307.13025, 489.6754, 516.72577, 335.04944, 507.19894, 445.32324]
2025-09-11 21:22:56,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [59.0, 66.0, 82.0, 67.0, 68.0, 105.0, 95.0, 62.0, 97.0, 99.0]
2025-09-11 21:22:56,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 9/100 (estimated time remaining: 22 hours, 10 minutes, 30 seconds)
2025-09-11 21:36:59,936 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:36:59,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:37:27,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 515.71814 ± 184.556
2025-09-11 21:37:27,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [478.76404, 458.76852, 118.74407, 439.05087, 807.3454, 596.46497, 639.2849, 470.56335, 405.5086, 742.68665]
2025-09-11 21:37:27,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [89.0, 90.0, 23.0, 80.0, 170.0, 112.0, 118.0, 91.0, 85.0, 144.0]
2025-09-11 21:37:27,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (515.72) for latency ExtremeClogL1U23
2025-09-11 21:37:27,495 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 10/100 (estimated time remaining: 21 hours, 57 minutes, 24 seconds)
2025-09-11 21:51:35,735 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:51:35,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:52:00,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 453.60376 ± 86.873
2025-09-11 21:52:00,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [481.86258, 530.6095, 583.24164, 346.1233, 581.8173, 352.4701, 354.32794, 416.5556, 410.43362, 478.59607]
2025-09-11 21:52:00,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [90.0, 96.0, 131.0, 65.0, 111.0, 68.0, 76.0, 78.0, 76.0, 103.0]
2025-09-11 21:52:00,024 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 11/100 (estimated time remaining: 21 hours, 43 minutes, 39 seconds)
2025-09-11 22:06:00,223 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:06:00,229 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:06:26,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 482.76230 ± 118.422
2025-09-11 22:06:26,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [425.70132, 653.2863, 694.76526, 509.8402, 451.87082, 250.79332, 507.10132, 484.98672, 417.3405, 431.93677]
2025-09-11 22:06:26,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [90.0, 135.0, 139.0, 95.0, 85.0, 53.0, 108.0, 96.0, 77.0, 79.0]
2025-09-11 22:06:26,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 12/100 (estimated time remaining: 21 hours, 25 minutes, 20 seconds)
2025-09-11 22:20:30,629 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:20:30,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:20:55,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 459.44174 ± 159.780
2025-09-11 22:20:55,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [376.25943, 451.8756, 368.4336, 615.86566, 513.1441, 287.50943, 383.44135, 339.86566, 858.72925, 399.29352]
2025-09-11 22:20:55,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [72.0, 83.0, 68.0, 124.0, 103.0, 54.0, 85.0, 64.0, 183.0, 87.0]
2025-09-11 22:20:56,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 13/100 (estimated time remaining: 21 hours, 13 minutes, 46 seconds)
2025-09-11 22:35:05,462 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:35:05,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:35:33,203 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 499.07562 ± 169.236
2025-09-11 22:35:33,203 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [547.57245, 394.59003, 350.8868, 316.8282, 941.9216, 428.6739, 392.21945, 546.11945, 519.499, 552.446]
2025-09-11 22:35:33,203 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [122.0, 75.0, 76.0, 70.0, 185.0, 95.0, 73.0, 102.0, 98.0, 118.0]
2025-09-11 22:35:33,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 14/100 (estimated time remaining: 21 hours, 3 minutes, 34 seconds)
2025-09-11 22:49:36,660 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:49:36,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:50:02,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 519.25555 ± 157.148
2025-09-11 22:50:02,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [413.53955, 484.93533, 929.1058, 470.00967, 389.7914, 647.2683, 467.28098, 356.5846, 547.6492, 486.39038]
2025-09-11 22:50:02,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [77.0, 89.0, 177.0, 88.0, 72.0, 122.0, 86.0, 66.0, 101.0, 90.0]
2025-09-11 22:50:02,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (519.26) for latency ExtremeClogL1U23
2025-09-11 22:50:02,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 15/100 (estimated time remaining: 20 hours, 48 minutes, 29 seconds)
2025-09-11 23:04:04,565 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:04:04,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:04:27,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 443.82455 ± 180.147
2025-09-11 23:04:27,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [473.00507, 475.90652, 538.3907, 558.43384, 512.3166, 108.56065, 165.37556, 743.06934, 341.88486, 521.3027]
2025-09-11 23:04:27,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [91.0, 96.0, 113.0, 105.0, 95.0, 21.0, 32.0, 138.0, 65.0, 96.0]
2025-09-11 23:04:27,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 16/100 (estimated time remaining: 20 hours, 31 minutes, 51 seconds)
2025-09-11 23:18:31,818 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:18:31,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:18:57,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 480.47314 ± 157.803
2025-09-11 23:18:57,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [544.9412, 503.61627, 451.211, 456.48325, 737.31604, 380.73755, 96.42623, 524.50244, 611.82324, 497.67438]
2025-09-11 23:18:57,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [107.0, 104.0, 85.0, 98.0, 152.0, 70.0, 19.0, 106.0, 128.0, 91.0]
2025-09-11 23:18:57,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 17/100 (estimated time remaining: 20 hours, 18 minutes, 22 seconds)
2025-09-11 23:33:06,519 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:33:06,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:33:30,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 474.50006 ± 126.735
2025-09-11 23:33:30,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [543.8125, 546.55225, 496.98007, 484.54233, 376.97205, 548.34015, 680.36206, 172.48683, 447.67383, 447.27814]
2025-09-11 23:33:30,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [99.0, 99.0, 96.0, 90.0, 69.0, 104.0, 134.0, 33.0, 85.0, 96.0]
2025-09-11 23:33:31,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 18/100 (estimated time remaining: 20 hours, 4 minutes, 53 seconds)
2025-09-11 23:47:37,007 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:47:37,009 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:48:03,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 502.73633 ± 97.529
2025-09-11 23:48:03,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [439.03683, 499.0644, 681.1823, 541.77515, 365.64496, 588.0042, 407.7673, 418.75674, 621.46356, 464.66803]
2025-09-11 23:48:03,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [84.0, 97.0, 126.0, 100.0, 67.0, 114.0, 90.0, 76.0, 119.0, 98.0]
2025-09-11 23:48:03,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 19/100 (estimated time remaining: 19 hours, 49 minutes, 5 seconds)
2025-09-12 00:02:12,380 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:02:12,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:02:41,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 532.19983 ± 153.370
2025-09-12 00:02:41,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [542.43823, 349.66327, 561.06244, 481.63223, 672.5902, 415.389, 527.9313, 361.08844, 510.20932, 899.9941]
2025-09-12 00:02:41,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [111.0, 78.0, 107.0, 92.0, 125.0, 76.0, 113.0, 78.0, 109.0, 178.0]
2025-09-12 00:02:41,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (532.20) for latency ExtremeClogL1U23
2025-09-12 00:02:41,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 20/100 (estimated time remaining: 19 hours, 36 minutes, 48 seconds)
2025-09-12 00:16:44,585 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:16:44,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:17:17,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 632.00458 ± 148.961
2025-09-12 00:17:17,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [579.0445, 472.37878, 703.68744, 495.3159, 746.6447, 962.43054, 691.20685, 608.29596, 423.75485, 637.286]
2025-09-12 00:17:17,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [109.0, 87.0, 140.0, 91.0, 141.0, 187.0, 136.0, 126.0, 77.0, 119.0]
2025-09-12 00:17:17,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (632.00) for latency ExtremeClogL1U23
2025-09-12 00:17:17,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 21/100 (estimated time remaining: 19 hours, 25 minutes, 19 seconds)
2025-09-12 00:31:23,310 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:31:23,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:31:49,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 526.16199 ± 186.311
2025-09-12 00:31:49,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [453.47275, 453.1248, 731.1485, 551.15155, 512.13153, 123.87645, 818.82587, 713.348, 441.9551, 462.5857]
2025-09-12 00:31:49,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [82.0, 87.0, 132.0, 110.0, 95.0, 24.0, 146.0, 140.0, 80.0, 82.0]
2025-09-12 00:31:49,955 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 22/100 (estimated time remaining: 19 hours, 11 minutes, 24 seconds)
2025-09-12 00:45:54,727 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:45:54,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:46:22,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 542.75885 ± 112.284
2025-09-12 00:46:22,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [497.5812, 455.6149, 384.33514, 493.67014, 736.089, 661.4678, 559.58026, 433.7868, 697.8984, 507.5648]
2025-09-12 00:46:22,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [92.0, 82.0, 71.0, 90.0, 143.0, 127.0, 102.0, 80.0, 149.0, 93.0]
2025-09-12 00:46:22,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 23/100 (estimated time remaining: 18 hours, 56 minutes, 38 seconds)
2025-09-12 01:00:31,082 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:00:31,084 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:00:58,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 524.06989 ± 184.941
2025-09-12 01:00:58,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [612.004, 738.85925, 510.7632, 577.85876, 620.04565, 394.3566, 800.69855, 465.93332, 401.1282, 119.05091]
2025-09-12 01:00:58,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [130.0, 149.0, 93.0, 111.0, 116.0, 84.0, 162.0, 86.0, 74.0, 23.0]
2025-09-12 01:00:58,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 24/100 (estimated time remaining: 18 hours, 42 minutes, 58 seconds)
2025-09-12 01:15:03,470 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:15:03,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:15:33,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 560.09601 ± 209.268
2025-09-12 01:15:33,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [715.0703, 111.9424, 432.43536, 617.29144, 739.2335, 549.12537, 534.5046, 393.74136, 923.97986, 583.6357]
2025-09-12 01:15:33,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [152.0, 22.0, 82.0, 116.0, 154.0, 102.0, 100.0, 87.0, 179.0, 123.0]
2025-09-12 01:15:33,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 25/100 (estimated time remaining: 18 hours, 27 minutes, 40 seconds)
2025-09-12 01:29:40,346 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:29:40,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:30:09,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 550.70667 ± 126.517
2025-09-12 01:30:09,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [531.62646, 590.75055, 661.269, 307.87396, 345.48654, 611.88354, 682.46277, 569.33167, 508.76688, 697.615]
2025-09-12 01:30:09,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [119.0, 113.0, 136.0, 67.0, 62.0, 116.0, 127.0, 105.0, 110.0, 135.0]
2025-09-12 01:30:09,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 26/100 (estimated time remaining: 18 hours, 13 minutes)
2025-09-12 01:44:12,829 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:44:12,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:44:41,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 584.35779 ± 279.809
2025-09-12 01:44:41,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [108.496864, 262.54968, 924.399, 596.6839, 1112.6434, 586.88446, 357.9485, 627.80676, 641.9163, 624.2488]
2025-09-12 01:44:41,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [21.0, 54.0, 170.0, 112.0, 211.0, 106.0, 66.0, 117.0, 117.0, 113.0]
2025-09-12 01:44:41,985 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 27/100 (estimated time remaining: 17 hours, 58 minutes, 26 seconds)
2025-09-12 01:58:51,616 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:58:51,618 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:59:24,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 623.64954 ± 111.488
2025-09-12 01:59:24,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [489.28345, 712.6779, 740.2168, 638.80646, 635.645, 504.25232, 836.8574, 510.87686, 650.1522, 517.7271]
2025-09-12 01:59:24,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [93.0, 151.0, 151.0, 119.0, 120.0, 93.0, 162.0, 93.0, 135.0, 96.0]
2025-09-12 01:59:24,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 28/100 (estimated time remaining: 17 hours, 46 minutes, 14 seconds)
2025-09-12 02:13:37,709 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:13:37,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:14:07,338 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 555.79242 ± 303.541
2025-09-12 02:14:07,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [537.2688, 1198.0936, 649.46344, 660.85333, 774.6132, 391.10443, 422.88882, 683.20624, 107.7772, 132.65457]
2025-09-12 02:14:07,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [97.0, 235.0, 132.0, 142.0, 147.0, 72.0, 92.0, 144.0, 21.0, 26.0]
2025-09-12 02:14:07,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 29/100 (estimated time remaining: 17 hours, 33 minutes, 15 seconds)
2025-09-12 02:28:04,036 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:28:04,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:28:44,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 773.29785 ± 85.479
2025-09-12 02:28:44,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [878.97205, 800.5565, 663.46967, 675.2202, 671.9492, 773.84875, 735.17957, 753.40564, 886.4933, 893.8838]
2025-09-12 02:28:44,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [169.0, 162.0, 130.0, 123.0, 132.0, 138.0, 140.0, 143.0, 169.0, 172.0]
2025-09-12 02:28:44,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (773.30) for latency ExtremeClogL1U23
2025-09-12 02:28:44,168 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 30/100 (estimated time remaining: 17 hours, 19 minutes, 5 seconds)
2025-09-12 02:42:52,013 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:42:52,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:43:21,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 546.20740 ± 231.489
2025-09-12 02:43:21,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [850.7763, 840.9617, 130.7683, 639.1286, 644.464, 659.28406, 180.19643, 571.0126, 412.77917, 532.7031]
2025-09-12 02:43:21,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [165.0, 163.0, 25.0, 123.0, 141.0, 142.0, 35.0, 120.0, 92.0, 96.0]
2025-09-12 02:43:21,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 31/100 (estimated time remaining: 17 hours, 4 minutes, 48 seconds)
2025-09-12 02:57:39,363 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:57:39,365 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:58:19,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 773.64221 ± 257.240
2025-09-12 02:58:19,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [767.213, 497.64023, 1183.5813, 760.9022, 439.8066, 710.6636, 597.94037, 595.3323, 955.7865, 1227.5553]
2025-09-12 02:58:19,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [140.0, 91.0, 221.0, 140.0, 83.0, 135.0, 112.0, 110.0, 200.0, 244.0]
2025-09-12 02:58:19,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (773.64) for latency ExtremeClogL1U23
2025-09-12 02:58:19,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 32/100 (estimated time remaining: 16 hours, 55 minutes, 59 seconds)
2025-09-12 03:12:18,998 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:12:19,002 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:12:58,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 755.98730 ± 248.484
2025-09-12 03:12:58,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1017.115, 720.71716, 667.1748, 1009.26514, 753.99304, 165.47174, 568.1989, 1051.92, 803.16736, 802.8501]
2025-09-12 03:12:58,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [200.0, 131.0, 119.0, 192.0, 141.0, 32.0, 123.0, 193.0, 163.0, 163.0]
2025-09-12 03:12:58,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 33/100 (estimated time remaining: 16 hours, 40 minutes, 28 seconds)
2025-09-12 03:27:03,547 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:27:03,549 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:27:46,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 836.92108 ± 358.391
2025-09-12 03:27:46,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1221.7606, 1022.2256, 939.84406, 1031.9374, 125.058105, 476.98596, 643.62494, 1190.3014, 502.44885, 1215.024]
2025-09-12 03:27:46,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [248.0, 191.0, 167.0, 194.0, 24.0, 88.0, 120.0, 220.0, 93.0, 223.0]
2025-09-12 03:27:46,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (836.92) for latency ExtremeClogL1U23
2025-09-12 03:27:46,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 34/100 (estimated time remaining: 16 hours, 26 minutes, 52 seconds)
2025-09-12 03:41:53,515 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:41:53,517 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:42:28,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 679.02472 ± 131.030
2025-09-12 03:42:28,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [650.75446, 698.8476, 523.63837, 729.6985, 889.68036, 711.5359, 841.06665, 660.7847, 673.0298, 411.2105]
2025-09-12 03:42:28,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [127.0, 133.0, 106.0, 159.0, 165.0, 131.0, 156.0, 144.0, 127.0, 77.0]
2025-09-12 03:42:28,956 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 35/100 (estimated time remaining: 16 hours, 13 minutes, 27 seconds)
2025-09-12 03:56:39,730 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:56:39,732 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:57:21,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 820.95184 ± 168.135
2025-09-12 03:57:21,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [619.38184, 897.4277, 597.23975, 631.99835, 797.14734, 1043.8477, 1060.6971, 782.2025, 1015.799, 763.77747]
2025-09-12 03:57:21,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [112.0, 176.0, 111.0, 119.0, 146.0, 189.0, 207.0, 142.0, 201.0, 141.0]
2025-09-12 03:57:21,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 36/100 (estimated time remaining: 16 hours, 2 minutes, 1 second)
2025-09-12 04:11:28,833 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:11:28,837 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:12:09,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 778.35437 ± 332.064
2025-09-12 04:12:09,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [755.94116, 387.93744, 1300.488, 687.9044, 942.70233, 580.642, 994.7335, 1097.6257, 916.7113, 118.85732]
2025-09-12 04:12:09,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [149.0, 72.0, 248.0, 127.0, 182.0, 126.0, 189.0, 203.0, 181.0, 23.0]
2025-09-12 04:12:09,335 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 37/100 (estimated time remaining: 15 hours, 45 minutes, 4 seconds)
2025-09-12 04:26:16,130 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:26:16,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:26:46,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 586.48889 ± 289.440
2025-09-12 04:26:46,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [795.1886, 812.2181, 757.64185, 114.34877, 328.73108, 350.67126, 375.20798, 433.72662, 809.2974, 1087.8573]
2025-09-12 04:26:46,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [145.0, 148.0, 144.0, 22.0, 58.0, 78.0, 80.0, 91.0, 155.0, 199.0]
2025-09-12 04:26:46,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 38/100 (estimated time remaining: 15 hours, 29 minutes, 54 seconds)
2025-09-12 04:40:49,403 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:40:49,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:41:20,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 609.92725 ± 325.376
2025-09-12 04:41:20,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1153.2905, 898.6532, 950.4792, 790.9755, 125.81128, 375.15198, 616.0629, 135.57625, 564.90015, 488.37103]
2025-09-12 04:41:20,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [215.0, 170.0, 178.0, 154.0, 24.0, 68.0, 118.0, 26.0, 111.0, 104.0]
2025-09-12 04:41:20,597 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 39/100 (estimated time remaining: 15 hours, 12 minutes, 18 seconds)
2025-09-12 04:55:24,765 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:55:24,773 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:56:06,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 805.24500 ± 383.472
2025-09-12 04:56:06,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1086.3734, 566.43085, 933.6652, 819.2087, 1430.4437, 933.07587, 883.0242, 181.25272, 141.25423, 1077.7214]
2025-09-12 04:56:06,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [203.0, 105.0, 185.0, 162.0, 295.0, 180.0, 164.0, 35.0, 27.0, 214.0]
2025-09-12 04:56:06,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 40/100 (estimated time remaining: 14 hours, 58 minutes, 18 seconds)
2025-09-12 05:10:14,773 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:10:14,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:10:52,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 714.76459 ± 347.663
2025-09-12 05:10:52,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [407.51318, 656.4093, 84.564026, 621.149, 865.2101, 1278.1272, 536.3992, 620.83545, 799.5239, 1277.9144]
2025-09-12 05:10:52,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [76.0, 131.0, 17.0, 125.0, 156.0, 237.0, 117.0, 112.0, 163.0, 263.0]
2025-09-12 05:10:52,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 41/100 (estimated time remaining: 14 hours, 42 minutes, 10 seconds)
2025-09-12 05:25:02,264 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:25:02,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:25:49,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 921.66876 ± 344.217
2025-09-12 05:25:49,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1532.7789, 305.71152, 811.60175, 822.73206, 962.28906, 808.4172, 1445.9072, 995.52606, 583.1252, 948.5983]
2025-09-12 05:25:49,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [292.0, 53.0, 149.0, 163.0, 173.0, 148.0, 276.0, 190.0, 108.0, 189.0]
2025-09-12 05:25:49,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (921.67) for latency ExtremeClogL1U23
2025-09-12 05:25:49,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 42/100 (estimated time remaining: 14 hours, 29 minutes, 18 seconds)
2025-09-12 05:39:53,057 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:39:53,059 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:40:35,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 813.23865 ± 377.469
2025-09-12 05:40:35,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1130.838, 937.2015, 861.955, 386.41656, 1096.4275, 1029.8099, 919.35675, 119.422005, 323.48123, 1327.4775]
2025-09-12 05:40:35,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [222.0, 193.0, 179.0, 73.0, 204.0, 189.0, 166.0, 23.0, 61.0, 261.0]
2025-09-12 05:40:35,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 43/100 (estimated time remaining: 14 hours, 16 minutes, 20 seconds)
2025-09-12 05:54:47,843 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:54:47,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:55:37,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 969.94611 ± 366.736
2025-09-12 05:55:37,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [730.6677, 1233.7935, 752.80884, 1101.325, 1385.7719, 1076.4148, 1490.7123, 915.9743, 869.8771, 142.11642]
2025-09-12 05:55:37,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [136.0, 221.0, 142.0, 214.0, 274.0, 195.0, 284.0, 168.0, 176.0, 27.0]
2025-09-12 05:55:37,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (969.95) for latency ExtremeClogL1U23
2025-09-12 05:55:37,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 44/100 (estimated time remaining: 14 hours, 6 minutes, 50 seconds)
2025-09-12 06:09:42,147 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:09:42,154 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:10:32,502 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 970.03699 ± 534.416
2025-09-12 06:10:32,502 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1691.0197, 1227.4192, 1083.2056, 814.5577, 107.23811, 1364.575, 803.3533, 1704.0454, 792.03186, 112.92415]
2025-09-12 06:10:32,502 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [315.0, 228.0, 200.0, 154.0, 21.0, 249.0, 155.0, 362.0, 153.0, 22.0]
2025-09-12 06:10:32,502 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (970.04) for latency ExtremeClogL1U23
2025-09-12 06:10:32,513 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 45/100 (estimated time remaining: 13 hours, 53 minutes, 35 seconds)
2025-09-12 06:24:46,657 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:24:46,659 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:25:35,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 953.71698 ± 573.999
2025-09-12 06:25:35,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1494.9933, 1381.8605, 799.93835, 1591.449, 1436.4845, 929.3511, 171.69508, 162.8325, 128.54672, 1440.0186]
2025-09-12 06:25:35,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [276.0, 256.0, 146.0, 298.0, 281.0, 177.0, 33.0, 31.0, 25.0, 262.0]
2025-09-12 06:25:35,669 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 46/100 (estimated time remaining: 13 hours, 41 minutes, 51 seconds)
2025-09-12 06:39:35,777 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:39:35,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:40:33,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1081.74146 ± 349.697
2025-09-12 06:40:33,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1063.4525, 1015.9575, 1459.8741, 1089.9165, 907.93896, 595.99945, 530.9818, 1774.7789, 1222.8335, 1155.6809]
2025-09-12 06:40:33,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [208.0, 191.0, 304.0, 199.0, 166.0, 129.0, 93.0, 336.0, 250.0, 234.0]
2025-09-12 06:40:33,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1081.74) for latency ExtremeClogL1U23
2025-09-12 06:40:33,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 47/100 (estimated time remaining: 13 hours, 27 minutes, 5 seconds)
2025-09-12 06:54:40,549 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:40,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:55:43,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1210.68079 ± 561.084
2025-09-12 06:55:43,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1199.0188, 1186.6815, 889.2163, 1288.6595, 1081.5317, 958.2416, 1081.1116, 2104.147, 2192.838, 125.36245]
2025-09-12 06:55:43,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [211.0, 218.0, 177.0, 231.0, 211.0, 185.0, 204.0, 402.0, 433.0, 25.0]
2025-09-12 06:55:43,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1210.68) for latency ExtremeClogL1U23
2025-09-12 06:55:43,585 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 48/100 (estimated time remaining: 13 hours, 16 minutes, 21 seconds)
2025-09-12 07:09:53,823 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:09:53,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:10:52,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1137.49719 ± 287.172
2025-09-12 07:10:52,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [593.26685, 1399.9617, 874.9967, 1165.5957, 1166.4214, 942.579, 1238.4358, 1540.6904, 1511.5345, 941.49]
2025-09-12 07:10:52,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [109.0, 250.0, 165.0, 218.0, 226.0, 167.0, 236.0, 290.0, 295.0, 175.0]
2025-09-12 07:10:52,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 49/100 (estimated time remaining: 13 hours, 2 minutes, 30 seconds)
2025-09-12 07:24:54,853 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:24:54,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:25:50,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1078.85132 ± 581.273
2025-09-12 07:25:50,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [202.87453, 909.97296, 708.51556, 1425.8827, 2446.9387, 1482.0603, 1033.0331, 976.0128, 1051.3135, 551.9081]
2025-09-12 07:25:50,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [39.0, 171.0, 129.0, 265.0, 470.0, 281.0, 188.0, 183.0, 200.0, 109.0]
2025-09-12 07:25:50,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 50/100 (estimated time remaining: 12 hours, 48 minutes, 4 seconds)
2025-09-12 07:40:04,836 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:40:04,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:40:57,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 982.40839 ± 171.122
2025-09-12 07:40:57,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1112.9077, 1057.448, 993.1078, 736.365, 1030.9296, 973.6921, 1343.8243, 993.7284, 751.73615, 830.34467]
2025-09-12 07:40:57,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [222.0, 214.0, 196.0, 139.0, 210.0, 187.0, 257.0, 182.0, 143.0, 162.0]
2025-09-12 07:40:57,338 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 51/100 (estimated time remaining: 12 hours, 33 minutes, 36 seconds)
2025-09-12 07:55:05,811 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:55:05,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:56:07,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1204.65112 ± 685.502
2025-09-12 07:56:07,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [484.01337, 1067.7935, 1722.1497, 1496.9624, 185.9369, 1294.0426, 948.3364, 808.59064, 2797.8184, 1240.8682]
2025-09-12 07:56:07,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [86.0, 218.0, 330.0, 281.0, 36.0, 230.0, 172.0, 146.0, 533.0, 229.0]
2025-09-12 07:56:07,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 52/100 (estimated time remaining: 12 hours, 20 minutes, 31 seconds)
2025-09-12 08:10:18,734 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:10:18,736 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:11:08,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 945.66418 ± 545.318
2025-09-12 08:11:08,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1365.5627, 1290.6403, 857.6404, 831.9679, 1914.6862, 96.84138, 482.95685, 1580.8112, 602.96436, 432.56927]
2025-09-12 08:11:08,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [245.0, 240.0, 174.0, 163.0, 364.0, 19.0, 92.0, 307.0, 107.0, 96.0]
2025-09-12 08:11:08,045 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 53/100 (estimated time remaining: 12 hours, 3 minutes, 54 seconds)
2025-09-12 08:25:22,300 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:25:22,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:26:20,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1089.52808 ± 677.160
2025-09-12 08:26:20,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1156.8071, 2783.7341, 1541.5837, 529.72644, 611.283, 559.97485, 660.98193, 1561.1727, 886.8282, 603.18805]
2025-09-12 08:26:20,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [225.0, 578.0, 298.0, 113.0, 110.0, 107.0, 136.0, 288.0, 177.0, 123.0]
2025-09-12 08:26:20,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 54/100 (estimated time remaining: 11 hours, 49 minutes, 30 seconds)
2025-09-12 08:40:49,918 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:40:49,922 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:41:54,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1288.17419 ± 750.132
2025-09-12 08:41:54,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2678.8035, 1327.0184, 125.43053, 1828.6151, 1080.6981, 1458.8676, 1807.8981, 95.873634, 1612.7097, 865.82733]
2025-09-12 08:41:54,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [487.0, 242.0, 24.0, 339.0, 191.0, 276.0, 339.0, 19.0, 315.0, 159.0]
2025-09-12 08:41:54,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1288.17) for latency ExtremeClogL1U23
2025-09-12 08:41:54,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 55/100 (estimated time remaining: 11 hours, 39 minutes, 46 seconds)
2025-09-12 08:55:46,221 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:55:46,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:56:49,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1245.43958 ± 470.290
2025-09-12 08:56:49,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1036.7683, 782.7942, 1213.3499, 726.2309, 761.63153, 919.10596, 2014.4912, 2041.3528, 1545.0543, 1413.6161]
2025-09-12 08:56:49,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [204.0, 152.0, 227.0, 134.0, 152.0, 176.0, 378.0, 382.0, 277.0, 271.0]
2025-09-12 08:56:49,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 56/100 (estimated time remaining: 11 hours, 22 minutes, 49 seconds)
2025-09-12 09:10:58,935 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:10:58,937 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:11:55,097 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1091.89038 ± 671.458
2025-09-12 09:11:55,097 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [498.1915, 2537.187, 722.31995, 1355.9946, 1244.4487, 555.0768, 757.35614, 1253.0294, 162.85141, 1832.4495]
2025-09-12 09:11:55,097 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [91.0, 476.0, 137.0, 269.0, 242.0, 102.0, 129.0, 232.0, 31.0, 339.0]
2025-09-12 09:11:55,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 7 minutes, 1 second)
2025-09-12 09:26:05,774 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:26:05,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:27:24,434 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1512.99805 ± 828.109
2025-09-12 09:27:24,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [724.4359, 551.9548, 1384.2312, 1164.9309, 1089.3916, 3553.1865, 1141.9612, 1627.4547, 1554.7186, 2337.7148]
2025-09-12 09:27:24,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [147.0, 102.0, 266.0, 217.0, 203.0, 652.0, 222.0, 312.0, 298.0, 437.0]
2025-09-12 09:27:24,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1513.00) for latency ExtremeClogL1U23
2025-09-12 09:27:24,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 58/100 (estimated time remaining: 10 hours, 55 minutes, 57 seconds)
2025-09-12 09:41:30,000 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:41:30,003 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:42:35,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1278.83313 ± 465.069
2025-09-12 09:42:35,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1446.5109, 1818.265, 896.0624, 1365.1478, 787.3072, 2282.1414, 1119.9373, 1408.6543, 817.0914, 847.2128]
2025-09-12 09:42:35,144 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [280.0, 324.0, 164.0, 263.0, 143.0, 416.0, 207.0, 266.0, 149.0, 154.0]
2025-09-12 09:42:35,152 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 59/100 (estimated time remaining: 10 hours, 40 minutes, 23 seconds)
2025-09-12 09:57:26,080 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:57:26,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:58:35,908 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1363.93811 ± 422.280
2025-09-12 09:58:35,909 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1227.5133, 1469.6841, 872.90924, 2079.399, 1016.6468, 1877.3114, 907.499, 1879.4012, 989.43896, 1319.5779]
2025-09-12 09:58:35,909 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [230.0, 261.0, 162.0, 394.0, 196.0, 359.0, 164.0, 364.0, 187.0, 250.0]
2025-09-12 09:58:35,937 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 60/100 (estimated time remaining: 10 hours, 28 minutes, 52 seconds)
2025-09-12 10:12:16,958 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:12:16,961 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:13:30,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1438.87939 ± 434.090
2025-09-12 10:13:30,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1047.7324, 1007.8915, 1584.2495, 939.78577, 2159.702, 2110.8276, 1581.2263, 1485.558, 1551.6947, 920.12726]
2025-09-12 10:13:30,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [200.0, 183.0, 287.0, 169.0, 378.0, 403.0, 298.0, 288.0, 288.0, 172.0]
2025-09-12 10:13:30,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 13 minutes, 27 seconds)
2025-09-12 10:27:52,912 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:27:52,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:28:53,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1165.59412 ± 470.608
2025-09-12 10:28:53,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [766.8902, 674.1395, 1100.845, 1583.2344, 139.99734, 1377.0532, 1281.8916, 1628.9789, 1632.1477, 1470.7638]
2025-09-12 10:28:53,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [140.0, 130.0, 235.0, 308.0, 27.0, 258.0, 230.0, 323.0, 322.0, 267.0]
2025-09-12 10:28:53,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 62/100 (estimated time remaining: 10 hours, 23 seconds)
2025-09-12 10:43:10,065 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:43:10,084 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:44:24,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1447.69250 ± 1347.149
2025-09-12 10:44:24,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1483.0859, 853.07697, 1774.5763, 491.89246, 1652.9451, 802.68835, 1059.5406, 5218.2046, 1022.2431, 118.670876]
2025-09-12 10:44:24,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [289.0, 158.0, 342.0, 90.0, 303.0, 148.0, 214.0, 963.0, 193.0, 23.0]
2025-09-12 10:44:24,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 63/100 (estimated time remaining: 9 hours, 45 minutes, 15 seconds)
2025-09-12 10:58:15,630 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:58:15,657 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:59:53,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1894.45532 ± 1184.563
2025-09-12 10:59:53,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1591.8611, 1293.9934, 3655.9888, 266.98566, 1212.7391, 3943.55, 747.11646, 1222.9951, 3075.905, 1933.4174]
2025-09-12 10:59:53,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [307.0, 239.0, 682.0, 49.0, 213.0, 729.0, 153.0, 221.0, 564.0, 394.0]
2025-09-12 10:59:53,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1894.46) for latency ExtremeClogL1U23
2025-09-12 10:59:53,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 32 minutes, 2 seconds)
2025-09-12 11:14:52,101 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:14:52,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:16:06,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1478.13452 ± 829.852
2025-09-12 11:16:06,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [673.05804, 1307.0858, 715.79517, 1081.4213, 386.3444, 1560.3794, 1511.2867, 2008.4277, 3342.3677, 2195.1802]
2025-09-12 11:16:06,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [120.0, 242.0, 151.0, 200.0, 85.0, 317.0, 275.0, 359.0, 595.0, 405.0]
2025-09-12 11:16:06,898 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 18 minutes, 6 seconds)
2025-09-12 11:29:45,760 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:29:45,762 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:30:45,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1165.65540 ± 675.696
2025-09-12 11:30:45,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2743.22, 1983.5511, 752.41956, 989.0295, 411.3321, 504.79706, 838.8401, 1392.1829, 1014.4221, 1026.7601]
2025-09-12 11:30:45,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [493.0, 377.0, 140.0, 186.0, 73.0, 95.0, 154.0, 269.0, 206.0, 186.0]
2025-09-12 11:30:45,412 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 66/100 (estimated time remaining: 9 hours, 45 seconds)
2025-09-12 11:44:38,004 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:44:38,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:45:55,476 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1525.48853 ± 1274.216
2025-09-12 11:45:55,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1570.6237, 1973.6067, 2845.301, 111.41828, 172.2723, 1028.2614, 4386.104, 251.359, 1011.9698, 1903.97]
2025-09-12 11:45:55,479 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [288.0, 368.0, 539.0, 22.0, 33.0, 192.0, 813.0, 48.0, 188.0, 357.0]
2025-09-12 11:45:55,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 67/100 (estimated time remaining: 8 hours, 43 minutes, 49 seconds)
2025-09-12 12:00:02,966 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:00:02,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:01:32,378 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1793.01343 ± 1403.123
2025-09-12 12:01:32,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1164.6583, 2624.7957, 1738.3678, 1951.8347, 5127.732, 1432.8073, 2814.7407, 502.1208, 95.134766, 477.94302]
2025-09-12 12:01:32,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [219.0, 501.0, 317.0, 362.0, 907.0, 267.0, 512.0, 99.0, 19.0, 97.0]
2025-09-12 12:01:32,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 29 minutes, 1 second)
2025-09-12 12:15:47,234 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:15:47,237 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:17:20,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1785.58032 ± 904.309
2025-09-12 12:17:20,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1280.8646, 3799.9954, 1884.2373, 2758.431, 1820.1555, 749.19464, 2316.8247, 989.3561, 964.5974, 1292.1465]
2025-09-12 12:17:20,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [246.0, 719.0, 365.0, 532.0, 339.0, 135.0, 446.0, 180.0, 176.0, 243.0]
2025-09-12 12:17:20,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 15 minutes, 41 seconds)
2025-09-12 12:32:04,788 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:32:04,790 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:33:27,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1644.65063 ± 1152.840
2025-09-12 12:33:27,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3556.29, 3633.1672, 1480.8894, 2426.523, 1184.4968, 1323.5485, 1625.0375, 550.7125, 545.619, 120.221954]
2025-09-12 12:33:27,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [665.0, 692.0, 275.0, 442.0, 216.0, 238.0, 300.0, 101.0, 106.0, 23.0]
2025-09-12 12:33:27,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 70/100 (estimated time remaining: 7 hours, 59 minutes, 32 seconds)
2025-09-12 12:47:13,276 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:47:13,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:48:54,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2036.49219 ± 1520.883
2025-09-12 12:48:54,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1174.7406, 2854.666, 1258.5225, 132.87007, 1660.092, 112.97388, 1915.4648, 3330.8936, 5469.041, 2455.6572]
2025-09-12 12:48:54,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [214.0, 510.0, 235.0, 26.0, 323.0, 22.0, 341.0, 615.0, 1000.0, 462.0]
2025-09-12 12:48:54,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2036.49) for latency ExtremeClogL1U23
2025-09-12 12:48:54,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 48 minutes, 56 seconds)
2025-09-12 13:03:14,983 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:03:14,988 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:04:18,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1250.43445 ± 507.038
2025-09-12 13:04:18,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1704.6279, 646.1529, 1112.1604, 702.4765, 505.47272, 1126.5366, 1540.0093, 1766.3684, 1278.9684, 2121.5715]
2025-09-12 13:04:18,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [335.0, 137.0, 202.0, 126.0, 106.0, 217.0, 285.0, 325.0, 233.0, 382.0]
2025-09-12 13:04:18,365 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 72/100 (estimated time remaining: 7 hours, 34 minutes, 36 seconds)
2025-09-12 13:18:22,692 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:18:22,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:19:45,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1636.08765 ± 1501.580
2025-09-12 13:19:45,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [637.1587, 4182.153, 1485.095, 948.0717, 1725.008, 377.5725, 845.49695, 574.2559, 4860.594, 725.47034]
2025-09-12 13:19:45,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [115.0, 759.0, 270.0, 190.0, 343.0, 70.0, 153.0, 103.0, 899.0, 133.0]
2025-09-12 13:19:45,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 18 minutes, 3 seconds)
2025-09-12 13:34:25,180 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:34:25,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:36:07,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2051.65869 ± 1513.400
2025-09-12 13:36:07,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [789.566, 516.7348, 5155.0977, 4529.055, 1534.6138, 1875.6998, 1800.95, 2442.4392, 633.0587, 1239.3702]
2025-09-12 13:36:07,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [141.0, 109.0, 946.0, 808.0, 293.0, 358.0, 336.0, 447.0, 116.0, 222.0]
2025-09-12 13:36:07,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2051.66) for latency ExtremeClogL1U23
2025-09-12 13:36:07,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 74/100 (estimated time remaining: 7 hours, 5 minutes, 25 seconds)
2025-09-12 13:50:18,397 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:50:18,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:51:49,495 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1806.23303 ± 1358.940
2025-09-12 13:51:49,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1187.2268, 1973.0647, 3839.5996, 1624.3197, 2217.4626, 231.49579, 1002.1975, 4537.8203, 1353.6913, 95.451]
2025-09-12 13:51:49,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [215.0, 350.0, 699.0, 290.0, 411.0, 46.0, 200.0, 842.0, 244.0, 19.0]
2025-09-12 13:51:49,519 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 47 minutes, 30 seconds)
2025-09-12 14:05:43,964 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:05:43,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:07:36,006 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2156.22607 ± 1767.512
2025-09-12 14:07:36,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3888.5122, 4077.7341, 5158.3887, 105.74972, 962.0089, 1126.5181, 3602.785, 102.84165, 575.925, 1961.7955]
2025-09-12 14:07:36,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [754.0, 764.0, 971.0, 21.0, 175.0, 202.0, 653.0, 20.0, 104.0, 404.0]
2025-09-12 14:07:36,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2156.23) for latency ExtremeClogL1U23
2025-09-12 14:07:36,052 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 33 minutes, 25 seconds)
2025-09-12 14:21:20,741 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:21:20,744 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:22:47,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1729.02954 ± 1496.721
2025-09-12 14:22:47,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [927.0876, 1053.3912, 96.32836, 1816.8773, 611.2019, 2189.3584, 2969.461, 826.80963, 5548.267, 1251.5115]
2025-09-12 14:22:47,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [175.0, 186.0, 19.0, 351.0, 112.0, 410.0, 561.0, 153.0, 1000.0, 244.0]
2025-09-12 14:22:47,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 16 minutes, 45 seconds)
2025-09-12 14:36:54,268 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:36:54,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:38:28,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1896.93713 ± 1025.855
2025-09-12 14:38:28,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2341.9106, 1676.7571, 628.0277, 2445.076, 107.768906, 3954.645, 1218.0109, 1949.1586, 2550.9336, 2097.0813]
2025-09-12 14:38:28,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [438.0, 291.0, 127.0, 422.0, 21.0, 715.0, 227.0, 369.0, 474.0, 383.0]
2025-09-12 14:38:28,343 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 78/100 (estimated time remaining: 6 hours, 2 minutes, 3 seconds)
2025-09-12 14:52:31,919 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:52:31,921 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:54:57,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2830.88330 ± 1767.630
2025-09-12 14:54:57,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [146.64777, 1644.1971, 4669.6475, 5449.56, 575.9733, 3022.6362, 3347.9802, 5161.444, 1582.715, 2708.0322]
2025-09-12 14:54:57,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [28.0, 328.0, 878.0, 1000.0, 114.0, 556.0, 633.0, 951.0, 301.0, 540.0]
2025-09-12 14:54:57,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2830.88) for latency ExtremeClogL1U23
2025-09-12 14:54:57,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 46 minutes, 50 seconds)
2025-09-12 15:09:03,032 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:09:03,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:10:40,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1969.98657 ± 1767.182
2025-09-12 15:10:40,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1098.123, 2027.6359, 430.2157, 272.6176, 3533.6443, 1718.5428, 129.84659, 5647.912, 779.6001, 4061.729]
2025-09-12 15:10:40,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [207.0, 365.0, 87.0, 53.0, 646.0, 309.0, 25.0, 1000.0, 153.0, 745.0]
2025-09-12 15:10:40,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 31 minutes, 8 seconds)
2025-09-12 15:25:43,999 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:25:44,002 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:27:09,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1726.64526 ± 728.102
2025-09-12 15:27:09,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1878.7592, 2133.5337, 1285.7567, 1331.2291, 1752.6162, 2200.136, 1266.8275, 3369.1724, 458.14383, 1590.2775]
2025-09-12 15:27:09,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [342.0, 387.0, 239.0, 255.0, 335.0, 402.0, 244.0, 602.0, 98.0, 281.0]
2025-09-12 15:27:09,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 18 minutes, 14 seconds)
2025-09-12 15:40:28,952 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:40:28,956 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:42:49,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2864.36035 ± 1605.938
2025-09-12 15:42:49,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5454.1157, 2373.4978, 1607.2458, 1721.1073, 5561.039, 2157.615, 4158.918, 849.5916, 3375.153, 1385.3195]
2025-09-12 15:42:49,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 427.0, 296.0, 305.0, 1000.0, 392.0, 747.0, 154.0, 625.0, 241.0]
2025-09-12 15:42:49,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2864.36) for latency ExtremeClogL1U23
2025-09-12 15:42:49,788 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 82/100 (estimated time remaining: 5 hours, 4 minutes, 7 seconds)
2025-09-12 15:57:40,340 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:57:40,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:59:12,076 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1799.24670 ± 1522.869
2025-09-12 15:59:12,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [904.1969, 718.2846, 2319.8425, 1442.615, 2560.1902, 3001.0664, 141.04436, 5414.369, 1361.9866, 128.8714]
2025-09-12 15:59:12,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [168.0, 159.0, 432.0, 276.0, 489.0, 561.0, 27.0, 1000.0, 251.0, 26.0]
2025-09-12 15:59:12,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 50 minutes, 37 seconds)
2025-09-12 16:13:19,878 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:13:19,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:14:50,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1645.29395 ± 981.863
2025-09-12 16:14:50,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1479.3228, 911.7792, 1333.3759, 1669.1995, 1439.8865, 3926.0505, 1755.2224, 2811.7925, 467.7679, 658.54364]
2025-09-12 16:14:50,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [285.0, 180.0, 260.0, 358.0, 291.0, 791.0, 352.0, 553.0, 101.0, 139.0]
2025-09-12 16:14:50,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 31 minutes, 36 seconds)
2025-09-12 16:28:28,848 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:28:28,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:29:32,087 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1260.51831 ± 1058.199
2025-09-12 16:29:32,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1830.3046, 102.548744, 748.24304, 2688.144, 1225.0781, 1496.3635, 96.80875, 596.9547, 404.4265, 3416.312]
2025-09-12 16:29:32,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [342.0, 20.0, 135.0, 497.0, 222.0, 281.0, 19.0, 117.0, 75.0, 613.0]
2025-09-12 16:29:32,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 12 minutes, 22 seconds)
2025-09-12 16:43:44,499 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:43:44,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:45:24,332 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1967.45251 ± 1575.218
2025-09-12 16:45:24,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [647.206, 144.80879, 4042.5034, 1758.3525, 4673.586, 1806.3022, 3700.018, 147.05614, 2097.5107, 657.1823]
2025-09-12 16:45:24,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [132.0, 28.0, 748.0, 332.0, 872.0, 337.0, 685.0, 28.0, 393.0, 123.0]
2025-09-12 16:45:24,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 54 minutes, 43 seconds)
2025-09-12 16:59:25,493 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:59:25,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:01:13,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2124.90015 ± 1457.836
2025-09-12 17:01:13,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5345.819, 845.66644, 1797.1831, 1575.1633, 157.529, 3397.8193, 473.1286, 2417.1848, 2481.7854, 2757.722]
2025-09-12 17:01:13,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [973.0, 164.0, 341.0, 304.0, 30.0, 633.0, 84.0, 439.0, 469.0, 504.0]
2025-09-12 17:01:13,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 39 minutes, 29 seconds)
2025-09-12 17:15:08,438 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:15:08,441 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:17:02,388 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2258.01904 ± 1801.932
2025-09-12 17:17:02,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [107.049576, 2428.5515, 535.04083, 5458.8774, 2953.8303, 755.14954, 5432.0557, 2268.6567, 1430.9119, 1210.0671]
2025-09-12 17:17:02,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [21.0, 464.0, 107.0, 1000.0, 553.0, 144.0, 1000.0, 441.0, 266.0, 219.0]
2025-09-12 17:17:02,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 22 minutes, 22 seconds)
2025-09-12 17:31:36,183 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:31:36,186 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:33:58,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2829.29614 ± 1659.590
2025-09-12 17:33:58,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [528.4438, 1232.8276, 5369.7886, 5267.888, 2617.277, 4874.521, 2700.3486, 1945.6353, 2359.815, 1396.4159]
2025-09-12 17:33:58,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [100.0, 227.0, 1000.0, 1000.0, 476.0, 927.0, 505.0, 362.0, 427.0, 252.0]
2025-09-12 17:33:58,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 9 minutes, 56 seconds)
2025-09-12 17:47:45,013 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:47:45,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:49:23,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1983.67773 ± 1507.212
2025-09-12 17:49:23,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1807.3604, 926.2914, 5153.628, 4107.813, 693.89215, 884.9077, 84.67314, 2480.7559, 1497.5752, 2199.8784]
2025-09-12 17:49:23,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [325.0, 176.0, 968.0, 739.0, 130.0, 164.0, 17.0, 466.0, 270.0, 396.0]
2025-09-12 17:49:23,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 55 minutes, 41 seconds)
2025-09-12 18:04:00,046 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:04:00,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:06:17,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2690.39722 ± 1485.224
2025-09-12 18:06:17,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [4057.467, 5402.216, 1831.7314, 3077.9775, 3649.853, 591.03314, 1078.8201, 3389.296, 834.4451, 2991.133]
2025-09-12 18:06:17,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [742.0, 1000.0, 340.0, 579.0, 689.0, 111.0, 211.0, 641.0, 152.0, 548.0]
2025-09-12 18:06:17,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 41 minutes, 47 seconds)
2025-09-12 18:19:57,268 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:19:57,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:23:03,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 3638.29419 ± 1849.698
2025-09-12 18:23:03,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [4060.2183, 4970.757, 5314.717, 5288.797, 1178.5579, 5240.845, 5480.4443, 2884.5867, 1112.4291, 851.5882]
2025-09-12 18:23:03,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [755.0, 949.0, 1000.0, 1000.0, 231.0, 968.0, 1000.0, 564.0, 206.0, 163.0]
2025-09-12 18:23:03,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (3638.29) for latency ExtremeClogL1U23
2025-09-12 18:23:03,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 27 minutes, 18 seconds)
2025-09-12 18:37:51,220 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:37:51,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:40:17,657 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2967.82617 ± 1295.044
2025-09-12 18:40:17,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [893.6563, 3515.234, 3803.822, 3556.2344, 3778.2617, 757.1197, 2560.2434, 4818.78, 3994.078, 2000.8328]
2025-09-12 18:40:17,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [173.0, 650.0, 681.0, 640.0, 699.0, 134.0, 456.0, 863.0, 714.0, 393.0]
2025-09-12 18:40:17,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 13 minutes, 12 seconds)
2025-09-12 18:53:45,165 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:53:45,168 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:55:12,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1737.39880 ± 1419.040
2025-09-12 18:55:12,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2240.6733, 2493.5862, 1755.908, 764.99945, 662.74945, 1873.7349, 102.1814, 1526.5979, 589.12415, 5364.433]
2025-09-12 18:55:12,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [404.0, 457.0, 324.0, 139.0, 112.0, 345.0, 20.0, 278.0, 119.0, 1000.0]
2025-09-12 18:55:12,084 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 53 minutes, 42 seconds)
2025-09-12 19:09:50,628 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:09:50,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:11:31,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1941.60388 ± 1110.351
2025-09-12 19:11:31,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [628.27625, 2092.176, 4140.051, 2461.6348, 1267.0804, 3627.4573, 1084.4706, 1916.8097, 986.11694, 1211.9666]
2025-09-12 19:11:31,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [124.0, 399.0, 778.0, 461.0, 242.0, 706.0, 201.0, 368.0, 190.0, 239.0]
2025-09-12 19:11:31,521 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 38 minutes, 33 seconds)
2025-09-12 19:25:17,044 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:25:17,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:27:18,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2406.94043 ± 1270.646
2025-09-12 19:27:18,525 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1242.5446, 1620.805, 5486.0303, 3649.2144, 1496.3416, 1957.54, 1751.028, 2307.529, 3185.7344, 1372.6383]
2025-09-12 19:27:18,525 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [231.0, 319.0, 1000.0, 670.0, 281.0, 351.0, 309.0, 416.0, 578.0, 249.0]
2025-09-12 19:27:18,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 21 minutes)
2025-09-12 19:41:29,474 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:41:29,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:43:30,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2416.59766 ± 2046.841
2025-09-12 19:43:30,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5551.2236, 4086.078, 5553.885, 932.4094, 3857.0496, 113.70093, 133.55145, 747.2608, 1089.8191, 2101.0007]
2025-09-12 19:43:30,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 741.0, 1000.0, 169.0, 691.0, 22.0, 26.0, 140.0, 205.0, 389.0]
2025-09-12 19:43:30,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 4 minutes, 21 seconds)
2025-09-12 19:57:41,390 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:57:41,394 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:59:13,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1779.18872 ± 1231.173
2025-09-12 19:59:13,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2676.551, 113.076706, 1533.3818, 4362.4053, 1349.1597, 145.65474, 2096.171, 2871.7925, 949.4843, 1694.21]
2025-09-12 19:59:13,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [501.0, 22.0, 287.0, 830.0, 254.0, 28.0, 397.0, 557.0, 191.0, 340.0]
2025-09-12 19:59:13,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 98/100 (estimated time remaining: 47 minutes, 21 seconds)
2025-09-12 20:13:09,920 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:13:09,922 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:14:43,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1798.07104 ± 1505.427
2025-09-12 20:14:43,447 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1924.8416, 1599.1859, 861.66223, 5104.843, 1569.263, 2423.4724, 189.87889, 214.2355, 444.48032, 3648.8477]
2025-09-12 20:14:43,447 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [371.0, 303.0, 160.0, 937.0, 301.0, 459.0, 37.0, 43.0, 84.0, 673.0]
2025-09-12 20:14:43,456 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 99/100 (estimated time remaining: 31 minutes, 48 seconds)
2025-09-12 20:28:58,165 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:28:58,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:30:45,616 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2140.21704 ± 1675.307
2025-09-12 20:30:45,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [993.7261, 3675.2046, 4030.63, 1180.7213, 5514.242, 613.2873, 351.19205, 1621.444, 2794.291, 627.4332]
2025-09-12 20:30:45,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [181.0, 673.0, 731.0, 219.0, 1000.0, 113.0, 64.0, 306.0, 526.0, 117.0]
2025-09-12 20:30:45,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 100/100 (estimated time remaining: 15 minutes, 50 seconds)
2025-09-12 20:45:51,934 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:45:51,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:47:05,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1438.29370 ± 1301.613
2025-09-12 20:47:05,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [586.01624, 3605.1008, 651.46924, 3642.0984, 110.17261, 404.3212, 1822.2511, 994.72797, 2470.7214, 96.05814]
2025-09-12 20:47:05,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [114.0, 683.0, 120.0, 670.0, 22.0, 78.0, 350.0, 185.0, 454.0, 19.0]
2025-09-12 20:47:05,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1251 [DEBUG]: Training session finished
