2025-09-13 07:52:07,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc7/noiseperc10-humanoid/ExtremeSparseL4U32-mbpac-highdim-memdelay
2025-09-13 07:52:07,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc7/noiseperc10-humanoid/ExtremeSparseL4U32-mbpac-highdim-memdelay
2025-09-13 07:52:07,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeSparseL4U32': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x153acffd8dd0>}
2025-09-13 07:52:07,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1111 [DEBUG]: using device: cuda
2025-09-13 07:52:07,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1133 [INFO]: Creating new trainer
2025-09-13 07:52:07,748 baseline-mbpac-noiseperc10-humanoid:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (tanh_refit): NNTanhRefit(
    scale: tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000,
             0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000]]), shift: tensor([[-0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000]])
  )
)
2025-09-13 07:52:07,748 baseline-mbpac-noiseperc10-humanoid:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=393, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-13 07:52:07,759 baseline-mbpac-noiseperc10-humanoid:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=376, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-13 07:52:11,238 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1194 [DEBUG]: Starting training session...
2025-09-13 07:52:11,239 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 1/100
2025-09-13 08:04:04,037 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 08:04:04,044 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 08:04:23,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 320.40335 ± 147.981
2025-09-13 08:04:23,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [380.8886, 364.48676, 367.7191, 385.4785, 154.26715, 143.81587, 576.96277, 145.42859, 501.72635, 183.25978]
2025-09-13 08:04:23,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [77.0, 74.0, 73.0, 81.0, 30.0, 28.0, 125.0, 28.0, 101.0, 40.0]
2025-09-13 08:04:23,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (320.40) for latency ExtremeSparseL4U32
2025-09-13 08:04:23,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 2/100 (estimated time remaining: 20 hours, 8 minutes, 48 seconds)
2025-09-13 08:15:57,869 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 08:15:57,875 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 08:16:17,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 352.37143 ± 45.029
2025-09-13 08:16:17,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [317.46344, 352.25708, 342.55814, 316.5997, 353.9894, 317.6484, 349.6986, 333.96628, 360.4229, 479.11026]
2025-09-13 08:16:17,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [60.0, 68.0, 64.0, 58.0, 66.0, 66.0, 65.0, 62.0, 66.0, 90.0]
2025-09-13 08:16:17,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (352.37) for latency ExtremeSparseL4U32
2025-09-13 08:16:17,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 3/100 (estimated time remaining: 19 hours, 41 minutes, 23 seconds)
2025-09-13 08:27:55,298 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 08:27:55,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 08:28:13,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 306.43903 ± 128.697
2025-09-13 08:28:13,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [256.51828, 586.70056, 319.25653, 312.02112, 329.19617, 395.44717, 107.915886, 130.48822, 365.80612, 261.03995]
2025-09-13 08:28:13,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [55.0, 110.0, 59.0, 57.0, 61.0, 80.0, 21.0, 25.0, 67.0, 55.0]
2025-09-13 08:28:13,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 4/100 (estimated time remaining: 19 hours, 24 minutes, 59 seconds)
2025-09-13 08:39:53,542 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 08:39:53,548 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 08:40:25,500 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 533.36780 ± 144.407
2025-09-13 08:40:25,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [625.5392, 316.52448, 535.1036, 361.266, 572.2972, 825.5597, 433.5791, 435.48004, 658.36646, 569.96246]
2025-09-13 08:40:25,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [125.0, 64.0, 99.0, 68.0, 108.0, 165.0, 80.0, 97.0, 125.0, 109.0]
2025-09-13 08:40:25,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (533.37) for latency ExtremeSparseL4U32
2025-09-13 08:40:25,513 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 5/100 (estimated time remaining: 19 hours, 17 minutes, 42 seconds)
2025-09-13 08:51:59,320 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 08:51:59,327 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 08:52:27,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 467.17773 ± 119.560
2025-09-13 08:52:27,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [371.81845, 421.16644, 478.8477, 457.54163, 378.98953, 331.7568, 483.63232, 548.21277, 777.8393, 421.97244]
2025-09-13 08:52:27,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [82.0, 77.0, 102.0, 84.0, 69.0, 70.0, 90.0, 104.0, 164.0, 95.0]
2025-09-13 08:52:27,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 6/100 (estimated time remaining: 19 hours, 5 minutes, 10 seconds)
2025-09-13 09:04:09,897 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 09:04:09,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 09:04:41,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 527.69415 ± 107.381
2025-09-13 09:04:41,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [488.53152, 630.9552, 433.66345, 579.5428, 726.9746, 494.38162, 556.8898, 322.8599, 584.3108, 458.8319]
2025-09-13 09:04:41,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [102.0, 134.0, 80.0, 108.0, 140.0, 94.0, 103.0, 66.0, 111.0, 101.0]
2025-09-13 09:04:41,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 7/100 (estimated time remaining: 18 hours, 53 minutes, 36 seconds)
2025-09-13 09:16:18,170 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 09:16:18,176 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 09:16:43,484 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 442.78134 ± 88.916
2025-09-13 09:16:43,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [412.00745, 444.96655, 498.43835, 371.67883, 393.89682, 565.3572, 369.47345, 630.2832, 399.68973, 342.0215]
2025-09-13 09:16:43,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [78.0, 81.0, 94.0, 68.0, 75.0, 110.0, 69.0, 134.0, 74.0, 64.0]
2025-09-13 09:16:43,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 8/100 (estimated time remaining: 18 hours, 43 minutes, 57 seconds)
2025-09-13 09:28:16,333 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 09:28:16,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 09:28:45,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 495.96671 ± 79.818
2025-09-13 09:28:45,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [516.1203, 517.85333, 454.11493, 481.28976, 461.52377, 425.77347, 706.0092, 393.23407, 502.63855, 501.1095]
2025-09-13 09:28:45,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [95.0, 99.0, 99.0, 88.0, 87.0, 80.0, 137.0, 85.0, 110.0, 94.0]
2025-09-13 09:28:45,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 9/100 (estimated time remaining: 18 hours, 33 minutes, 58 seconds)
2025-09-13 09:40:35,808 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 09:40:35,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 09:41:07,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 548.51648 ± 106.693
2025-09-13 09:41:07,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [558.54083, 661.2693, 393.5198, 395.4993, 557.6315, 660.31934, 487.28302, 456.69394, 708.1509, 606.2572]
2025-09-13 09:41:07,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [121.0, 122.0, 73.0, 73.0, 102.0, 124.0, 90.0, 84.0, 135.0, 114.0]
2025-09-13 09:41:07,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (548.52) for latency ExtremeSparseL4U32
2025-09-13 09:41:07,718 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 10/100 (estimated time remaining: 18 hours, 24 minutes, 48 seconds)
2025-09-13 09:52:37,289 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 09:52:37,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 09:53:10,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 554.15002 ± 100.773
2025-09-13 09:53:10,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [469.16418, 562.5985, 498.68396, 477.84195, 565.5318, 729.8618, 686.87354, 535.1159, 381.22403, 634.6047]
2025-09-13 09:53:10,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [92.0, 115.0, 94.0, 87.0, 103.0, 150.0, 141.0, 98.0, 82.0, 119.0]
2025-09-13 09:53:10,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (554.15) for latency ExtremeSparseL4U32
2025-09-13 09:53:10,670 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 11/100 (estimated time remaining: 18 hours, 12 minutes, 55 seconds)
2025-09-13 10:04:45,558 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:04:45,565 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:05:15,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 530.99768 ± 244.006
2025-09-13 10:05:15,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [166.22655, 140.35481, 640.87054, 398.47168, 696.278, 615.9908, 706.26385, 366.75647, 954.21, 624.5541]
2025-09-13 10:05:15,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [32.0, 27.0, 118.0, 76.0, 136.0, 108.0, 131.0, 70.0, 179.0, 117.0]
2025-09-13 10:05:15,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 12/100 (estimated time remaining: 17 hours, 57 minutes, 54 seconds)
2025-09-13 10:16:50,979 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:16:51,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:17:19,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 479.39560 ± 122.886
2025-09-13 10:17:19,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [489.08844, 475.58475, 564.10474, 497.57425, 515.08966, 638.8392, 523.7305, 140.78677, 478.50784, 470.65012]
2025-09-13 10:17:19,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [89.0, 95.0, 113.0, 104.0, 94.0, 134.0, 96.0, 27.0, 89.0, 87.0]
2025-09-13 10:17:19,456 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 13/100 (estimated time remaining: 17 hours, 46 minutes, 32 seconds)
2025-09-13 10:29:01,695 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:29:01,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:29:34,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 554.90448 ± 183.590
2025-09-13 10:29:34,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [541.8195, 554.3315, 639.342, 482.5236, 694.3424, 349.0945, 157.1114, 690.69214, 594.21234, 845.5756]
2025-09-13 10:29:34,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [98.0, 106.0, 118.0, 89.0, 137.0, 69.0, 30.0, 124.0, 130.0, 161.0]
2025-09-13 10:29:34,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (554.90) for latency ExtremeSparseL4U32
2025-09-13 10:29:34,063 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 14/100 (estimated time remaining: 17 hours, 38 minutes, 3 seconds)
2025-09-13 10:41:10,256 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:41:10,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:41:44,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 603.56653 ± 111.004
2025-09-13 10:41:44,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [492.87344, 594.94086, 628.32904, 516.7212, 748.20825, 799.06165, 488.3752, 448.3482, 659.6009, 659.2062]
2025-09-13 10:41:44,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [94.0, 119.0, 117.0, 93.0, 140.0, 147.0, 86.0, 81.0, 123.0, 126.0]
2025-09-13 10:41:44,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (603.57) for latency ExtremeSparseL4U32
2025-09-13 10:41:44,053 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 15/100 (estimated time remaining: 17 hours, 22 minutes, 24 seconds)
2025-09-13 10:53:23,688 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:53:23,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:53:53,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 524.52771 ± 156.391
2025-09-13 10:53:53,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [468.78104, 615.40247, 582.9526, 490.3102, 511.44708, 715.97626, 114.29892, 499.39224, 582.41785, 664.2985]
2025-09-13 10:53:53,928 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [101.0, 114.0, 103.0, 92.0, 109.0, 131.0, 22.0, 91.0, 114.0, 119.0]
2025-09-13 10:53:53,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 16/100 (estimated time remaining: 17 hours, 12 minutes, 16 seconds)
2025-09-13 11:05:34,401 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:05:34,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:06:12,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 633.98334 ± 344.028
2025-09-13 11:06:12,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [145.93245, 407.19977, 418.94012, 630.6699, 521.4491, 1331.2177, 522.82794, 1207.596, 599.24866, 554.7521]
2025-09-13 11:06:12,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [28.0, 89.0, 91.0, 115.0, 97.0, 267.0, 111.0, 231.0, 112.0, 103.0]
2025-09-13 11:06:12,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (633.98) for latency ExtremeSparseL4U32
2025-09-13 11:06:12,322 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 17/100 (estimated time remaining: 17 hours, 4 minutes)
2025-09-13 11:17:44,449 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:17:44,456 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:18:14,666 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 540.44165 ± 220.399
2025-09-13 11:18:14,666 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [733.3475, 622.92737, 807.27356, 426.9464, 136.23085, 475.05045, 527.8288, 323.6727, 906.14374, 444.99496]
2025-09-13 11:18:14,666 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [143.0, 113.0, 142.0, 79.0, 26.0, 87.0, 97.0, 61.0, 174.0, 87.0]
2025-09-13 11:18:14,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 18/100 (estimated time remaining: 16 hours, 51 minutes, 16 seconds)
2025-09-13 11:29:50,891 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:29:50,899 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:30:32,643 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 732.15491 ± 149.495
2025-09-13 11:30:32,644 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [433.42072, 947.54333, 578.7333, 844.23883, 767.60004, 640.00854, 807.9421, 822.6266, 621.4752, 857.9606]
2025-09-13 11:30:32,644 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [80.0, 179.0, 114.0, 160.0, 141.0, 120.0, 171.0, 146.0, 115.0, 158.0]
2025-09-13 11:30:32,644 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (732.15) for latency ExtremeSparseL4U32
2025-09-13 11:30:32,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 19/100 (estimated time remaining: 16 hours, 40 minutes)
2025-09-13 11:42:14,454 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:42:14,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:42:45,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 556.22076 ± 233.792
2025-09-13 11:42:45,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [545.9465, 938.8621, 171.49515, 162.0499, 657.34, 545.1414, 816.31476, 625.2188, 466.71152, 633.1275]
2025-09-13 11:42:45,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [102.0, 166.0, 33.0, 31.0, 117.0, 99.0, 142.0, 126.0, 86.0, 120.0]
2025-09-13 11:42:45,168 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 20/100 (estimated time remaining: 16 hours, 28 minutes, 30 seconds)
2025-09-13 11:54:23,301 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:54:23,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:55:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 758.97351 ± 310.025
2025-09-13 11:55:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1395.9326, 558.0304, 392.31296, 616.8425, 573.0564, 1098.6523, 386.14798, 764.46735, 778.7681, 1025.5243]
2025-09-13 11:55:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [258.0, 116.0, 77.0, 115.0, 107.0, 208.0, 85.0, 159.0, 145.0, 191.0]
2025-09-13 11:55:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (758.97) for latency ExtremeSparseL4U32
2025-09-13 11:55:08,422 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 19 minutes, 51 seconds)
2025-09-13 12:06:43,238 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:06:43,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:07:21,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 663.13928 ± 210.534
2025-09-13 12:07:21,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [544.91833, 514.99536, 726.7596, 534.51404, 718.7763, 518.9984, 385.83734, 608.1142, 1035.3936, 1043.0859]
2025-09-13 12:07:21,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [105.0, 101.0, 137.0, 100.0, 130.0, 100.0, 79.0, 115.0, 205.0, 197.0]
2025-09-13 12:07:21,107 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 22/100 (estimated time remaining: 16 hours, 6 minutes, 6 seconds)
2025-09-13 12:18:56,722 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:18:56,729 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:19:32,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 623.46783 ± 345.457
2025-09-13 12:19:32,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [512.8024, 170.95941, 114.23331, 650.1815, 1041.193, 1049.6967, 417.88712, 354.60037, 869.2981, 1053.8265]
2025-09-13 12:19:32,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [95.0, 33.0, 22.0, 136.0, 199.0, 194.0, 77.0, 66.0, 158.0, 198.0]
2025-09-13 12:19:32,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 23/100 (estimated time remaining: 15 hours, 56 minutes, 7 seconds)
2025-09-13 12:31:03,900 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:31:03,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:31:41,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 662.66248 ± 187.930
2025-09-13 12:31:41,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [416.58374, 842.97833, 648.26654, 445.7618, 616.6915, 592.562, 575.7923, 984.91907, 553.25, 949.8197]
2025-09-13 12:31:41,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [79.0, 149.0, 113.0, 81.0, 114.0, 109.0, 105.0, 185.0, 101.0, 197.0]
2025-09-13 12:31:41,465 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 24/100 (estimated time remaining: 15 hours, 41 minutes, 39 seconds)
2025-09-13 12:43:27,353 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:43:27,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:43:58,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 564.16394 ± 294.659
2025-09-13 12:43:58,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [473.99054, 613.4929, 1135.1235, 774.7474, 662.0384, 371.24493, 621.6641, 108.62389, 755.83734, 124.875916]
2025-09-13 12:43:58,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [102.0, 109.0, 203.0, 142.0, 119.0, 69.0, 115.0, 21.0, 151.0, 24.0]
2025-09-13 12:43:58,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 25/100 (estimated time remaining: 15 hours, 30 minutes, 40 seconds)
2025-09-13 12:55:34,932 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:55:34,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:56:09,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 606.16321 ± 302.537
2025-09-13 12:56:09,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [485.18835, 1030.306, 987.4164, 862.0129, 179.61768, 155.9951, 424.50198, 815.94684, 411.77493, 708.87213]
2025-09-13 12:56:09,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [88.0, 185.0, 190.0, 162.0, 35.0, 30.0, 78.0, 155.0, 76.0, 134.0]
2025-09-13 12:56:09,989 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 26/100 (estimated time remaining: 15 hours, 15 minutes, 23 seconds)
2025-09-13 13:07:49,059 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:07:49,067 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:08:44,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 947.93079 ± 423.160
2025-09-13 13:08:44,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [436.37155, 1068.5005, 879.63885, 870.93066, 617.08923, 1258.3043, 841.2511, 796.1224, 673.16394, 2037.9346]
2025-09-13 13:08:44,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [87.0, 203.0, 173.0, 156.0, 114.0, 237.0, 152.0, 164.0, 129.0, 401.0]
2025-09-13 13:08:44,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (947.93) for latency ExtremeSparseL4U32
2025-09-13 13:08:44,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 27/100 (estimated time remaining: 15 hours, 8 minutes, 29 seconds)
2025-09-13 13:20:09,778 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:20:09,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:20:51,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 754.55371 ± 263.189
2025-09-13 13:20:51,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [181.87585, 982.5837, 927.21045, 674.9609, 394.38077, 1006.6152, 762.90137, 704.4927, 945.7143, 964.8018]
2025-09-13 13:20:51,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [35.0, 181.0, 187.0, 123.0, 75.0, 186.0, 144.0, 131.0, 176.0, 173.0]
2025-09-13 13:20:51,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 28/100 (estimated time remaining: 14 hours, 55 minutes, 26 seconds)
2025-09-13 13:32:34,972 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:32:34,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:33:20,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 776.79791 ± 392.870
2025-09-13 13:33:20,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [167.3932, 594.25555, 1425.7085, 645.9794, 620.84894, 1525.9379, 458.30615, 787.85516, 852.75256, 688.94135]
2025-09-13 13:33:20,557 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [32.0, 113.0, 284.0, 121.0, 125.0, 295.0, 83.0, 150.0, 163.0, 131.0]
2025-09-13 13:33:20,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 29/100 (estimated time remaining: 14 hours, 47 minutes, 47 seconds)
2025-09-13 13:44:54,005 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:44:54,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:45:26,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 546.59741 ± 239.730
2025-09-13 13:45:26,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [468.30893, 472.95853, 1069.4524, 670.6271, 277.69714, 521.48694, 452.30356, 713.93225, 661.6248, 157.58255]
2025-09-13 13:45:26,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [90.0, 98.0, 196.0, 137.0, 52.0, 107.0, 84.0, 137.0, 122.0, 30.0]
2025-09-13 13:45:26,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 30/100 (estimated time remaining: 14 hours, 32 minutes, 42 seconds)
2025-09-13 13:57:09,684 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:57:09,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:58:17,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1213.55676 ± 500.978
2025-09-13 13:58:17,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [843.9272, 579.8006, 1948.9502, 1403.4235, 787.0481, 1358.1875, 996.6158, 1518.95, 637.0699, 2061.5947]
2025-09-13 13:58:17,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [157.0, 109.0, 372.0, 269.0, 143.0, 247.0, 189.0, 270.0, 127.0, 379.0]
2025-09-13 13:58:17,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1213.56) for latency ExtremeSparseL4U32
2025-09-13 13:58:17,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 31/100 (estimated time remaining: 14 hours, 29 minutes, 40 seconds)
2025-09-13 14:09:47,297 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:09:47,304 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:10:33,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 823.12732 ± 334.780
2025-09-13 14:10:33,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1168.8519, 1169.1127, 822.3693, 756.71814, 927.2243, 1012.16943, 373.1267, 738.2611, 1154.9698, 108.46943]
2025-09-13 14:10:33,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [215.0, 225.0, 157.0, 137.0, 168.0, 182.0, 69.0, 137.0, 213.0, 21.0]
2025-09-13 14:10:33,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 32/100 (estimated time remaining: 14 hours, 13 minutes, 12 seconds)
2025-09-13 14:22:18,820 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:22:18,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:22:55,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 646.19208 ± 407.986
2025-09-13 14:22:55,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1098.0659, 712.1683, 488.9965, 732.8938, 155.93323, 109.02039, 943.22943, 1452.2717, 484.83176, 284.51044]
2025-09-13 14:22:55,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [206.0, 136.0, 97.0, 140.0, 30.0, 21.0, 178.0, 283.0, 91.0, 51.0]
2025-09-13 14:22:55,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 33/100 (estimated time remaining: 14 hours, 4 minutes, 2 seconds)
2025-09-13 14:34:40,987 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:34:40,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:35:40,974 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1053.92163 ± 469.246
2025-09-13 14:35:40,974 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1245.8567, 1157.6526, 2119.7021, 1402.981, 1267.8816, 935.5659, 595.81366, 653.33417, 486.22473, 674.2049]
2025-09-13 14:35:40,974 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [237.0, 220.0, 389.0, 258.0, 235.0, 183.0, 128.0, 118.0, 91.0, 136.0]
2025-09-13 14:35:40,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 34/100 (estimated time remaining: 13 hours, 55 minutes, 21 seconds)
2025-09-13 14:47:06,309 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:47:06,317 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:47:52,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 805.14838 ± 278.865
2025-09-13 14:47:52,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [965.3738, 909.9559, 1068.6063, 880.35455, 663.4092, 1129.1377, 144.69357, 1008.39453, 709.0872, 572.4708]
2025-09-13 14:47:52,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [188.0, 179.0, 199.0, 172.0, 125.0, 225.0, 28.0, 199.0, 131.0, 103.0]
2025-09-13 14:47:52,956 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 35/100 (estimated time remaining: 13 hours, 44 minutes, 14 seconds)
2025-09-13 14:59:35,805 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:59:35,812 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:00:21,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 779.50061 ± 261.763
2025-09-13 15:00:21,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1108.2887, 829.78516, 1163.483, 533.7616, 861.0598, 468.33234, 728.98145, 422.25958, 1090.4073, 588.6481]
2025-09-13 15:00:21,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [207.0, 171.0, 225.0, 114.0, 158.0, 102.0, 145.0, 77.0, 197.0, 120.0]
2025-09-13 15:00:21,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 36/100 (estimated time remaining: 13 hours, 26 minutes, 59 seconds)
2025-09-13 15:12:09,419 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:12:09,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:13:06,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 988.13251 ± 271.174
2025-09-13 15:13:06,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1224.7894, 903.35016, 713.5597, 1443.8712, 1210.9248, 933.04156, 551.76294, 1174.3226, 662.94904, 1062.7535]
2025-09-13 15:13:06,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [245.0, 179.0, 144.0, 274.0, 226.0, 174.0, 105.0, 229.0, 131.0, 192.0]
2025-09-13 15:13:06,179 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 37/100 (estimated time remaining: 13 hours, 20 minutes, 30 seconds)
2025-09-13 15:24:50,558 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:24:50,565 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:25:38,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 852.92560 ± 656.707
2025-09-13 15:25:38,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [644.0178, 1400.3832, 389.16476, 130.84009, 129.28311, 583.58624, 827.3301, 761.6303, 1234.7037, 2428.3164]
2025-09-13 15:25:38,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [124.0, 258.0, 71.0, 25.0, 25.0, 105.0, 151.0, 146.0, 233.0, 472.0]
2025-09-13 15:25:38,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 38/100 (estimated time remaining: 13 hours, 10 minutes, 13 seconds)
2025-09-13 15:36:58,486 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:36:58,494 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:37:51,360 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 918.59631 ± 512.021
2025-09-13 15:37:51,360 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [984.33636, 1088.2645, 1624.9813, 107.9123, 870.5273, 1668.0432, 886.3158, 323.69684, 1312.1066, 319.77817]
2025-09-13 15:37:51,360 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [195.0, 212.0, 300.0, 21.0, 157.0, 327.0, 178.0, 62.0, 240.0, 59.0]
2025-09-13 15:37:51,370 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 39/100 (estimated time remaining: 12 hours, 50 minutes, 56 seconds)
2025-09-13 15:49:42,011 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:49:42,020 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:50:34,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 894.23840 ± 408.476
2025-09-13 15:50:34,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [701.20935, 1094.3177, 1052.8209, 1250.9552, 621.8258, 1168.8132, 981.98254, 343.22086, 165.03185, 1562.2067]
2025-09-13 15:50:34,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [144.0, 197.0, 204.0, 246.0, 128.0, 225.0, 185.0, 63.0, 32.0, 289.0]
2025-09-13 15:50:34,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 40/100 (estimated time remaining: 12 hours, 44 minutes, 47 seconds)
2025-09-13 16:02:17,869 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:02:17,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:03:09,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 918.31873 ± 426.759
2025-09-13 16:03:09,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [700.5879, 654.55756, 437.31525, 1249.3093, 788.8463, 2002.3672, 1040.8312, 525.376, 874.4588, 909.5383]
2025-09-13 16:03:09,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [146.0, 125.0, 79.0, 234.0, 148.0, 364.0, 196.0, 96.0, 156.0, 176.0]
2025-09-13 16:03:09,654 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 41/100 (estimated time remaining: 12 hours, 33 minutes, 34 seconds)
2025-09-13 16:14:45,340 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:14:45,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:15:35,906 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 906.54315 ± 378.206
2025-09-13 16:15:35,906 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [775.7815, 833.66724, 605.3016, 175.58786, 1625.6722, 1372.3351, 761.96387, 1052.813, 879.53894, 982.77026]
2025-09-13 16:15:35,906 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [143.0, 147.0, 112.0, 34.0, 299.0, 251.0, 143.0, 194.0, 160.0, 183.0]
2025-09-13 16:15:35,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 42/100 (estimated time remaining: 12 hours, 17 minutes, 26 seconds)
2025-09-13 16:27:00,205 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:27:00,213 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:27:44,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 770.26593 ± 559.267
2025-09-13 16:27:44,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1360.5721, 1227.9747, 1559.7301, 119.75611, 144.56747, 119.3925, 1366.0393, 731.0951, 885.16003, 188.37134]
2025-09-13 16:27:44,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [255.0, 236.0, 298.0, 23.0, 28.0, 23.0, 263.0, 140.0, 159.0, 36.0]
2025-09-13 16:27:44,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 43/100 (estimated time remaining: 12 hours, 15 seconds)
2025-09-13 16:39:38,345 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:39:38,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:40:21,004 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 737.94672 ± 512.832
2025-09-13 16:40:21,004 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1095.6218, 172.37096, 1306.691, 1465.2872, 848.5392, 167.06256, 788.7706, 119.99261, 1257.6707, 157.46046]
2025-09-13 16:40:21,004 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [224.0, 33.0, 253.0, 269.0, 162.0, 32.0, 145.0, 23.0, 234.0, 30.0]
2025-09-13 16:40:21,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 44/100 (estimated time remaining: 11 hours, 52 minutes, 25 seconds)
2025-09-13 16:51:46,832 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:51:46,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:52:30,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 779.22327 ± 548.463
2025-09-13 16:52:30,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [135.269, 523.29144, 859.05225, 907.17346, 943.1101, 536.339, 1904.1555, 102.964134, 372.74045, 1508.1371]
2025-09-13 16:52:30,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [26.0, 95.0, 153.0, 169.0, 183.0, 99.0, 354.0, 20.0, 69.0, 270.0]
2025-09-13 16:52:30,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 45/100 (estimated time remaining: 11 hours, 33 minutes, 44 seconds)
2025-09-13 17:04:20,289 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:04:20,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:05:36,939 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1388.35767 ± 647.626
2025-09-13 17:05:36,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1232.877, 389.53015, 1898.976, 1808.2367, 2379.2202, 1148.9691, 1465.5997, 1772.528, 168.03479, 1619.6047]
2025-09-13 17:05:36,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [218.0, 71.0, 330.0, 321.0, 434.0, 236.0, 261.0, 316.0, 33.0, 293.0]
2025-09-13 17:05:36,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1388.36) for latency ExtremeSparseL4U32
2025-09-13 17:05:36,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 46/100 (estimated time remaining: 11 hours, 27 minutes)
2025-09-13 17:17:06,487 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:17:06,494 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:18:14,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1200.34692 ± 749.233
2025-09-13 17:18:14,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [970.7427, 1702.0529, 1202.0891, 2731.5312, 745.22546, 1120.7858, 2120.0852, 165.27986, 334.07388, 911.60284]
2025-09-13 17:18:14,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [178.0, 314.0, 222.0, 498.0, 141.0, 216.0, 392.0, 32.0, 63.0, 169.0]
2025-09-13 17:18:14,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 47/100 (estimated time remaining: 11 hours, 16 minutes, 28 seconds)
2025-09-13 17:30:17,824 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:30:17,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:30:39,786 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 359.86984 ± 310.463
2025-09-13 17:30:39,786 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [166.17627, 551.8909, 639.5791, 1095.9167, 155.75981, 114.51131, 475.41797, 130.39294, 129.62752, 139.42601]
2025-09-13 17:30:39,786 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [32.0, 116.0, 133.0, 235.0, 30.0, 22.0, 91.0, 25.0, 25.0, 27.0]
2025-09-13 17:30:39,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 48/100 (estimated time remaining: 11 hours, 7 minutes, 2 seconds)
2025-09-13 17:42:08,552 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:42:08,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:43:18,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1211.53223 ± 255.631
2025-09-13 17:43:18,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1375.0771, 1215.2306, 1315.4506, 1679.7202, 1186.4802, 1349.8041, 767.6125, 928.764, 934.09467, 1363.0885]
2025-09-13 17:43:18,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [254.0, 226.0, 239.0, 328.0, 229.0, 257.0, 160.0, 176.0, 175.0, 260.0]
2025-09-13 17:43:18,458 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 49/100 (estimated time remaining: 10 hours, 54 minutes, 45 seconds)
2025-09-13 17:55:11,253 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:55:11,262 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:56:17,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1135.81079 ± 767.753
2025-09-13 17:56:17,912 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1229.9141, 1148.4448, 752.17267, 850.6634, 749.73566, 3116.6377, 606.7614, 1115.258, 1664.7147, 123.806145]
2025-09-13 17:56:17,912 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [241.0, 217.0, 147.0, 163.0, 137.0, 586.0, 123.0, 216.0, 310.0, 24.0]
2025-09-13 17:56:17,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 50/100 (estimated time remaining: 10 hours, 50 minutes, 38 seconds)
2025-09-13 18:08:01,130 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:08:01,138 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:09:11,320 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1243.08337 ± 1112.018
2025-09-13 18:09:11,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1276.9215, 1234.6315, 154.51808, 150.4138, 329.95947, 2174.3909, 4011.1982, 468.92905, 1235.875, 1393.9962]
2025-09-13 18:09:11,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [239.0, 224.0, 30.0, 29.0, 65.0, 398.0, 741.0, 85.0, 227.0, 268.0]
2025-09-13 18:09:11,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 51/100 (estimated time remaining: 10 hours, 35 minutes, 43 seconds)
2025-09-13 18:20:42,391 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:20:42,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:21:49,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1195.91492 ± 485.128
2025-09-13 18:21:49,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1417.0474, 1011.62805, 1688.3561, 1313.0508, 989.8174, 1140.2443, 1907.0488, 734.803, 154.32274, 1602.8322]
2025-09-13 18:21:49,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [272.0, 183.0, 299.0, 249.0, 177.0, 215.0, 341.0, 138.0, 30.0, 285.0]
2025-09-13 18:21:49,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 23 minutes, 7 seconds)
2025-09-13 18:33:24,111 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:33:24,118 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:34:33,077 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1197.39136 ± 429.361
2025-09-13 18:34:33,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1087.265, 945.5831, 1020.40656, 2085.9639, 1290.9124, 1652.186, 785.92267, 1248.5902, 473.01175, 1384.0717]
2025-09-13 18:34:33,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [212.0, 170.0, 187.0, 385.0, 235.0, 310.0, 148.0, 225.0, 90.0, 276.0]
2025-09-13 18:34:33,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 53/100 (estimated time remaining: 10 hours, 13 minutes, 19 seconds)
2025-09-13 18:46:20,605 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:46:20,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:46:55,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 612.08984 ± 407.013
2025-09-13 18:46:55,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [166.15216, 102.26756, 896.30994, 863.1173, 831.002, 1206.995, 177.6598, 635.27594, 1102.076, 140.0423]
2025-09-13 18:46:55,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [32.0, 20.0, 164.0, 158.0, 162.0, 220.0, 34.0, 117.0, 218.0, 27.0]
2025-09-13 18:46:55,748 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 54/100 (estimated time remaining: 9 hours, 58 minutes, 2 seconds)
2025-09-13 18:58:31,695 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:58:31,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:59:59,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1536.41284 ± 950.897
2025-09-13 18:59:59,995 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [733.76544, 744.022, 1781.7758, 3749.4885, 1123.1914, 1387.9344, 1804.0597, 160.79573, 2345.8936, 1533.2026]
2025-09-13 18:59:59,995 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [135.0, 141.0, 323.0, 688.0, 220.0, 259.0, 367.0, 31.0, 449.0, 281.0]
2025-09-13 18:59:59,995 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1536.41) for latency ExtremeSparseL4U32
2025-09-13 19:00:00,022 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 55/100 (estimated time remaining: 9 hours, 46 minutes, 3 seconds)
2025-09-13 19:12:19,730 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:12:19,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:13:48,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1612.74487 ± 994.761
2025-09-13 19:13:48,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1370.4231, 698.0309, 1811.037, 2996.432, 3476.0793, 355.56436, 360.54907, 2052.5703, 1180.6782, 1826.0859]
2025-09-13 19:13:48,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [247.0, 129.0, 328.0, 541.0, 631.0, 65.0, 72.0, 363.0, 213.0, 333.0]
2025-09-13 19:13:48,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1612.74) for latency ExtremeSparseL4U32
2025-09-13 19:13:48,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 56/100 (estimated time remaining: 9 hours, 41 minutes, 37 seconds)
2025-09-13 19:24:59,687 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:24:59,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:26:33,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1675.40393 ± 517.113
2025-09-13 19:26:33,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1353.0222, 1210.3386, 1878.8716, 1648.6934, 1314.2174, 1649.6064, 922.0153, 1637.6221, 2493.6318, 2646.0198]
2025-09-13 19:26:33,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [252.0, 224.0, 353.0, 311.0, 232.0, 294.0, 169.0, 288.0, 462.0, 479.0]
2025-09-13 19:26:33,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1675.40) for latency ExtremeSparseL4U32
2025-09-13 19:26:33,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 29 minutes, 40 seconds)
2025-09-13 19:38:31,694 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:38:31,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:39:51,148 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1377.65625 ± 471.196
2025-09-13 19:39:51,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [888.1609, 1618.2302, 1278.2517, 1236.0319, 2168.568, 1721.5697, 744.7128, 1800.5715, 672.8926, 1647.5732]
2025-09-13 19:39:51,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [177.0, 303.0, 236.0, 230.0, 410.0, 328.0, 145.0, 319.0, 119.0, 317.0]
2025-09-13 19:39:51,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 21 minutes, 35 seconds)
2025-09-13 19:51:29,446 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:51:29,474 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:52:13,239 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 793.92865 ± 605.097
2025-09-13 19:52:13,239 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [129.58084, 1487.7822, 579.4121, 390.87994, 1918.9958, 1291.2382, 1123.3309, 777.86346, 130.79074, 109.41314]
2025-09-13 19:52:13,239 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [25.0, 266.0, 107.0, 73.0, 347.0, 236.0, 199.0, 162.0, 25.0, 21.0]
2025-09-13 19:52:13,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 59/100 (estimated time remaining: 9 hours, 8 minutes, 26 seconds)
2025-09-13 20:03:48,016 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:03:48,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:04:45,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1011.69739 ± 484.249
2025-09-13 20:04:45,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [544.1257, 1815.8221, 1113.1135, 225.98561, 1163.8611, 1353.7888, 1030.3169, 1019.14264, 335.22638, 1515.5907]
2025-09-13 20:04:45,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [100.0, 335.0, 203.0, 43.0, 224.0, 247.0, 185.0, 196.0, 63.0, 291.0]
2025-09-13 20:04:45,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 60/100 (estimated time remaining: 8 hours, 50 minutes, 58 seconds)
2025-09-13 20:16:26,274 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:16:26,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:17:25,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1056.86365 ± 606.726
2025-09-13 20:17:25,936 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1585.4703, 847.6612, 1458.6627, 160.65295, 134.95457, 669.78217, 2014.7025, 1594.1666, 1357.3739, 745.21014]
2025-09-13 20:17:25,936 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [296.0, 161.0, 260.0, 31.0, 26.0, 129.0, 381.0, 296.0, 264.0, 139.0]
2025-09-13 20:17:25,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 61/100 (estimated time remaining: 8 hours, 28 minutes, 56 seconds)
2025-09-13 20:28:54,042 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:28:54,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:30:19,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1526.90210 ± 1190.926
2025-09-13 20:30:19,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2177.3787, 3291.7258, 191.84329, 388.80072, 1179.1062, 1022.1561, 3808.3596, 1837.5237, 198.01573, 1174.1104]
2025-09-13 20:30:19,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [400.0, 618.0, 37.0, 71.0, 214.0, 189.0, 726.0, 338.0, 38.0, 224.0]
2025-09-13 20:30:19,814 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 62/100 (estimated time remaining: 8 hours, 17 minutes, 26 seconds)
2025-09-13 20:42:14,978 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:42:14,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:43:24,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1191.86902 ± 851.882
2025-09-13 20:43:24,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2719.3196, 1473.3354, 1806.2156, 404.9884, 108.00898, 108.73386, 1183.2535, 1596.7203, 444.45227, 2073.6624]
2025-09-13 20:43:24,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [512.0, 298.0, 330.0, 76.0, 21.0, 21.0, 222.0, 312.0, 96.0, 386.0]
2025-09-13 20:43:24,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 63/100 (estimated time remaining: 8 hours, 3 minutes, 2 seconds)
2025-09-13 20:54:56,192 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:54:56,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:56:38,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1809.81177 ± 1160.151
2025-09-13 20:56:38,237 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1703.0887, 2297.0186, 1365.5824, 944.0262, 1837.7562, 1127.089, 508.9063, 4823.0205, 2458.8933, 1032.7367]
2025-09-13 20:56:38,237 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [312.0, 444.0, 245.0, 176.0, 352.0, 203.0, 97.0, 910.0, 457.0, 214.0]
2025-09-13 20:56:38,237 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (1809.81) for latency ExtremeSparseL4U32
2025-09-13 20:56:38,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 64/100 (estimated time remaining: 7 hours, 56 minutes, 41 seconds)
2025-09-13 21:08:19,268 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:08:19,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:09:47,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1581.99390 ± 934.944
2025-09-13 21:09:47,077 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2820.8652, 1466.2767, 908.0631, 2292.5774, 494.88568, 1982.6543, 3083.407, 907.06805, 114.452065, 1749.6888]
2025-09-13 21:09:47,077 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [510.0, 265.0, 164.0, 404.0, 91.0, 383.0, 570.0, 184.0, 22.0, 322.0]
2025-09-13 21:09:47,089 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 65/100 (estimated time remaining: 7 hours, 48 minutes, 13 seconds)
2025-09-13 21:21:17,950 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:21:17,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:22:13,424 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 992.27246 ± 631.737
2025-09-13 21:22:13,424 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [512.07043, 1679.8773, 1233.8881, 257.47928, 1753.5978, 129.09099, 1462.248, 161.587, 1661.0663, 1071.8188]
2025-09-13 21:22:13,424 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [95.0, 313.0, 226.0, 53.0, 318.0, 25.0, 263.0, 31.0, 312.0, 205.0]
2025-09-13 21:22:13,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 66/100 (estimated time remaining: 7 hours, 33 minutes, 32 seconds)
2025-09-13 21:34:43,593 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:34:43,601 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:35:43,281 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1026.88477 ± 1059.926
2025-09-13 21:35:43,281 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [159.7218, 435.67917, 134.40349, 2000.4099, 134.224, 988.3321, 625.69446, 2558.3513, 134.49054, 3097.5408]
2025-09-13 21:35:43,281 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [31.0, 86.0, 26.0, 382.0, 26.0, 201.0, 117.0, 494.0, 26.0, 577.0]
2025-09-13 21:35:43,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 67/100 (estimated time remaining: 7 hours, 24 minutes, 39 seconds)
2025-09-13 21:47:01,831 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:47:01,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:49:13,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2273.62695 ± 1382.876
2025-09-13 21:49:13,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1044.5286, 2743.805, 2792.6758, 719.27386, 1358.0167, 3592.3528, 2709.8132, 1235.9938, 5376.6597, 1163.1503]
2025-09-13 21:49:13,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [192.0, 536.0, 525.0, 134.0, 263.0, 681.0, 513.0, 238.0, 1000.0, 221.0]
2025-09-13 21:49:13,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2273.63) for latency ExtremeSparseL4U32
2025-09-13 21:49:13,955 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 68/100 (estimated time remaining: 7 hours, 14 minutes, 25 seconds)
2025-09-13 22:00:24,074 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:00:24,082 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:01:41,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1346.86450 ± 848.282
2025-09-13 22:01:41,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3000.9521, 625.22784, 1670.2898, 141.14514, 670.9314, 439.69223, 2310.0046, 1526.0692, 1722.1857, 1362.1478]
2025-09-13 22:01:41,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [563.0, 133.0, 321.0, 27.0, 120.0, 82.0, 433.0, 293.0, 329.0, 258.0]
2025-09-13 22:01:41,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 69/100 (estimated time remaining: 6 hours, 56 minutes, 19 seconds)
2025-09-13 22:13:27,564 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:13:27,571 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:15:25,401 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2084.56885 ± 1514.242
2025-09-13 22:15:25,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5267.3154, 3493.2188, 1479.1483, 809.9866, 707.5622, 712.6639, 2543.9941, 2870.789, 160.13222, 2800.8774]
2025-09-13 22:15:25,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 636.0, 276.0, 156.0, 141.0, 141.0, 474.0, 513.0, 31.0, 536.0]
2025-09-13 22:15:25,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 70/100 (estimated time remaining: 6 hours, 46 minutes, 57 seconds)
2025-09-13 22:26:58,438 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:26:58,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:29:01,553 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2134.61035 ± 1305.185
2025-09-13 22:29:01,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [710.80994, 177.56851, 3302.8381, 2668.7656, 3757.9023, 1553.8345, 770.4184, 4080.3804, 1488.4143, 2835.172]
2025-09-13 22:29:01,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [145.0, 34.0, 627.0, 501.0, 680.0, 290.0, 141.0, 760.0, 297.0, 534.0]
2025-09-13 22:29:01,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 71/100 (estimated time remaining: 6 hours, 40 minutes, 48 seconds)
2025-09-13 22:40:54,046 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:40:54,053 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:42:35,614 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1829.33716 ± 1551.969
2025-09-13 22:42:35,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2056.0483, 4431.1455, 163.9937, 114.47701, 3846.3154, 328.56894, 3814.43, 1112.6714, 1303.3323, 1122.3893]
2025-09-13 22:42:35,617 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [374.0, 804.0, 31.0, 22.0, 703.0, 65.0, 705.0, 217.0, 256.0, 201.0]
2025-09-13 22:42:35,629 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 27 minutes, 51 seconds)
2025-09-13 22:54:08,159 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:54:08,166 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:55:41,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1672.18518 ± 1598.875
2025-09-13 22:55:41,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [177.01294, 1805.1056, 721.43555, 2153.2866, 652.8851, 3449.8792, 1822.9891, 357.31314, 171.6924, 5410.2515]
2025-09-13 22:55:41,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [35.0, 331.0, 131.0, 397.0, 127.0, 629.0, 330.0, 65.0, 33.0, 1000.0]
2025-09-13 22:55:41,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 73/100 (estimated time remaining: 6 hours, 12 minutes, 8 seconds)
2025-09-13 23:07:48,969 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:07:48,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:09:49,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2110.62793 ± 1225.864
2025-09-13 23:09:49,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3068.6123, 2227.2842, 975.06934, 1723.6743, 4705.2847, 1365.4585, 3654.97, 1198.1987, 1299.0479, 888.6775]
2025-09-13 23:09:49,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [555.0, 420.0, 182.0, 332.0, 869.0, 257.0, 695.0, 238.0, 249.0, 171.0]
2025-09-13 23:09:49,803 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 7 minutes, 58 seconds)
2025-09-13 23:21:05,204 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:21:05,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:22:15,599 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1214.20166 ± 400.586
2025-09-13 23:22:15,601 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1164.8264, 1044.9012, 892.16516, 1629.6569, 1904.6301, 1141.1602, 1536.8761, 603.7739, 713.4984, 1510.5282]
2025-09-13 23:22:15,601 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [230.0, 204.0, 166.0, 303.0, 362.0, 216.0, 294.0, 111.0, 144.0, 296.0]
2025-09-13 23:22:15,614 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 75/100 (estimated time remaining: 5 hours, 47 minutes, 33 seconds)
2025-09-13 23:34:44,690 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:34:44,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:36:03,411 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1338.65698 ± 786.093
2025-09-13 23:36:03,412 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1743.1023, 1358.9012, 924.35785, 764.121, 2578.9377, 831.5976, 2804.5232, 1190.4609, 1065.5776, 124.98884]
2025-09-13 23:36:03,412 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [347.0, 265.0, 178.0, 135.0, 509.0, 155.0, 534.0, 231.0, 192.0, 24.0]
2025-09-13 23:36:03,424 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 76/100 (estimated time remaining: 5 hours, 35 minutes, 9 seconds)
2025-09-13 23:46:58,398 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:46:58,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:48:13,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1350.81641 ± 1444.532
2025-09-13 23:48:13,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [970.89343, 1705.505, 531.31586, 561.9466, 747.6629, 828.81525, 217.31067, 1037.9705, 1398.8374, 5507.9067]
2025-09-13 23:48:13,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [173.0, 323.0, 101.0, 103.0, 136.0, 150.0, 42.0, 187.0, 253.0, 1000.0]
2025-09-13 23:48:13,904 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 77/100 (estimated time remaining: 5 hours, 15 minutes, 3 seconds)
2025-09-13 23:59:44,739 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:59:44,748 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:01:54,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2296.48315 ± 989.650
2025-09-14 00:01:54,441 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1344.5708, 2834.9604, 2498.9285, 694.1654, 4691.1714, 2013.0546, 1933.8656, 2195.8965, 2477.7144, 2280.5042]
2025-09-14 00:01:54,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [257.0, 525.0, 485.0, 134.0, 864.0, 368.0, 342.0, 412.0, 441.0, 427.0]
2025-09-14 00:01:54,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2296.48) for latency ExtremeSparseL4U32
2025-09-14 00:01:54,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 4 minutes, 36 seconds)
2025-09-14 00:14:38,580 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:14:38,589 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:16:40,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2239.69116 ± 1653.937
2025-09-14 00:16:40,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3836.5283, 4075.4114, 120.67625, 687.06824, 989.88184, 1912.1178, 1726.3959, 575.3571, 5199.87, 3273.6082]
2025-09-14 00:16:40,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [694.0, 736.0, 23.0, 127.0, 184.0, 348.0, 312.0, 107.0, 932.0, 595.0]
2025-09-14 00:16:40,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 79/100 (estimated time remaining: 4 hours, 54 minutes, 9 seconds)
2025-09-14 00:27:22,007 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:27:22,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:29:11,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1952.75232 ± 1637.392
2025-09-14 00:29:11,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [136.52507, 119.62554, 5243.027, 956.2361, 2356.1765, 1062.1111, 3246.076, 1942.1882, 567.4084, 3898.1511]
2025-09-14 00:29:11,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [26.0, 23.0, 954.0, 178.0, 438.0, 198.0, 591.0, 376.0, 105.0, 707.0]
2025-09-14 00:29:11,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 80/100 (estimated time remaining: 4 hours, 41 minutes, 8 seconds)
2025-09-14 00:41:02,476 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:41:02,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:41:55,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 950.63409 ± 632.312
2025-09-14 00:41:55,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2123.4019, 1411.753, 423.9566, 136.08214, 140.30559, 1168.9149, 1062.7969, 1656.9476, 934.83307, 447.34863]
2025-09-14 00:41:55,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [387.0, 274.0, 77.0, 26.0, 27.0, 210.0, 209.0, 301.0, 177.0, 88.0]
2025-09-14 00:41:55,864 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 81/100 (estimated time remaining: 4 hours, 23 minutes, 29 seconds)
2025-09-14 00:53:19,927 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:53:19,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:54:37,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1423.34326 ± 961.982
2025-09-14 00:54:37,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2425.2097, 398.21408, 2621.593, 730.11584, 370.38242, 1178.9922, 2942.7725, 1647.4609, 1773.3445, 145.34726]
2025-09-14 00:54:37,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [457.0, 74.0, 468.0, 135.0, 70.0, 215.0, 532.0, 303.0, 329.0, 28.0]
2025-09-14 00:54:37,997 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 12 minutes, 19 seconds)
2025-09-14 01:07:25,273 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:07:25,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:08:27,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1107.23340 ± 811.502
2025-09-14 01:08:27,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [642.9711, 3089.2573, 427.8136, 1533.7904, 1257.364, 1525.1202, 179.28743, 341.7771, 769.2126, 1305.7401]
2025-09-14 01:08:27,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [123.0, 562.0, 83.0, 286.0, 226.0, 286.0, 35.0, 64.0, 144.0, 244.0]
2025-09-14 01:08:27,155 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 83/100 (estimated time remaining: 3 hours, 59 minutes, 33 seconds)
2025-09-14 01:19:50,527 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:19:50,535 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:21:18,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1540.80835 ± 1273.636
2025-09-14 01:21:18,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [3167.3184, 488.76086, 3954.0613, 1696.4471, 2007.6292, 648.66113, 576.16205, 169.06252, 183.3213, 2516.6597]
2025-09-14 01:21:18,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [559.0, 90.0, 735.0, 315.0, 381.0, 119.0, 111.0, 33.0, 35.0, 484.0]
2025-09-14 01:21:18,405 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 84/100 (estimated time remaining: 3 hours, 39 minutes, 43 seconds)
2025-09-14 01:32:06,539 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:32:06,546 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:34:08,875 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2155.16357 ± 1503.907
2025-09-14 01:34:08,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [4149.75, 140.6405, 531.5842, 991.13403, 3255.8562, 3583.7305, 672.66754, 1208.4955, 4046.5938, 2971.1843]
2025-09-14 01:34:08,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [756.0, 27.0, 98.0, 191.0, 604.0, 656.0, 124.0, 229.0, 757.0, 544.0]
2025-09-14 01:34:08,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 85/100 (estimated time remaining: 3 hours, 27 minutes, 50 seconds)
2025-09-14 01:46:11,839 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:46:11,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:47:42,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1686.25757 ± 1642.975
2025-09-14 01:47:42,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1976.1941, 5730.8667, 428.9216, 124.83157, 363.34814, 1023.9161, 680.8666, 3495.7446, 1487.5144, 1550.3723]
2025-09-14 01:47:42,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [348.0, 1000.0, 76.0, 24.0, 68.0, 192.0, 127.0, 631.0, 261.0, 276.0]
2025-09-14 01:47:42,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 17 minutes, 18 seconds)
2025-09-14 01:59:07,559 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:59:07,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:00:22,290 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1341.13257 ± 1091.442
2025-09-14 02:00:22,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [760.2361, 171.4321, 4379.882, 1703.1729, 898.8817, 1097.065, 1098.5306, 744.0455, 1564.9689, 993.1101]
2025-09-14 02:00:22,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [139.0, 33.0, 807.0, 325.0, 169.0, 209.0, 213.0, 138.0, 284.0, 182.0]
2025-09-14 02:00:22,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 4 minutes, 4 seconds)
2025-09-14 02:12:03,755 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:12:03,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:13:16,732 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1301.71167 ± 965.126
2025-09-14 02:13:16,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [208.20872, 3224.82, 118.54918, 1553.0686, 187.74103, 851.2811, 2163.4927, 2120.2827, 1070.2356, 1519.4374]
2025-09-14 02:13:16,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [40.0, 585.0, 23.0, 275.0, 36.0, 157.0, 395.0, 382.0, 197.0, 276.0]
2025-09-14 02:13:16,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 88/100 (estimated time remaining: 2 hours, 48 minutes, 32 seconds)
2025-09-14 02:25:08,528 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:25:08,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:26:42,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1647.83728 ± 1417.405
2025-09-14 02:26:42,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2069.073, 1453.6188, 5376.3237, 1795.8193, 1447.8746, 1541.8164, 1984.8256, 192.65544, 508.19934, 108.166565]
2025-09-14 02:26:42,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [369.0, 272.0, 1000.0, 351.0, 277.0, 301.0, 372.0, 38.0, 102.0, 21.0]
2025-09-14 02:26:42,448 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 89/100 (estimated time remaining: 2 hours, 36 minutes, 57 seconds)
2025-09-14 02:38:07,202 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:38:07,210 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:40:22,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2418.52002 ± 1625.620
2025-09-14 02:40:22,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1736.6293, 2991.6611, 2199.9521, 5487.5, 2350.1501, 2771.4116, 119.05694, 434.08377, 1312.3182, 4782.438]
2025-09-14 02:40:22,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [308.0, 554.0, 412.0, 1000.0, 451.0, 510.0, 23.0, 80.0, 236.0, 886.0]
2025-09-14 02:40:22,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (2418.52) for latency ExtremeSparseL4U32
2025-09-14 02:40:22,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 25 minutes, 42 seconds)
2025-09-14 02:52:03,295 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:52:03,304 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:55:18,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 3454.59814 ± 1567.230
2025-09-14 02:55:18,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [955.5252, 4420.5347, 5474.063, 5249.6577, 3053.284, 3586.4844, 5143.0093, 2130.7737, 1153.3243, 3379.3242]
2025-09-14 02:55:18,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [189.0, 824.0, 1000.0, 961.0, 583.0, 677.0, 918.0, 394.0, 216.0, 612.0]
2025-09-14 02:55:18,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1226 [INFO]: New best (3454.60) for latency ExtremeSparseL4U32
2025-09-14 02:55:18,292 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 15 minutes, 12 seconds)
2025-09-14 03:06:55,124 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:06:55,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:08:30,109 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1688.15796 ± 1435.679
2025-09-14 03:08:30,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2480.3506, 1552.9625, 630.00366, 3879.1196, 1958.7494, 183.40892, 4380.988, 313.09427, 1388.4144, 114.49051]
2025-09-14 03:08:30,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [467.0, 289.0, 116.0, 716.0, 358.0, 35.0, 787.0, 57.0, 258.0, 22.0]
2025-09-14 03:08:30,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 2 minutes, 38 seconds)
2025-09-14 03:20:11,160 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:20:11,167 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:22:17,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2297.43970 ± 1274.118
2025-09-14 03:22:17,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1551.0922, 2324.582, 921.5208, 1273.508, 2573.8071, 3727.173, 5017.3057, 3178.7878, 1353.6241, 1052.9937]
2025-09-14 03:22:17,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [297.0, 419.0, 162.0, 228.0, 470.0, 674.0, 904.0, 572.0, 251.0, 201.0]
2025-09-14 03:22:17,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 50 minutes, 25 seconds)
2025-09-14 03:34:06,097 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:34:06,105 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:36:12,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2253.21143 ± 1939.814
2025-09-14 03:36:12,995 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1186.5975, 908.1551, 4458.9033, 149.55945, 5441.362, 846.2131, 5477.0034, 1011.25006, 1168.6621, 1884.4076]
2025-09-14 03:36:12,996 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [210.0, 165.0, 831.0, 29.0, 1000.0, 154.0, 1000.0, 187.0, 240.0, 351.0]
2025-09-14 03:36:13,010 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 37 minutes, 18 seconds)
2025-09-14 03:48:07,261 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:48:07,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:50:05,915 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2091.32300 ± 1346.310
2025-09-14 03:50:05,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [1719.8271, 2312.7986, 5472.077, 1961.2025, 2262.7612, 1223.5431, 167.17682, 923.24426, 2885.9678, 1984.6335]
2025-09-14 03:50:05,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [312.0, 422.0, 1000.0, 379.0, 425.0, 226.0, 32.0, 169.0, 535.0, 368.0]
2025-09-14 03:50:05,934 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 23 minutes, 39 seconds)
2025-09-14 04:01:39,108 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:01:39,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:03:18,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1730.10840 ± 1396.782
2025-09-14 04:03:18,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [118.39063, 2537.0103, 1456.0283, 1890.1874, 4418.1436, 1816.6359, 288.37885, 128.90749, 985.7202, 3661.6812]
2025-09-14 04:03:18,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [23.0, 468.0, 280.0, 353.0, 828.0, 330.0, 56.0, 25.0, 182.0, 701.0]
2025-09-14 04:03:18,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 7 minutes, 59 seconds)
2025-09-14 04:14:47,369 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:14:47,377 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:17:04,872 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2569.91968 ± 1893.133
2025-09-14 04:17:04,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5614.5166, 5630.4253, 1289.8036, 3503.8447, 1396.0021, 338.22748, 2737.346, 1371.5731, 177.3421, 3640.118]
2025-09-14 04:17:04,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 239.0, 621.0, 251.0, 62.0, 477.0, 249.0, 34.0, 660.0]
2025-09-14 04:17:04,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 97/100 (estimated time remaining: 54 minutes, 51 seconds)
2025-09-14 04:29:03,042 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:29:03,052 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:31:24,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 2457.51123 ± 1895.059
2025-09-14 04:31:24,325 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [5452.5845, 4679.194, 5467.1074, 673.3472, 751.1196, 1211.2006, 1653.4443, 1596.8744, 2605.3206, 484.92227]
2025-09-14 04:31:24,325 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 871.0, 1000.0, 133.0, 139.0, 232.0, 312.0, 308.0, 489.0, 90.0]
2025-09-14 04:31:24,338 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 98/100 (estimated time remaining: 41 minutes, 27 seconds)
2025-09-14 04:43:14,808 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:43:14,818 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:44:55,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1785.80176 ± 1586.082
2025-09-14 04:44:55,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [138.59401, 5424.3315, 1718.1351, 1607.8754, 1766.1097, 129.75594, 1214.2327, 1064.5874, 3951.3523, 843.0423]
2025-09-14 04:44:55,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [27.0, 1000.0, 317.0, 314.0, 327.0, 25.0, 223.0, 196.0, 711.0, 165.0]
2025-09-14 04:44:55,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 99/100 (estimated time remaining: 27 minutes, 29 seconds)
2025-09-14 04:56:07,665 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:56:07,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:57:17,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1248.03223 ± 1042.760
2025-09-14 04:57:17,841 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [896.7202, 166.73083, 294.3252, 109.03869, 119.093834, 2613.6763, 1764.0938, 1251.935, 2372.0884, 2892.6206]
2025-09-14 04:57:17,841 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [164.0, 32.0, 60.0, 21.0, 23.0, 465.0, 326.0, 227.0, 412.0, 549.0]
2025-09-14 04:57:17,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1199 [INFO]: Iteration 100/100 (estimated time remaining: 13 minutes, 26 seconds)
2025-09-14 05:09:01,388 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 05:09:01,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 05:10:18,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1221 [DEBUG]: Total Reward: 1429.56616 ± 1297.529
2025-09-14 05:10:18,960 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1222 [DEBUG]: All rewards: [2840.0115, 3367.2454, 447.9138, 3572.7507, 143.61227, 124.2041, 345.08624, 1866.8187, 826.9143, 761.1043]
2025-09-14 05:10:18,960 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1223 [DEBUG]: All trajectory lengths: [508.0, 601.0, 81.0, 645.0, 28.0, 24.0, 62.0, 348.0, 151.0, 137.0]
2025-09-14 05:10:18,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-humanoid):1251 [DEBUG]: Training session finished
