2025-09-13 10:41:39,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc7/noiseperc15-humanoid/ExtremeSparseL4U32-mbpac-highdim-memdelay
2025-09-13 10:41:39,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc7/noiseperc15-humanoid/ExtremeSparseL4U32-mbpac-highdim-memdelay
2025-09-13 10:41:39,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeSparseL4U32': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x14bbd09b8550>}
2025-09-13 10:41:39,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1111 [DEBUG]: using device: cuda
2025-09-13 10:41:39,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1133 [INFO]: Creating new trainer
2025-09-13 10:41:39,700 baseline-mbpac-noiseperc15-humanoid:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (tanh_refit): NNTanhRefit(
    scale: tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000,
             0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000]]), shift: tensor([[-0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000]])
  )
)
2025-09-13 10:41:39,700 baseline-mbpac-noiseperc15-humanoid:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=393, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-13 10:41:39,710 baseline-mbpac-noiseperc15-humanoid:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=376, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=376, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=17, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-13 10:41:41,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1194 [DEBUG]: Starting training session...
2025-09-13 10:41:41,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 1/100
2025-09-13 10:53:22,864 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 10:53:22,871 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 10:53:40,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 317.20898 ± 19.763
2025-09-13 10:53:40,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [312.75674, 336.63788, 288.4744, 297.54767, 334.44406, 349.1971, 302.44907, 335.13043, 296.3895, 319.06287]
2025-09-13 10:53:40,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [57.0, 62.0, 53.0, 54.0, 61.0, 64.0, 55.0, 62.0, 54.0, 59.0]
2025-09-13 10:53:40,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (317.21) for latency ExtremeSparseL4U32
2025-09-13 10:53:40,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 2/100 (estimated time remaining: 19 hours, 47 minutes, 33 seconds)
2025-09-13 11:05:21,859 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:05:21,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:05:38,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 278.17532 ± 104.930
2025-09-13 11:05:38,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [330.957, 310.3515, 129.81845, 446.48355, 289.6314, 140.14081, 125.47004, 313.65158, 312.02078, 383.22812]
2025-09-13 11:05:38,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [64.0, 67.0, 25.0, 90.0, 60.0, 27.0, 24.0, 58.0, 68.0, 70.0]
2025-09-13 11:05:38,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 3/100 (estimated time remaining: 19 hours, 34 minutes, 14 seconds)
2025-09-13 11:17:20,686 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:17:20,693 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:17:43,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 367.57553 ± 64.019
2025-09-13 11:17:43,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [429.41483, 432.82886, 324.75214, 239.70251, 274.17142, 392.2874, 374.807, 370.85852, 426.79456, 410.13815]
2025-09-13 11:17:43,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [79.0, 80.0, 65.0, 49.0, 60.0, 78.0, 69.0, 72.0, 90.0, 78.0]
2025-09-13 11:17:43,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (367.58) for latency ExtremeSparseL4U32
2025-09-13 11:17:43,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 4/100 (estimated time remaining: 19 hours, 25 minutes, 11 seconds)
2025-09-13 11:29:29,550 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:29:29,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:29:52,876 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 395.73276 ± 215.849
2025-09-13 11:29:52,876 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [147.34572, 103.14289, 524.29706, 520.2101, 890.72314, 450.27377, 439.2428, 283.7118, 351.6647, 246.71562]
2025-09-13 11:29:52,876 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [28.0, 20.0, 98.0, 100.0, 175.0, 85.0, 82.0, 52.0, 64.0, 54.0]
2025-09-13 11:29:52,876 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (395.73) for latency ExtremeSparseL4U32
2025-09-13 11:29:52,892 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 5/100 (estimated time remaining: 19 hours, 16 minutes, 44 seconds)
2025-09-13 11:41:27,287 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:41:27,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:41:47,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 359.66974 ± 49.862
2025-09-13 11:41:47,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [359.20496, 487.6499, 354.6221, 386.5604, 354.47363, 343.47046, 330.9232, 316.01855, 292.07797, 371.69595]
2025-09-13 11:41:47,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [67.0, 93.0, 65.0, 72.0, 64.0, 63.0, 60.0, 58.0, 54.0, 68.0]
2025-09-13 11:41:47,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 6/100 (estimated time remaining: 19 hours, 2 minutes, 4 seconds)
2025-09-13 11:53:23,745 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 11:53:23,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 11:53:47,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 389.91611 ± 113.932
2025-09-13 11:53:47,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [444.43332, 667.00977, 345.4646, 306.26318, 504.20474, 338.20963, 314.50534, 354.17902, 255.98433, 368.90732]
2025-09-13 11:53:47,957 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [96.0, 142.0, 64.0, 59.0, 96.0, 63.0, 70.0, 80.0, 57.0, 71.0]
2025-09-13 11:53:48,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 7/100 (estimated time remaining: 18 hours, 50 minutes, 16 seconds)
2025-09-13 12:05:24,752 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:05:24,758 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:05:46,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 366.02618 ± 136.428
2025-09-13 12:05:46,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [125.87816, 358.198, 502.00565, 413.82828, 264.947, 167.25351, 461.63367, 396.7281, 583.7593, 386.03006]
2025-09-13 12:05:46,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [24.0, 66.0, 94.0, 89.0, 48.0, 32.0, 85.0, 86.0, 109.0, 84.0]
2025-09-13 12:05:46,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 8/100 (estimated time remaining: 18 hours, 38 minutes, 26 seconds)
2025-09-13 12:17:27,172 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:17:27,180 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:17:53,825 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 429.18765 ± 113.763
2025-09-13 12:17:53,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [378.24454, 458.59604, 354.3774, 503.19666, 468.26947, 379.92816, 165.73396, 545.8828, 441.88745, 595.76]
2025-09-13 12:17:53,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [70.0, 99.0, 77.0, 107.0, 100.0, 70.0, 32.0, 105.0, 82.0, 114.0]
2025-09-13 12:17:53,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (429.19) for latency ExtremeSparseL4U32
2025-09-13 12:17:53,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 9/100 (estimated time remaining: 18 hours, 27 minutes, 14 seconds)
2025-09-13 12:29:29,729 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:29:29,736 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:29:57,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 471.56049 ± 174.272
2025-09-13 12:29:57,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [474.62, 102.15295, 707.8808, 510.54694, 511.96722, 611.19995, 344.17383, 702.7009, 391.06107, 359.3014]
2025-09-13 12:29:57,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [88.0, 20.0, 133.0, 94.0, 95.0, 119.0, 64.0, 137.0, 73.0, 77.0]
2025-09-13 12:29:57,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (471.56) for latency ExtremeSparseL4U32
2025-09-13 12:29:57,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 10/100 (estimated time remaining: 18 hours, 13 minutes, 18 seconds)
2025-09-13 12:41:37,696 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:41:37,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:42:08,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 519.60272 ± 65.419
2025-09-13 12:42:08,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [567.48334, 573.41956, 584.684, 449.90207, 393.2323, 539.769, 453.19254, 605.53613, 526.42847, 502.38004]
2025-09-13 12:42:08,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [107.0, 107.0, 111.0, 84.0, 87.0, 99.0, 98.0, 122.0, 99.0, 93.0]
2025-09-13 12:42:08,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (519.60) for latency ExtremeSparseL4U32
2025-09-13 12:42:08,389 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 11/100 (estimated time remaining: 18 hours, 6 minutes, 14 seconds)
2025-09-13 12:53:35,362 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 12:53:35,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 12:54:04,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 495.57635 ± 77.070
2025-09-13 12:54:04,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [450.33002, 526.83966, 654.393, 509.53683, 568.5193, 451.96384, 432.77756, 373.2117, 445.0477, 543.14417]
2025-09-13 12:54:04,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [93.0, 98.0, 129.0, 94.0, 110.0, 86.0, 80.0, 68.0, 83.0, 107.0]
2025-09-13 12:54:04,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 12/100 (estimated time remaining: 17 hours, 52 minutes, 44 seconds)
2025-09-13 13:05:38,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:05:38,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:05:58,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 343.01550 ± 183.473
2025-09-13 13:05:58,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [146.5346, 140.67798, 396.35797, 451.61615, 124.14873, 396.30368, 624.7294, 587.4293, 431.09024, 131.26683]
2025-09-13 13:05:58,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [28.0, 28.0, 73.0, 84.0, 24.0, 72.0, 119.0, 117.0, 79.0, 25.0]
2025-09-13 13:05:58,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 13/100 (estimated time remaining: 17 hours, 39 minutes, 21 seconds)
2025-09-13 13:17:36,760 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:17:36,768 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:18:01,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 438.70712 ± 81.691
2025-09-13 13:18:01,812 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [452.41003, 337.13766, 526.1021, 484.01596, 365.9669, 381.6143, 518.11304, 421.8997, 574.70447, 325.10696]
2025-09-13 13:18:01,812 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [82.0, 61.0, 95.0, 89.0, 68.0, 69.0, 102.0, 78.0, 108.0, 59.0]
2025-09-13 13:18:01,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 14/100 (estimated time remaining: 17 hours, 26 minutes, 18 seconds)
2025-09-13 13:29:36,627 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:29:36,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:30:04,129 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 430.16144 ± 65.963
2025-09-13 13:30:04,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [430.13873, 542.07227, 459.45627, 432.86026, 317.53128, 362.00244, 369.50168, 406.42606, 515.21136, 466.41394]
2025-09-13 13:30:04,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [80.0, 116.0, 101.0, 93.0, 70.0, 80.0, 79.0, 89.0, 98.0, 100.0]
2025-09-13 13:30:04,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 15/100 (estimated time remaining: 17 hours, 13 minutes, 59 seconds)
2025-09-13 13:41:36,801 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:41:36,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:42:00,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 406.75604 ± 182.559
2025-09-13 13:42:00,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [474.4292, 555.334, 108.91957, 130.48291, 280.1798, 585.4602, 622.1427, 586.78986, 435.41806, 288.40402]
2025-09-13 13:42:00,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [88.0, 104.0, 21.0, 26.0, 57.0, 110.0, 117.0, 111.0, 80.0, 61.0]
2025-09-13 13:42:00,274 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 16/100 (estimated time remaining: 16 hours, 57 minutes, 42 seconds)
2025-09-13 13:53:36,517 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 13:53:36,524 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 13:54:00,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 405.62814 ± 130.781
2025-09-13 13:54:00,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [404.42096, 496.75186, 424.54602, 377.7129, 194.80199, 282.66443, 302.21118, 704.54596, 447.22867, 421.39737]
2025-09-13 13:54:00,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [87.0, 90.0, 93.0, 70.0, 38.0, 54.0, 67.0, 137.0, 83.0, 77.0]
2025-09-13 13:54:00,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 17/100 (estimated time remaining: 16 hours, 47 minutes, 5 seconds)
2025-09-13 14:05:40,694 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:05:40,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:06:13,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 571.28485 ± 130.299
2025-09-13 14:06:13,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [782.21893, 691.2459, 512.76227, 634.68054, 710.8819, 460.7759, 442.61374, 406.8745, 422.28857, 648.5061]
2025-09-13 14:06:13,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [149.0, 132.0, 97.0, 119.0, 135.0, 86.0, 82.0, 76.0, 81.0, 122.0]
2025-09-13 14:06:13,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (571.28) for latency ExtremeSparseL4U32
2025-09-13 14:06:13,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 18/100 (estimated time remaining: 16 hours, 40 minutes, 8 seconds)
2025-09-13 14:17:45,979 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:17:45,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:18:12,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 448.05841 ± 112.463
2025-09-13 14:18:12,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [151.24136, 421.67587, 532.3041, 554.0945, 440.7787, 505.92157, 521.5166, 485.9155, 499.48856, 367.64722]
2025-09-13 14:18:12,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [29.0, 76.0, 114.0, 118.0, 84.0, 92.0, 113.0, 88.0, 91.0, 69.0]
2025-09-13 14:18:12,790 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 19/100 (estimated time remaining: 16 hours, 26 minutes, 59 seconds)
2025-09-13 14:29:52,531 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:29:52,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:30:21,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 510.47174 ± 144.473
2025-09-13 14:30:21,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [517.4909, 526.09686, 534.0905, 548.33777, 589.8993, 135.56908, 388.37418, 550.43805, 655.08923, 659.33124]
2025-09-13 14:30:21,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [95.0, 97.0, 99.0, 105.0, 113.0, 27.0, 73.0, 103.0, 124.0, 127.0]
2025-09-13 14:30:21,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 20/100 (estimated time remaining: 16 hours, 16 minutes, 45 seconds)
2025-09-13 14:41:54,632 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:41:54,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:42:21,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 460.58759 ± 139.835
2025-09-13 14:42:21,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [588.15283, 513.92773, 605.8314, 114.17166, 461.64417, 512.24634, 437.73315, 444.71112, 589.93646, 337.5208]
2025-09-13 14:42:21,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [111.0, 96.0, 117.0, 22.0, 85.0, 95.0, 80.0, 83.0, 125.0, 60.0]
2025-09-13 14:42:21,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 5 minutes, 43 seconds)
2025-09-13 14:54:00,286 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 14:54:00,293 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 14:54:39,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 662.72388 ± 140.592
2025-09-13 14:54:39,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [729.8603, 824.8238, 535.6045, 920.4837, 742.54095, 527.45557, 540.8431, 700.0486, 454.83276, 650.7454]
2025-09-13 14:54:39,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [139.0, 156.0, 99.0, 178.0, 141.0, 98.0, 116.0, 133.0, 98.0, 124.0]
2025-09-13 14:54:39,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (662.72) for latency ExtremeSparseL4U32
2025-09-13 14:54:39,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 22/100 (estimated time remaining: 15 hours, 58 minutes, 13 seconds)
2025-09-13 15:06:14,768 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:06:14,775 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:06:50,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 603.81519 ± 111.732
2025-09-13 15:06:50,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [611.45844, 589.3021, 535.05273, 885.1972, 588.0067, 480.5079, 599.3518, 633.83905, 457.61472, 657.8212]
2025-09-13 15:06:50,931 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [131.0, 115.0, 103.0, 167.0, 109.0, 89.0, 113.0, 130.0, 101.0, 131.0]
2025-09-13 15:06:50,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 23/100 (estimated time remaining: 15 hours, 45 minutes, 49 seconds)
2025-09-13 15:18:25,828 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:18:25,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:18:57,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 531.28796 ± 155.611
2025-09-13 15:18:57,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [393.2999, 465.55814, 674.70123, 507.43726, 475.03415, 943.70105, 509.18698, 487.02182, 408.34027, 448.59866]
2025-09-13 15:18:57,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [87.0, 98.0, 131.0, 94.0, 91.0, 183.0, 105.0, 89.0, 76.0, 83.0]
2025-09-13 15:18:57,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 24/100 (estimated time remaining: 15 hours, 35 minutes, 35 seconds)
2025-09-13 15:30:32,655 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:30:32,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:31:04,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 538.47040 ± 119.297
2025-09-13 15:31:04,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [570.89996, 578.9037, 382.0626, 458.91602, 556.2928, 624.9026, 497.75916, 804.5497, 369.38608, 541.0311]
2025-09-13 15:31:04,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [107.0, 108.0, 71.0, 85.0, 106.0, 117.0, 93.0, 154.0, 81.0, 113.0]
2025-09-13 15:31:04,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 25/100 (estimated time remaining: 15 hours, 22 minutes, 44 seconds)
2025-09-13 15:42:38,278 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:42:38,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:43:20,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 704.88184 ± 240.962
2025-09-13 15:43:20,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [676.4466, 789.0752, 428.15295, 507.07605, 571.63806, 560.024, 1054.0781, 1018.29517, 1036.2869, 407.74503]
2025-09-13 15:43:20,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [145.0, 152.0, 92.0, 96.0, 106.0, 121.0, 207.0, 192.0, 202.0, 88.0]
2025-09-13 15:43:20,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (704.88) for latency ExtremeSparseL4U32
2025-09-13 15:43:20,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 26/100 (estimated time remaining: 15 hours, 14 minutes, 45 seconds)
2025-09-13 15:54:59,306 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 15:54:59,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 15:55:31,612 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 558.12732 ± 115.098
2025-09-13 15:55:31,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [727.78375, 599.1491, 469.70093, 389.1656, 599.33984, 693.89856, 419.36707, 454.1463, 546.8352, 681.88776]
2025-09-13 15:55:31,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [154.0, 111.0, 87.0, 72.0, 109.0, 144.0, 76.0, 84.0, 101.0, 129.0]
2025-09-13 15:55:31,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 27/100 (estimated time remaining: 15 hours, 49 seconds)
2025-09-13 16:07:09,659 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:07:09,666 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:07:32,769 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 401.00555 ± 227.430
2025-09-13 16:07:32,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [156.11139, 113.28993, 525.13715, 537.6465, 654.892, 599.1298, 506.51083, 667.3519, 158.63412, 91.35158]
2025-09-13 16:07:32,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [30.0, 22.0, 97.0, 99.0, 128.0, 117.0, 93.0, 125.0, 31.0, 18.0]
2025-09-13 16:07:32,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 28/100 (estimated time remaining: 14 hours, 46 minutes, 10 seconds)
2025-09-13 16:19:11,430 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:19:11,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:19:40,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 498.55322 ± 193.697
2025-09-13 16:19:40,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [628.6867, 519.66724, 422.63745, 737.7174, 151.51602, 570.64795, 641.2988, 582.11523, 596.39044, 134.85426]
2025-09-13 16:19:40,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [118.0, 96.0, 86.0, 157.0, 29.0, 105.0, 119.0, 109.0, 113.0, 27.0]
2025-09-13 16:19:40,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 29/100 (estimated time remaining: 14 hours, 34 minutes, 13 seconds)
2025-09-13 16:31:14,837 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:31:14,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:31:55,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 699.36859 ± 119.810
2025-09-13 16:31:55,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [978.11414, 727.69305, 653.001, 637.5355, 702.79755, 823.55743, 548.79895, 577.78375, 623.7041, 720.7006]
2025-09-13 16:31:55,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [183.0, 131.0, 119.0, 120.0, 134.0, 152.0, 101.0, 112.0, 117.0, 138.0]
2025-09-13 16:31:55,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 30/100 (estimated time remaining: 14 hours, 24 minutes, 2 seconds)
2025-09-13 16:43:36,353 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:43:36,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:44:16,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 657.34442 ± 135.471
2025-09-13 16:44:16,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [748.90015, 519.4608, 463.08475, 746.3724, 643.11224, 535.94464, 850.6378, 498.1295, 803.6696, 764.1325]
2025-09-13 16:44:16,479 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [141.0, 96.0, 98.0, 160.0, 138.0, 108.0, 162.0, 108.0, 153.0, 143.0]
2025-09-13 16:44:16,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 31/100 (estimated time remaining: 14 hours, 13 minutes)
2025-09-13 16:56:00,718 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 16:56:00,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 16:56:43,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 681.86029 ± 232.220
2025-09-13 16:56:43,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [727.5177, 527.66254, 331.33362, 980.78033, 704.7646, 551.21246, 739.94763, 502.39285, 583.0387, 1169.953]
2025-09-13 16:56:43,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [157.0, 114.0, 69.0, 193.0, 145.0, 114.0, 138.0, 108.0, 125.0, 243.0]
2025-09-13 16:56:43,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 32/100 (estimated time remaining: 14 hours, 4 minutes, 28 seconds)
2025-09-13 17:08:16,944 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:08:16,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:08:59,661 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 704.30804 ± 160.830
2025-09-13 17:08:59,661 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [685.3389, 608.80884, 1009.23236, 595.34247, 742.4453, 862.11017, 883.7989, 628.1154, 452.80048, 575.08795]
2025-09-13 17:08:59,661 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [148.0, 130.0, 199.0, 113.0, 156.0, 165.0, 166.0, 116.0, 97.0, 107.0]
2025-09-13 17:08:59,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 33/100 (estimated time remaining: 13 hours, 55 minutes, 42 seconds)
2025-09-13 17:20:34,891 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:20:34,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:21:21,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 808.63391 ± 220.434
2025-09-13 17:21:21,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [860.0094, 1145.6829, 465.2946, 583.6623, 662.89435, 720.26996, 1035.4669, 1114.3326, 849.8494, 648.8763]
2025-09-13 17:21:21,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [166.0, 213.0, 86.0, 110.0, 126.0, 133.0, 196.0, 211.0, 162.0, 123.0]
2025-09-13 17:21:21,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (808.63) for latency ExtremeSparseL4U32
2025-09-13 17:21:21,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 34/100 (estimated time remaining: 13 hours, 46 minutes, 32 seconds)
2025-09-13 17:32:58,244 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:32:58,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:33:35,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 641.98907 ± 123.537
2025-09-13 17:33:35,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [777.82166, 702.9715, 570.1139, 583.3885, 662.14825, 342.18808, 810.82635, 685.9441, 618.58716, 665.9012]
2025-09-13 17:33:35,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [157.0, 131.0, 108.0, 109.0, 123.0, 74.0, 158.0, 130.0, 122.0, 123.0]
2025-09-13 17:33:35,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 35/100 (estimated time remaining: 13 hours, 34 minutes, 5 seconds)
2025-09-13 17:45:11,068 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:45:11,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:45:41,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 533.40637 ± 235.759
2025-09-13 17:45:41,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [671.99927, 466.68427, 506.22565, 153.62788, 109.31021, 534.09814, 520.08105, 724.0495, 855.0653, 792.9225]
2025-09-13 17:45:41,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [124.0, 88.0, 94.0, 30.0, 21.0, 97.0, 97.0, 138.0, 158.0, 152.0]
2025-09-13 17:45:41,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 36/100 (estimated time remaining: 13 hours, 18 minutes, 27 seconds)
2025-09-13 17:57:19,764 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 17:57:19,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 17:58:03,896 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 758.51300 ± 155.068
2025-09-13 17:58:03,896 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [610.84534, 745.74005, 581.3935, 830.75793, 940.37286, 732.9649, 1083.366, 565.4123, 797.6318, 696.64496]
2025-09-13 17:58:03,896 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [113.0, 141.0, 114.0, 160.0, 183.0, 151.0, 198.0, 108.0, 156.0, 134.0]
2025-09-13 17:58:03,907 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 37/100 (estimated time remaining: 13 hours, 5 minutes, 11 seconds)
2025-09-13 18:09:43,416 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:09:43,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:10:23,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 672.90723 ± 222.424
2025-09-13 18:10:23,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [844.2976, 816.34607, 909.67163, 530.71826, 101.54533, 702.0745, 569.64435, 812.36115, 664.55457, 777.85864]
2025-09-13 18:10:23,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [162.0, 158.0, 187.0, 116.0, 20.0, 131.0, 104.0, 151.0, 138.0, 148.0]
2025-09-13 18:10:23,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 38/100 (estimated time remaining: 12 hours, 53 minutes, 33 seconds)
2025-09-13 18:21:53,636 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:21:53,644 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:22:44,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 869.49353 ± 233.109
2025-09-13 18:22:44,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [757.9075, 1322.7197, 1065.6288, 1206.4576, 590.4464, 642.5947, 704.8642, 755.76416, 828.91113, 819.6412]
2025-09-13 18:22:44,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [145.0, 254.0, 207.0, 228.0, 124.0, 135.0, 131.0, 142.0, 158.0, 166.0]
2025-09-13 18:22:44,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (869.49) for latency ExtremeSparseL4U32
2025-09-13 18:22:44,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 39/100 (estimated time remaining: 12 hours, 41 minutes, 13 seconds)
2025-09-13 18:34:18,859 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:34:18,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:35:06,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 812.04608 ± 280.934
2025-09-13 18:35:06,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1227.9546, 880.34814, 1040.1356, 407.62894, 647.5352, 508.13553, 492.97974, 751.76874, 969.2589, 1194.7151]
2025-09-13 18:35:06,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [249.0, 166.0, 194.0, 78.0, 121.0, 94.0, 107.0, 146.0, 181.0, 229.0]
2025-09-13 18:35:06,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 40/100 (estimated time remaining: 12 hours, 30 minutes, 24 seconds)
2025-09-13 18:46:42,442 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:46:42,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:47:20,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 664.47498 ± 233.633
2025-09-13 18:47:20,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [714.6721, 755.5936, 708.1766, 936.11786, 494.5318, 890.79706, 923.2206, 632.701, 157.62738, 431.31183]
2025-09-13 18:47:20,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [149.0, 138.0, 142.0, 170.0, 91.0, 165.0, 168.0, 115.0, 32.0, 81.0]
2025-09-13 18:47:20,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 41/100 (estimated time remaining: 12 hours, 19 minutes, 46 seconds)
2025-09-13 18:59:04,214 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 18:59:04,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 18:59:43,508 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 659.73181 ± 353.534
2025-09-13 18:59:43,508 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [382.4261, 1151.3661, 514.3168, 1160.9943, 1063.8624, 118.5686, 413.41354, 829.0804, 323.70673, 639.5829]
2025-09-13 18:59:43,508 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [71.0, 219.0, 94.0, 240.0, 202.0, 23.0, 74.0, 164.0, 72.0, 127.0]
2025-09-13 18:59:43,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 42/100 (estimated time remaining: 12 hours, 7 minutes, 35 seconds)
2025-09-13 19:11:19,403 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:11:19,411 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:12:05,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 797.65564 ± 161.872
2025-09-13 19:12:05,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [699.27814, 775.0537, 797.8144, 642.0525, 712.7843, 878.12463, 967.1639, 517.2684, 871.2816, 1115.7344]
2025-09-13 19:12:05,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [132.0, 143.0, 152.0, 125.0, 128.0, 168.0, 189.0, 94.0, 164.0, 212.0]
2025-09-13 19:12:05,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 43/100 (estimated time remaining: 11 hours, 55 minutes, 40 seconds)
2025-09-13 19:23:42,963 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:23:42,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:24:18,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 588.43835 ± 225.020
2025-09-13 19:24:18,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [101.61918, 981.0601, 589.05536, 628.4274, 854.065, 594.0095, 661.8389, 426.41367, 562.999, 484.89508]
2025-09-13 19:24:18,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [20.0, 182.0, 109.0, 120.0, 159.0, 116.0, 129.0, 92.0, 117.0, 104.0]
2025-09-13 19:24:18,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 44/100 (estimated time remaining: 11 hours, 41 minutes, 43 seconds)
2025-09-13 19:35:47,591 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:35:47,599 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:36:37,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 856.53986 ± 359.513
2025-09-13 19:36:37,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1447.2712, 510.21036, 594.6818, 840.38605, 682.07007, 1162.2152, 1018.4309, 1004.8432, 1152.4124, 152.87741]
2025-09-13 19:36:37,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [283.0, 106.0, 115.0, 154.0, 129.0, 227.0, 193.0, 189.0, 239.0, 29.0]
2025-09-13 19:36:37,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 45/100 (estimated time remaining: 11 hours, 29 minutes, 8 seconds)
2025-09-13 19:48:13,087 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 19:48:13,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 19:48:49,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 622.86597 ± 299.500
2025-09-13 19:48:49,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [684.71936, 904.2183, 129.40132, 107.79067, 461.60413, 670.0102, 945.6094, 648.5386, 639.31226, 1037.4554]
2025-09-13 19:48:49,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [128.0, 163.0, 25.0, 21.0, 89.0, 134.0, 174.0, 134.0, 117.0, 200.0]
2025-09-13 19:48:49,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 46/100 (estimated time remaining: 11 hours, 16 minutes, 13 seconds)
2025-09-13 20:00:32,562 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:00:32,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:01:10,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 669.08917 ± 454.147
2025-09-13 20:01:10,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [854.8228, 1700.4192, 565.8773, 134.06613, 1078.4342, 118.3002, 246.23848, 566.8918, 754.3617, 671.4794]
2025-09-13 20:01:10,401 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [154.0, 318.0, 104.0, 27.0, 206.0, 23.0, 47.0, 105.0, 143.0, 123.0]
2025-09-13 20:01:10,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 47/100 (estimated time remaining: 11 hours, 3 minutes, 38 seconds)
2025-09-13 20:12:46,915 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:12:46,929 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:13:39,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 888.75879 ± 344.465
2025-09-13 20:13:39,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [995.62085, 1331.653, 1525.296, 1148.0956, 885.42957, 651.6635, 681.93646, 421.05487, 468.51358, 778.3244]
2025-09-13 20:13:39,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [185.0, 248.0, 290.0, 215.0, 173.0, 142.0, 125.0, 82.0, 86.0, 147.0]
2025-09-13 20:13:39,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (888.76) for latency ExtremeSparseL4U32
2025-09-13 20:13:39,240 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 48/100 (estimated time remaining: 10 hours, 52 minutes, 37 seconds)
2025-09-13 20:25:21,690 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:25:21,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:26:21,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1037.49658 ± 334.168
2025-09-13 20:26:21,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [725.63086, 1292.9764, 1288.6887, 706.6632, 1494.1368, 847.09863, 588.9653, 1392.896, 1341.9556, 695.954]
2025-09-13 20:26:21,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [135.0, 242.0, 256.0, 129.0, 303.0, 156.0, 108.0, 256.0, 259.0, 147.0]
2025-09-13 20:26:21,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1037.50) for latency ExtremeSparseL4U32
2025-09-13 20:26:21,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 49/100 (estimated time remaining: 10 hours, 45 minutes, 25 seconds)
2025-09-13 20:38:04,206 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:38:04,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:38:54,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 864.54163 ± 434.701
2025-09-13 20:38:54,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [955.48724, 752.93085, 726.6871, 1133.4092, 932.0011, 1046.6141, 346.22214, 138.52509, 780.91766, 1832.622]
2025-09-13 20:38:54,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [181.0, 141.0, 141.0, 209.0, 182.0, 208.0, 74.0, 27.0, 146.0, 338.0]
2025-09-13 20:38:54,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 50/100 (estimated time remaining: 10 hours, 35 minutes, 16 seconds)
2025-09-13 20:50:18,464 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 20:50:18,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 20:50:50,002 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 540.75378 ± 311.106
2025-09-13 20:50:50,003 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [802.91437, 665.96674, 1042.3024, 911.3726, 459.80167, 535.4514, 574.0069, 108.60521, 165.24242, 141.87473]
2025-09-13 20:50:50,003 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [149.0, 136.0, 198.0, 166.0, 92.0, 98.0, 109.0, 21.0, 32.0, 28.0]
2025-09-13 20:50:50,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 51/100 (estimated time remaining: 10 hours, 20 minutes, 9 seconds)
2025-09-13 21:02:31,342 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:02:31,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:03:23,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 905.85779 ± 445.652
2025-09-13 21:03:23,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [829.70624, 1112.1599, 1005.0237, 177.47607, 1608.8673, 184.0008, 1470.4249, 1087.7255, 836.8813, 746.313]
2025-09-13 21:03:23,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [155.0, 209.0, 191.0, 34.0, 299.0, 36.0, 287.0, 209.0, 150.0, 136.0]
2025-09-13 21:03:23,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 9 minutes, 44 seconds)
2025-09-13 21:15:08,512 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:15:08,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:15:58,384 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 815.02875 ± 196.156
2025-09-13 21:15:58,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [712.5991, 1177.2039, 651.5313, 668.8499, 848.21484, 638.5523, 1069.9451, 557.30566, 835.1056, 990.9798]
2025-09-13 21:15:58,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [146.0, 243.0, 133.0, 128.0, 159.0, 136.0, 203.0, 118.0, 154.0, 185.0]
2025-09-13 21:15:58,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 53/100 (estimated time remaining: 9 hours, 58 minutes, 15 seconds)
2025-09-13 21:27:34,562 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:27:34,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:28:31,736 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 959.27283 ± 524.746
2025-09-13 21:28:31,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [128.21553, 1060.4987, 971.45905, 1151.4506, 1648.7266, 1982.556, 496.4727, 909.7285, 465.69577, 777.92584]
2025-09-13 21:28:31,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [25.0, 206.0, 191.0, 228.0, 323.0, 372.0, 113.0, 181.0, 93.0, 162.0]
2025-09-13 21:28:31,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 54/100 (estimated time remaining: 9 hours, 44 minutes, 22 seconds)
2025-09-13 21:40:11,491 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:40:11,499 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:41:16,157 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1085.67322 ± 424.519
2025-09-13 21:41:16,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [305.14502, 1209.4418, 1097.254, 1052.6057, 1643.0461, 1134.6992, 1864.6108, 616.8446, 901.7531, 1031.3314]
2025-09-13 21:41:16,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [59.0, 230.0, 200.0, 207.0, 322.0, 222.0, 364.0, 129.0, 169.0, 210.0]
2025-09-13 21:41:16,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1085.67) for latency ExtremeSparseL4U32
2025-09-13 21:41:16,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 55/100 (estimated time remaining: 9 hours, 33 minutes, 41 seconds)
2025-09-13 21:53:14,544 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 21:53:14,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 21:54:00,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 812.28815 ± 326.297
2025-09-13 21:54:00,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1180.0474, 991.0421, 493.40067, 508.0936, 140.24908, 939.68134, 793.94763, 934.32465, 863.3601, 1278.7349]
2025-09-13 21:54:00,254 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [214.0, 186.0, 88.0, 92.0, 27.0, 174.0, 146.0, 177.0, 166.0, 227.0]
2025-09-13 21:54:00,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 56/100 (estimated time remaining: 9 hours, 28 minutes, 32 seconds)
2025-09-13 22:05:29,080 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:05:29,089 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:06:34,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1105.57532 ± 456.241
2025-09-13 22:06:34,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [638.8803, 724.8816, 1091.9298, 1125.5674, 1353.3668, 1190.5006, 1376.5021, 682.20013, 668.2826, 2203.6423]
2025-09-13 22:06:34,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [118.0, 146.0, 209.0, 210.0, 271.0, 242.0, 267.0, 124.0, 124.0, 407.0]
2025-09-13 22:06:34,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1105.58) for latency ExtremeSparseL4U32
2025-09-13 22:06:34,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 15 minutes, 56 seconds)
2025-09-13 22:17:59,444 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:17:59,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:19:16,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1325.97754 ± 602.650
2025-09-13 22:19:16,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1802.144, 1381.505, 1204.9283, 924.447, 2223.248, 1210.8508, 2410.0947, 863.76715, 518.518, 720.2722]
2025-09-13 22:19:16,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [343.0, 264.0, 227.0, 187.0, 409.0, 229.0, 455.0, 164.0, 111.0, 133.0]
2025-09-13 22:19:16,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1325.98) for latency ExtremeSparseL4U32
2025-09-13 22:19:16,245 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 4 minutes, 21 seconds)
2025-09-13 22:30:54,867 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:30:54,875 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:32:01,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1167.53491 ± 709.521
2025-09-13 22:32:01,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [904.3141, 1164.8842, 1811.2621, 569.9975, 2502.1748, 1854.0253, 800.6791, 428.73975, 1533.245, 106.02625]
2025-09-13 22:32:01,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [165.0, 213.0, 335.0, 104.0, 472.0, 361.0, 153.0, 79.0, 303.0, 21.0]
2025-09-13 22:32:01,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 59/100 (estimated time remaining: 8 hours, 53 minutes, 21 seconds)
2025-09-13 22:43:47,143 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:43:47,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:44:56,476 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1204.03394 ± 488.274
2025-09-13 22:44:56,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1537.9524, 808.6476, 1130.526, 1516.6357, 662.787, 1515.4442, 683.50214, 623.15515, 2198.841, 1362.8474]
2025-09-13 22:44:56,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [306.0, 152.0, 219.0, 291.0, 122.0, 283.0, 128.0, 117.0, 426.0, 251.0]
2025-09-13 22:44:56,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 60/100 (estimated time remaining: 8 hours, 42 minutes, 6 seconds)
2025-09-13 22:56:37,227 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 22:56:37,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 22:57:55,413 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1337.26660 ± 620.567
2025-09-13 22:57:55,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [793.8573, 571.2229, 1544.2825, 1324.1329, 2426.3867, 1424.759, 2277.1824, 663.28406, 1586.6337, 760.92474]
2025-09-13 22:57:55,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [155.0, 108.0, 307.0, 248.0, 471.0, 268.0, 426.0, 126.0, 323.0, 151.0]
2025-09-13 22:57:55,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1337.27) for latency ExtremeSparseL4U32
2025-09-13 22:57:55,425 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 61/100 (estimated time remaining: 8 hours, 31 minutes, 21 seconds)
2025-09-13 23:09:20,182 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:09:20,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:10:39,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1347.31189 ± 1051.822
2025-09-13 23:10:39,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [95.625755, 416.9999, 3129.373, 908.40924, 2833.4023, 688.8403, 143.6984, 1660.8407, 1220.2488, 2375.6804]
2025-09-13 23:10:39,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [19.0, 76.0, 592.0, 184.0, 558.0, 140.0, 28.0, 324.0, 228.0, 454.0]
2025-09-13 23:10:39,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1347.31) for latency ExtremeSparseL4U32
2025-09-13 23:10:39,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 62/100 (estimated time remaining: 8 hours, 19 minutes, 58 seconds)
2025-09-13 23:22:10,480 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:22:10,488 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:23:12,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1089.26819 ± 720.803
2025-09-13 23:23:12,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [714.47217, 1277.7256, 644.0689, 1156.522, 920.6022, 1255.4965, 2942.7095, 108.35669, 538.50586, 1334.223]
2025-09-13 23:23:12,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [133.0, 232.0, 121.0, 213.0, 167.0, 235.0, 570.0, 21.0, 108.0, 260.0]
2025-09-13 23:23:12,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 63/100 (estimated time remaining: 8 hours, 5 minutes, 59 seconds)
2025-09-13 23:35:02,391 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:35:02,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:35:32,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 525.07825 ± 275.212
2025-09-13 23:35:32,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [580.4405, 102.57092, 690.8477, 156.86363, 125.01288, 716.3161, 908.2464, 733.61896, 683.31287, 553.5527]
2025-09-13 23:35:32,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [107.0, 20.0, 129.0, 31.0, 24.0, 131.0, 177.0, 134.0, 125.0, 117.0]
2025-09-13 23:35:32,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 64/100 (estimated time remaining: 7 hours, 50 minutes, 3 seconds)
2025-09-13 23:47:09,433 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:47:09,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-13 23:48:08,756 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 989.76056 ± 782.181
2025-09-13 23:48:08,756 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [688.50195, 413.21075, 94.60785, 1612.7471, 2325.167, 1341.7412, 2016.3866, 1165.475, 126.4048, 113.363045]
2025-09-13 23:48:08,756 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [142.0, 79.0, 19.0, 307.0, 450.0, 273.0, 400.0, 221.0, 24.0, 22.0]
2025-09-13 23:48:08,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 65/100 (estimated time remaining: 7 hours, 35 minutes, 4 seconds)
2025-09-13 23:59:42,257 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-13 23:59:42,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:01:17,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1681.06714 ± 1421.831
2025-09-14 00:01:17,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [874.4149, 5037.6807, 2114.524, 1816.0748, 96.412155, 604.21533, 3364.8965, 896.81445, 1162.9241, 842.7156]
2025-09-14 00:01:17,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [177.0, 959.0, 398.0, 343.0, 19.0, 122.0, 628.0, 159.0, 206.0, 147.0]
2025-09-14 00:01:17,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (1681.07) for latency ExtremeSparseL4U32
2025-09-14 00:01:17,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 66/100 (estimated time remaining: 7 hours, 23 minutes, 35 seconds)
2025-09-14 00:13:04,933 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:13:04,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:15:05,458 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2085.82764 ± 1128.364
2025-09-14 00:15:05,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2406.9392, 553.7426, 2422.6343, 3861.2485, 4049.5188, 1288.6259, 1375.8795, 1562.0974, 2504.9548, 832.63635]
2025-09-14 00:15:05,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [447.0, 113.0, 483.0, 734.0, 757.0, 248.0, 253.0, 294.0, 481.0, 165.0]
2025-09-14 00:15:05,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2085.83) for latency ExtremeSparseL4U32
2025-09-14 00:15:05,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 67/100 (estimated time remaining: 7 hours, 18 minutes, 5 seconds)
2025-09-14 00:26:40,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:26:40,988 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:29:21,335 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2810.02124 ± 1278.307
2025-09-14 00:29:21,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2982.6711, 4833.7656, 2848.9663, 782.6872, 2282.8477, 2344.49, 1595.5542, 2567.3962, 2596.285, 5265.5493]
2025-09-14 00:29:21,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [560.0, 918.0, 544.0, 147.0, 424.0, 436.0, 299.0, 476.0, 491.0, 1000.0]
2025-09-14 00:29:21,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (2810.02) for latency ExtremeSparseL4U32
2025-09-14 00:29:21,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 68/100 (estimated time remaining: 7 hours, 16 minutes, 31 seconds)
2025-09-14 00:41:17,182 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:41:17,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:42:23,830 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1125.89575 ± 686.594
2025-09-14 00:42:23,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [408.26385, 2036.7211, 2167.06, 1098.5345, 2045.6548, 959.491, 1081.8612, 147.77863, 735.3293, 578.26294]
2025-09-14 00:42:23,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [83.0, 380.0, 423.0, 219.0, 388.0, 199.0, 217.0, 29.0, 150.0, 113.0]
2025-09-14 00:42:23,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 69/100 (estimated time remaining: 7 hours, 7 minutes, 51 seconds)
2025-09-14 00:53:50,621 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 00:53:50,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 00:55:19,367 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1575.10022 ± 948.202
2025-09-14 00:55:19,384 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [4220.714, 1179.6846, 1397.3503, 1415.265, 1880.2444, 1853.3489, 1033.7227, 1058.756, 773.7117, 938.2038]
2025-09-14 00:55:19,384 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [787.0, 213.0, 257.0, 265.0, 355.0, 359.0, 192.0, 196.0, 142.0, 175.0]
2025-09-14 00:55:19,422 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 70/100 (estimated time remaining: 6 hours, 56 minutes, 30 seconds)
2025-09-14 01:07:00,478 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:07:00,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:08:34,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1648.80151 ± 452.206
2025-09-14 01:08:34,532 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1244.6115, 1640.6255, 1908.2013, 725.9996, 1286.8705, 1920.8466, 1455.7104, 2193.5652, 2270.5925, 1840.9935]
2025-09-14 01:08:34,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [230.0, 307.0, 365.0, 153.0, 239.0, 361.0, 271.0, 414.0, 423.0, 336.0]
2025-09-14 01:08:34,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 71/100 (estimated time remaining: 6 hours, 43 minutes, 41 seconds)
2025-09-14 01:20:00,175 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:20:00,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:21:14,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1253.26587 ± 1041.236
2025-09-14 01:21:14,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1978.0002, 176.03358, 2510.9375, 609.0721, 3484.1338, 553.7516, 973.83154, 131.31065, 1431.9261, 683.6622]
2025-09-14 01:21:14,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [379.0, 34.0, 483.0, 127.0, 682.0, 122.0, 187.0, 26.0, 276.0, 140.0]
2025-09-14 01:21:14,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 23 minutes, 42 seconds)
2025-09-14 01:32:59,532 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:32:59,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:35:08,229 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2290.46777 ± 1511.012
2025-09-14 01:35:08,230 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5317.381, 1668.4578, 2198.509, 1534.9285, 4351.209, 2499.8616, 178.70894, 1449.6176, 656.47473, 3049.5317]
2025-09-14 01:35:08,230 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [974.0, 303.0, 399.0, 297.0, 813.0, 465.0, 35.0, 261.0, 131.0, 570.0]
2025-09-14 01:35:08,245 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 73/100 (estimated time remaining: 6 hours, 8 minutes, 22 seconds)
2025-09-14 01:47:08,565 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:47:08,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 01:48:16,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1194.49243 ± 577.335
2025-09-14 01:48:16,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1102.6443, 858.6854, 1307.582, 2696.6895, 1057.1545, 1089.0817, 1650.8381, 694.0972, 588.8333, 899.3183]
2025-09-14 01:48:16,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [200.0, 158.0, 237.0, 529.0, 195.0, 190.0, 311.0, 146.0, 111.0, 167.0]
2025-09-14 01:48:16,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 74/100 (estimated time remaining: 5 hours, 55 minutes, 46 seconds)
2025-09-14 01:59:34,314 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 01:59:34,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:01:19,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1786.06934 ± 1224.802
2025-09-14 02:01:19,686 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2272.9736, 102.92618, 1181.2239, 2183.7754, 4018.088, 1979.0828, 146.82779, 1995.818, 653.21643, 3326.7605]
2025-09-14 02:01:19,686 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [422.0, 20.0, 224.0, 419.0, 755.0, 401.0, 28.0, 375.0, 136.0, 645.0]
2025-09-14 02:01:19,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 75/100 (estimated time remaining: 5 hours, 43 minutes, 13 seconds)
2025-09-14 02:12:45,817 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:12:45,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:14:45,395 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2091.36230 ± 980.529
2025-09-14 02:14:45,396 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1796.7815, 1435.1957, 2064.5334, 1319.8082, 2089.7566, 663.9928, 3917.9468, 3834.1865, 1977.0568, 1814.3651]
2025-09-14 02:14:45,396 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [325.0, 259.0, 381.0, 259.0, 384.0, 143.0, 718.0, 725.0, 361.0, 339.0]
2025-09-14 02:14:45,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 76/100 (estimated time remaining: 5 hours, 30 minutes, 54 seconds)
2025-09-14 02:27:05,660 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:27:05,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:28:56,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1930.58374 ± 683.929
2025-09-14 02:28:56,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1439.6771, 1354.4819, 3502.3262, 1806.9026, 1199.9882, 2424.0835, 1977.0945, 2236.9727, 2239.9387, 1124.3705]
2025-09-14 02:28:56,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [265.0, 260.0, 657.0, 327.0, 222.0, 448.0, 386.0, 415.0, 432.0, 210.0]
2025-09-14 02:28:56,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 77/100 (estimated time remaining: 5 hours, 24 minutes, 54 seconds)
2025-09-14 02:39:49,828 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:39:49,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:42:45,996 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 3092.69995 ± 1752.576
2025-09-14 02:42:45,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2212.1216, 418.99496, 5259.8384, 5378.1006, 4535.6807, 4233.225, 4420.8535, 1398.6157, 1519.7628, 1549.8059]
2025-09-14 02:42:45,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [413.0, 75.0, 1000.0, 1000.0, 840.0, 812.0, 806.0, 284.0, 281.0, 287.0]
2025-09-14 02:42:45,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1226 [INFO]: New best (3092.70) for latency ExtremeSparseL4U32
2025-09-14 02:42:46,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 11 minutes, 5 seconds)
2025-09-14 02:55:15,698 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 02:55:15,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 02:56:51,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1641.90991 ± 1275.950
2025-09-14 02:56:51,343 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1008.92944, 378.5391, 4213.5063, 1810.4769, 102.74915, 363.2695, 917.2852, 1938.9688, 3108.3787, 2576.9966]
2025-09-14 02:56:51,343 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [201.0, 78.0, 787.0, 347.0, 20.0, 68.0, 184.0, 381.0, 599.0, 485.0]
2025-09-14 02:56:51,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 1 minute, 43 seconds)
2025-09-14 03:08:18,390 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:08:18,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:10:25,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2242.68750 ± 1515.136
2025-09-14 03:10:25,532 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1136.9493, 745.9857, 2930.488, 1168.5527, 817.79285, 1053.2545, 4149.9116, 5347.2764, 1841.5874, 3235.077]
2025-09-14 03:10:25,532 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [207.0, 149.0, 559.0, 214.0, 147.0, 197.0, 762.0, 1000.0, 333.0, 589.0]
2025-09-14 03:10:25,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 80/100 (estimated time remaining: 4 hours, 50 minutes, 12 seconds)
2025-09-14 03:22:26,756 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:22:26,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:24:03,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1644.51343 ± 1483.438
2025-09-14 03:24:03,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1828.4188, 1126.8666, 306.4788, 113.87216, 537.3701, 5400.55, 881.95087, 1563.5784, 2923.306, 1762.7439]
2025-09-14 03:24:03,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [360.0, 212.0, 60.0, 22.0, 95.0, 1000.0, 177.0, 295.0, 573.0, 317.0]
2025-09-14 03:24:03,206 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 81/100 (estimated time remaining: 4 hours, 37 minutes, 11 seconds)
2025-09-14 03:35:14,358 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:35:14,365 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:37:20,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2155.49072 ± 1504.614
2025-09-14 03:37:20,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [956.55975, 454.45404, 3795.469, 1287.5731, 1599.804, 1207.9452, 5170.8057, 3638.018, 2720.299, 723.9797]
2025-09-14 03:37:20,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [172.0, 91.0, 709.0, 249.0, 303.0, 230.0, 1000.0, 673.0, 515.0, 134.0]
2025-09-14 03:37:20,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 19 minutes, 57 seconds)
2025-09-14 03:48:42,592 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 03:48:42,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 03:51:08,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2463.15967 ± 1084.920
2025-09-14 03:51:08,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1328.7317, 2606.5518, 2941.261, 3373.9314, 1982.8905, 1227.1119, 2554.919, 3225.7427, 831.2533, 4559.201]
2025-09-14 03:51:08,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [269.0, 486.0, 570.0, 632.0, 398.0, 239.0, 483.0, 621.0, 154.0, 867.0]
2025-09-14 03:51:08,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 6 minutes, 7 seconds)
2025-09-14 04:03:01,808 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:03:01,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:05:08,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2229.74585 ± 1713.968
2025-09-14 04:05:08,698 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3187.536, 875.3004, 2512.3655, 1659.885, 1801.8213, 4860.6704, 5400.784, 1739.499, 123.30617, 136.28818]
2025-09-14 04:05:08,698 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [609.0, 178.0, 461.0, 306.0, 346.0, 899.0, 1000.0, 315.0, 24.0, 26.0]
2025-09-14 04:05:08,712 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 84/100 (estimated time remaining: 3 hours, 52 minutes, 11 seconds)
2025-09-14 04:16:32,602 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:16:32,611 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:18:24,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1975.54163 ± 1491.082
2025-09-14 04:18:24,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5411.6885, 3326.7932, 1063.1049, 1905.4779, 95.95402, 157.45215, 1536.296, 1746.03, 2766.816, 1745.8026]
2025-09-14 04:18:24,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 625.0, 192.0, 352.0, 19.0, 31.0, 283.0, 321.0, 516.0, 320.0]
2025-09-14 04:18:25,002 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 85/100 (estimated time remaining: 3 hours, 37 minutes, 34 seconds)
2025-09-14 04:29:53,325 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:29:53,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:32:05,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2251.03149 ± 1638.335
2025-09-14 04:32:05,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3841.0571, 466.06482, 353.0301, 1682.3862, 128.26031, 2323.9004, 2911.6462, 3897.536, 5276.9487, 1629.4843]
2025-09-14 04:32:05,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [752.0, 94.0, 69.0, 317.0, 25.0, 437.0, 548.0, 747.0, 1000.0, 310.0]
2025-09-14 04:32:05,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 24 minutes, 6 seconds)
2025-09-14 04:44:19,297 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:44:19,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:46:14,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1976.27051 ± 1630.277
2025-09-14 04:46:14,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [108.31461, 5353.695, 1862.1693, 1836.5913, 964.4989, 4536.71, 1348.2617, 2147.3174, 135.142, 1470.0046]
2025-09-14 04:46:14,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [21.0, 1000.0, 337.0, 352.0, 196.0, 850.0, 248.0, 404.0, 26.0, 271.0]
2025-09-14 04:46:14,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 12 minutes, 53 seconds)
2025-09-14 04:57:29,422 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 04:57:29,440 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 04:59:29,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2100.49121 ± 1329.715
2025-09-14 04:59:29,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [911.19666, 4669.551, 766.41235, 1623.8645, 113.3535, 2433.593, 1671.3779, 3279.7295, 3521.1335, 2014.7006]
2025-09-14 04:59:29,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [173.0, 882.0, 142.0, 299.0, 22.0, 459.0, 311.0, 635.0, 648.0, 375.0]
2025-09-14 04:59:29,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 88/100 (estimated time remaining: 2 hours, 57 minutes, 42 seconds)
2025-09-14 05:11:25,556 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 05:11:25,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 05:12:56,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1584.43774 ± 579.267
2025-09-14 05:12:56,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [2657.1511, 2140.1455, 2013.3593, 632.4714, 1118.4789, 1920.2498, 1047.2965, 1718.6522, 1351.0895, 1245.4833]
2025-09-14 05:12:56,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [496.0, 388.0, 374.0, 134.0, 223.0, 348.0, 187.0, 340.0, 241.0, 226.0]
2025-09-14 05:12:56,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 89/100 (estimated time remaining: 2 hours, 42 minutes, 43 seconds)
2025-09-14 05:24:08,063 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 05:24:08,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 05:24:49,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 696.65216 ± 545.043
2025-09-14 05:24:49,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [890.1714, 765.40173, 1490.5328, 1351.8229, 386.45276, 1466.9789, 162.97232, 138.65755, 144.73856, 168.79231]
2025-09-14 05:24:49,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [182.0, 162.0, 285.0, 272.0, 81.0, 276.0, 32.0, 27.0, 28.0, 33.0]
2025-09-14 05:24:49,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 26 minutes, 6 seconds)
2025-09-14 05:36:37,830 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 05:36:37,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 05:37:29,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 939.49182 ± 783.169
2025-09-14 05:37:29,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1479.1094, 717.3081, 968.8431, 774.1588, 554.4251, 1045.1301, 132.48193, 246.23308, 3002.7896, 474.43964]
2025-09-14 05:37:29,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [261.0, 129.0, 170.0, 146.0, 100.0, 186.0, 26.0, 47.0, 538.0, 89.0]
2025-09-14 05:37:29,611 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 10 minutes, 48 seconds)
2025-09-14 05:49:36,828 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 05:49:36,837 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 05:50:58,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1489.95044 ± 746.836
2025-09-14 05:50:58,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [670.59216, 1491.0841, 3090.132, 1260.5504, 2008.1487, 2150.759, 765.1114, 1835.3567, 693.33716, 934.4332]
2025-09-14 05:50:58,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [128.0, 267.0, 570.0, 224.0, 359.0, 386.0, 136.0, 327.0, 122.0, 172.0]
2025-09-14 05:50:58,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 92/100 (estimated time remaining: 1 hour, 56 minutes, 31 seconds)
2025-09-14 06:02:39,759 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 06:02:39,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 06:04:19,447 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1698.96130 ± 1450.211
2025-09-14 06:04:19,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [960.3448, 2061.8335, 498.54123, 125.103714, 1272.9376, 1940.4972, 5168.3354, 3008.6792, 1813.5349, 139.80719]
2025-09-14 06:04:19,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [182.0, 404.0, 90.0, 24.0, 240.0, 366.0, 1000.0, 580.0, 335.0, 27.0]
2025-09-14 06:04:19,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 43 minutes, 44 seconds)
2025-09-14 06:15:18,985 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 06:15:19,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 06:16:47,929 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1561.54443 ± 1669.058
2025-09-14 06:16:47,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [146.04887, 122.595375, 112.86165, 126.53074, 2869.2239, 2218.1445, 1031.1982, 5344.5386, 648.24664, 2996.0544]
2025-09-14 06:16:47,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [28.0, 24.0, 22.0, 25.0, 522.0, 412.0, 198.0, 1000.0, 133.0, 556.0]
2025-09-14 06:16:47,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 29 minutes, 23 seconds)
2025-09-14 06:28:42,017 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 06:28:42,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 06:30:32,898 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1960.40649 ± 1586.464
2025-09-14 06:30:32,899 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3516.5469, 639.9682, 139.89565, 118.81076, 4167.5073, 3744.1033, 3489.0093, 471.88913, 789.27594, 2527.059]
2025-09-14 06:30:32,899 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [645.0, 116.0, 27.0, 23.0, 781.0, 671.0, 636.0, 88.0, 165.0, 465.0]
2025-09-14 06:30:32,910 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 18 minutes, 51 seconds)
2025-09-14 06:42:23,553 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 06:42:23,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 06:44:03,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1772.00452 ± 1427.970
2025-09-14 06:44:03,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [140.31221, 1118.4933, 1075.944, 2307.3953, 767.6223, 1459.1647, 3028.1003, 1657.7815, 802.54205, 5362.688]
2025-09-14 06:44:03,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [27.0, 210.0, 206.0, 423.0, 161.0, 261.0, 560.0, 302.0, 156.0, 1000.0]
2025-09-14 06:44:03,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 6 minutes, 34 seconds)
2025-09-14 06:55:08,858 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 06:55:08,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 06:57:46,262 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2711.20923 ± 1810.445
2025-09-14 06:57:46,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1453.4277, 3870.796, 4484.344, 920.5734, 1386.7433, 3812.6704, 360.07053, 4566.9766, 5496.058, 760.431]
2025-09-14 06:57:46,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [262.0, 731.0, 850.0, 175.0, 254.0, 705.0, 70.0, 856.0, 1000.0, 136.0]
2025-09-14 06:57:46,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 97/100 (estimated time remaining: 53 minutes, 26 seconds)
2025-09-14 07:09:37,651 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 07:09:37,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 07:11:07,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1527.98584 ± 1963.869
2025-09-14 07:11:07,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5410.7495, 5266.609, 762.7617, 133.47353, 153.16324, 139.87885, 516.8587, 979.1548, 1731.7302, 185.47821]
2025-09-14 07:11:07,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 145.0, 26.0, 30.0, 27.0, 105.0, 184.0, 356.0, 36.0]
2025-09-14 07:11:07,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 98/100 (estimated time remaining: 40 minutes, 4 seconds)
2025-09-14 07:22:39,273 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 07:22:39,281 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 07:24:24,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1825.70374 ± 1498.957
2025-09-14 07:24:24,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [3720.395, 590.20905, 474.04312, 914.7626, 497.8216, 997.4003, 2535.8628, 1737.5398, 1548.9784, 5240.025]
2025-09-14 07:24:24,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [697.0, 106.0, 95.0, 172.0, 109.0, 208.0, 487.0, 317.0, 285.0, 1000.0]
2025-09-14 07:24:24,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 99/100 (estimated time remaining: 27 minutes, 2 seconds)
2025-09-14 07:36:10,463 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 07:36:10,465 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 07:38:17,304 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 2229.42578 ± 1830.284
2025-09-14 07:38:17,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [5432.574, 667.5584, 2707.654, 1180.2426, 1098.632, 140.39197, 492.5032, 5445.4595, 2215.9858, 2913.2551]
2025-09-14 07:38:17,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [1000.0, 132.0, 497.0, 225.0, 210.0, 27.0, 104.0, 1000.0, 401.0, 556.0]
2025-09-14 07:38:17,317 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1199 [INFO]: Iteration 100/100 (estimated time remaining: 13 minutes, 32 seconds)
2025-09-14 07:49:38,283 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-14 07:49:38,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1214 [DEBUG]: Evaluating for latency ExtremeSparseL4U32...
2025-09-14 07:51:02,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1221 [DEBUG]: Total Reward: 1501.98401 ± 946.361
2025-09-14 07:51:02,928 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1222 [DEBUG]: All rewards: [1822.8096, 916.3327, 1060.4646, 794.8979, 3889.8271, 139.81853, 1583.8362, 1958.6212, 1301.5381, 1551.6936]
2025-09-14 07:51:02,928 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1223 [DEBUG]: All trajectory lengths: [337.0, 164.0, 186.0, 160.0, 716.0, 27.0, 306.0, 365.0, 241.0, 269.0]
2025-09-14 07:51:02,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-humanoid):1251 [DEBUG]: Training session finished
