2025-09-11 18:57:36,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc15-hopper/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 18:57:36,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc15-hopper/ExtremeClogL1U23-mbpac_memdelay
2025-09-11 18:57:36,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x1481758dc050>}
2025-09-11 18:57:36,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1111 [DEBUG]: using device: cuda
2025-09-11 18:57:36,076 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1133 [INFO]: Creating new trainer
2025-09-11 18:57:36,081 baseline-mbpac-noiseperc15-hopper:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=384, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=3, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(3,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=3, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(3,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2.]]), shift: tensor([[-1., -1., -1.]]))
)
2025-09-11 18:57:36,081 baseline-mbpac-noiseperc15-hopper:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=14, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:57:36,089 baseline-mbpac-noiseperc15-hopper:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=384, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=11, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=11, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=11, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=384, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=3, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 384, batch_first=True)
)
2025-09-11 18:57:37,065 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1194 [DEBUG]: Starting training session...
2025-09-11 18:57:37,065 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 1/100
2025-09-11 19:07:52,792 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:07:52,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:08:10,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 114.64954 ± 18.320
2025-09-11 19:08:10,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [102.773865, 107.2215, 94.46676, 107.13654, 109.17994, 118.85369, 126.197624, 108.37919, 108.285126, 164.00107]
2025-09-11 19:08:10,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [58.0, 61.0, 55.0, 59.0, 64.0, 64.0, 73.0, 59.0, 64.0, 88.0]
2025-09-11 19:08:10,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (114.65) for latency ExtremeClogL1U23
2025-09-11 19:08:10,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 2/100 (estimated time remaining: 17 hours, 24 minutes, 53 seconds)
2025-09-11 19:19:49,058 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:19:49,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:20:17,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 143.66843 ± 103.492
2025-09-11 19:20:17,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [35.632027, 13.960106, 16.411404, 138.85039, 179.73048, 201.0074, 276.2601, 255.98424, 42.992672, 275.85547]
2025-09-11 19:20:17,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [32.0, 17.0, 17.0, 108.0, 133.0, 128.0, 186.0, 164.0, 41.0, 199.0]
2025-09-11 19:20:17,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (143.67) for latency ExtremeClogL1U23
2025-09-11 19:20:17,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 3/100 (estimated time remaining: 18 hours, 30 minutes, 58 seconds)
2025-09-11 19:32:03,290 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:32:03,292 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:32:36,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 203.49667 ± 152.938
2025-09-11 19:32:36,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [197.43027, 619.2002, 118.99521, 121.62, 186.16173, 12.092843, 235.33556, 211.11052, 109.64627, 223.37413]
2025-09-11 19:32:36,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [110.0, 403.0, 69.0, 81.0, 123.0, 15.0, 116.0, 117.0, 63.0, 124.0]
2025-09-11 19:32:36,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (203.50) for latency ExtremeClogL1U23
2025-09-11 19:32:36,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 4/100 (estimated time remaining: 18 hours, 51 minutes, 34 seconds)
2025-09-11 19:44:16,044 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:44:16,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:45:05,670 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 274.51532 ± 163.614
2025-09-11 19:45:05,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [238.22575, 12.820136, 435.3895, 486.8408, 311.60507, 148.44453, 311.83475, 15.353566, 474.01886, 310.6203]
2025-09-11 19:45:05,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [210.0, 22.0, 229.0, 308.0, 157.0, 99.0, 216.0, 23.0, 402.0, 148.0]
2025-09-11 19:45:05,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (274.52) for latency ExtremeClogL1U23
2025-09-11 19:45:05,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 5/100 (estimated time remaining: 18 hours, 59 minutes, 26 seconds)
2025-09-11 19:56:42,459 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:56:42,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:57:30,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 279.97714 ± 174.579
2025-09-11 19:57:30,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [312.76984, 344.3805, 342.76926, 547.1895, 538.80646, 297.91003, 72.98076, 13.180786, 74.394264, 255.39]
2025-09-11 19:57:30,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [151.0, 216.0, 239.0, 361.0, 373.0, 154.0, 42.0, 20.0, 45.0, 158.0]
2025-09-11 19:57:30,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (279.98) for latency ExtremeClogL1U23
2025-09-11 19:57:30,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 6/100 (estimated time remaining: 18 hours, 57 minutes, 59 seconds)
2025-09-11 20:09:02,799 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:09:02,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:09:40,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 253.69893 ± 154.644
2025-09-11 20:09:40,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [368.69183, 34.251534, 67.80949, 35.489887, 346.0636, 307.80743, 466.50595, 430.2093, 179.56604, 300.5943]
2025-09-11 20:09:40,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [175.0, 40.0, 43.0, 43.0, 171.0, 132.0, 278.0, 247.0, 85.0, 125.0]
2025-09-11 20:09:40,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 7/100 (estimated time remaining: 19 hours, 16 minutes, 6 seconds)
2025-09-11 20:21:20,223 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:21:20,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:22:01,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 322.04752 ± 180.248
2025-09-11 20:22:01,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [61.610962, 431.92358, 172.03874, 679.82526, 333.47116, 358.2717, 89.604706, 416.39124, 460.23892, 217.09882]
2025-09-11 20:22:01,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [37.0, 210.0, 107.0, 368.0, 143.0, 145.0, 58.0, 167.0, 182.0, 100.0]
2025-09-11 20:22:01,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (322.05) for latency ExtremeClogL1U23
2025-09-11 20:22:01,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 8/100 (estimated time remaining: 19 hours, 8 minutes, 18 seconds)
2025-09-11 20:33:49,901 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:33:49,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:34:35,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 311.47345 ± 212.303
2025-09-11 20:34:35,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [435.13416, 625.311, 14.518601, 419.1363, 618.0291, 190.16093, 226.32785, 164.26619, 410.08936, 11.760896]
2025-09-11 20:34:35,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [243.0, 278.0, 22.0, 236.0, 293.0, 154.0, 152.0, 100.0, 165.0, 16.0]
2025-09-11 20:34:35,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 9/100 (estimated time remaining: 19 hours, 17 seconds)
2025-09-11 20:46:06,470 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:46:06,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:46:51,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 428.52017 ± 220.389
2025-09-11 20:46:51,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [571.1256, 594.9139, 575.7506, 795.20514, 122.7137, 305.20892, 575.6328, 425.00366, 101.58334, 218.06416]
2025-09-11 20:46:51,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [191.0, 207.0, 234.0, 263.0, 70.0, 128.0, 199.0, 158.0, 66.0, 103.0]
2025-09-11 20:46:51,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (428.52) for latency ExtremeClogL1U23
2025-09-11 20:46:51,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 10/100 (estimated time remaining: 18 hours, 43 minutes, 58 seconds)
2025-09-11 20:58:20,902 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:58:20,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:59:04,762 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 400.84482 ± 172.980
2025-09-11 20:59:04,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [577.55035, 187.64958, 589.5444, 491.85025, 538.56024, 401.18686, 516.1087, 442.22107, 134.62169, 129.15526]
2025-09-11 20:59:04,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [198.0, 94.0, 225.0, 173.0, 205.0, 171.0, 186.0, 174.0, 79.0, 74.0]
2025-09-11 20:59:04,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 11/100 (estimated time remaining: 18 hours, 28 minutes, 12 seconds)
2025-09-11 21:11:01,480 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:11:01,482 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:12:03,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 580.89001 ± 374.044
2025-09-11 21:12:03,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [359.08316, 237.88814, 1092.4117, 1114.8079, 848.8974, 246.89545, 783.6503, 131.79442, 146.61295, 846.8587]
2025-09-11 21:12:03,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [144.0, 127.0, 381.0, 413.0, 281.0, 132.0, 282.0, 70.0, 95.0, 325.0]
2025-09-11 21:12:03,241 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (580.89) for latency ExtremeClogL1U23
2025-09-11 21:12:03,255 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 12/100 (estimated time remaining: 18 hours, 30 minutes, 29 seconds)
2025-09-11 21:23:25,250 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:23:25,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:24:27,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 532.07025 ± 233.419
2025-09-11 21:24:27,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [320.9846, 332.32578, 500.90475, 61.29913, 852.2319, 518.72314, 558.2503, 776.58905, 795.7202, 603.67377]
2025-09-11 21:24:27,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [149.0, 144.0, 229.0, 43.0, 316.0, 230.0, 224.0, 295.0, 347.0, 267.0]
2025-09-11 21:24:27,066 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 13/100 (estimated time remaining: 18 hours, 18 minutes, 38 seconds)
2025-09-11 21:36:10,022 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:36:10,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:37:03,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 528.13477 ± 165.784
2025-09-11 21:37:03,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [417.68204, 638.8735, 205.29556, 516.2225, 667.29266, 259.96088, 663.4337, 644.47217, 619.0345, 649.08026]
2025-09-11 21:37:03,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [181.0, 217.0, 93.0, 229.0, 216.0, 110.0, 213.0, 215.0, 208.0, 253.0]
2025-09-11 21:37:03,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 14/100 (estimated time remaining: 18 hours, 6 minutes, 54 seconds)
2025-09-11 21:48:50,225 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:48:50,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:49:58,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 673.89087 ± 362.406
2025-09-11 21:49:58,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [684.9785, 301.21414, 650.7859, 630.8353, 978.42004, 89.897705, 633.6879, 1389.4778, 1024.477, 355.1345]
2025-09-11 21:49:58,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [240.0, 126.0, 206.0, 205.0, 315.0, 52.0, 243.0, 542.0, 365.0, 157.0]
2025-09-11 21:49:58,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (673.89) for latency ExtremeClogL1U23
2025-09-11 21:49:58,388 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 15/100 (estimated time remaining: 18 hours, 5 minutes, 41 seconds)
2025-09-11 22:01:32,203 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:01:32,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:02:08,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 303.92944 ± 354.411
2025-09-11 22:02:08,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1176.0895, 385.11697, 11.844178, 240.76494, 152.7166, 21.355953, 12.61854, 26.613636, 342.8427, 669.33167]
2025-09-11 22:02:08,248 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [501.0, 147.0, 15.0, 115.0, 75.0, 21.0, 15.0, 29.0, 139.0, 236.0]
2025-09-11 22:02:08,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 16/100 (estimated time remaining: 17 hours, 51 minutes, 59 seconds)
2025-09-11 22:14:18,665 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:14:18,667 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:15:09,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 482.20303 ± 384.731
2025-09-11 22:15:09,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [19.001923, 376.78705, 253.59863, 757.0735, 562.8284, 428.22882, 216.36658, 452.2222, 272.7891, 1483.1345]
2025-09-11 22:15:09,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [21.0, 171.0, 113.0, 284.0, 236.0, 165.0, 100.0, 191.0, 119.0, 516.0]
2025-09-11 22:15:09,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 17/100 (estimated time remaining: 17 hours, 40 minutes, 5 seconds)
2025-09-11 22:25:37,857 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:25:37,860 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:27:21,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1009.79102 ± 776.920
2025-09-11 22:27:21,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [723.8713, 351.43054, 705.43835, 427.1712, 593.0475, 363.89377, 2493.1042, 1185.8411, 759.67975, 2494.4321]
2025-09-11 22:27:21,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [323.0, 149.0, 221.0, 182.0, 214.0, 150.0, 958.0, 426.0, 261.0, 987.0]
2025-09-11 22:27:21,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (1009.79) for latency ExtremeClogL1U23
2025-09-11 22:27:21,872 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 18/100 (estimated time remaining: 17 hours, 24 minutes, 21 seconds)
2025-09-11 22:38:43,149 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:38:43,152 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:39:32,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 436.71005 ± 390.178
2025-09-11 22:39:32,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [319.08383, 135.04803, 419.23965, 14.484143, 33.67826, 1380.7072, 462.37927, 756.90436, 220.5018, 625.0742]
2025-09-11 22:39:32,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [144.0, 84.0, 186.0, 16.0, 39.0, 496.0, 199.0, 323.0, 108.0, 264.0]
2025-09-11 22:39:32,421 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 19/100 (estimated time remaining: 17 hours, 4 minutes, 47 seconds)
2025-09-11 22:50:28,839 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:50:28,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:52:04,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 932.67267 ± 421.360
2025-09-11 22:52:04,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1516.1229, 1149.3433, 1372.6582, 1034.1785, 838.8303, 1214.4844, 207.6859, 824.3671, 186.89632, 982.1598]
2025-09-11 22:52:04,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [572.0, 423.0, 500.0, 449.0, 308.0, 437.0, 98.0, 335.0, 92.0, 355.0]
2025-09-11 22:52:04,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 20/100 (estimated time remaining: 16 hours, 46 minutes, 6 seconds)
2025-09-11 23:03:26,644 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:03:26,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:04:19,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 539.54260 ± 342.396
2025-09-11 23:04:19,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [450.9934, 127.09011, 897.72723, 772.7644, 865.965, 794.5649, 12.095602, 814.92017, 644.2762, 15.028434]
2025-09-11 23:04:19,516 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [186.0, 64.0, 319.0, 253.0, 304.0, 321.0, 16.0, 266.0, 259.0, 19.0]
2025-09-11 23:04:19,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 35 minutes)
2025-09-11 23:15:16,398 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:15:16,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:16:03,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 404.46307 ± 425.212
2025-09-11 23:16:03,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [310.27582, 75.00942, 693.28503, 249.58379, 78.30047, 683.24426, 65.66524, 222.0376, 169.26128, 1497.9681]
2025-09-11 23:16:03,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 73.0, 298.0, 122.0, 47.0, 296.0, 56.0, 100.0, 91.0, 558.0]
2025-09-11 23:16:03,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 22/100 (estimated time remaining: 16 hours, 2 minutes, 24 seconds)
2025-09-11 23:27:26,289 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:27:26,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:28:43,660 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 775.85333 ± 373.728
2025-09-11 23:28:43,661 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1509.9633, 907.4547, 813.77844, 908.8578, 423.90082, 847.19714, 35.75683, 866.6663, 466.83047, 978.1273]
2025-09-11 23:28:43,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [581.0, 325.0, 297.0, 283.0, 186.0, 278.0, 28.0, 354.0, 193.0, 340.0]
2025-09-11 23:28:43,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 23/100 (estimated time remaining: 15 hours, 57 minutes, 16 seconds)
2025-09-11 23:40:12,701 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:40:12,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:41:13,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 587.61090 ± 475.816
2025-09-11 23:41:13,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [212.10074, 1045.7717, 1654.6119, 152.39867, 162.72478, 720.50836, 429.92862, 928.9452, 460.54483, 108.573685]
2025-09-11 23:41:13,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [99.0, 377.0, 586.0, 76.0, 82.0, 232.0, 175.0, 372.0, 184.0, 59.0]
2025-09-11 23:41:13,290 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 24/100 (estimated time remaining: 15 hours, 49 minutes, 53 seconds)
2025-09-11 23:51:56,426 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:51:56,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:52:43,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 433.58228 ± 368.345
2025-09-11 23:52:43,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [749.5507, 15.901344, 392.3195, 1077.4056, 55.758785, 688.44006, 62.096058, 284.38046, 884.53503, 125.43485]
2025-09-11 23:52:43,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [302.0, 17.0, 169.0, 425.0, 38.0, 256.0, 37.0, 129.0, 343.0, 64.0]
2025-09-11 23:52:43,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 25/100 (estimated time remaining: 15 hours, 21 minutes, 55 seconds)
2025-09-12 00:03:56,976 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:03:56,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:05:17,868 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 775.46991 ± 709.515
2025-09-12 00:05:17,869 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [501.60934, 896.2715, 1906.2816, 12.687618, 68.79391, 1856.4271, 233.20901, 448.09323, 1614.0764, 217.24884]
2025-09-12 00:05:17,869 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [201.0, 311.0, 692.0, 15.0, 42.0, 655.0, 115.0, 199.0, 637.0, 119.0]
2025-09-12 00:05:17,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 26/100 (estimated time remaining: 15 hours, 14 minutes, 34 seconds)
2025-09-12 00:16:51,680 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:16:51,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:17:55,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 636.73383 ± 496.795
2025-09-12 00:17:55,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [147.08893, 636.11127, 1406.987, 743.16016, 1510.9283, 775.4922, 758.37604, 49.55479, 20.452662, 319.18716]
2025-09-12 00:17:55,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [77.0, 272.0, 502.0, 260.0, 539.0, 264.0, 294.0, 30.0, 22.0, 140.0]
2025-09-12 00:17:55,657 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 27/100 (estimated time remaining: 15 hours, 15 minutes, 32 seconds)
2025-09-12 00:28:43,586 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:28:43,594 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:29:53,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 721.00525 ± 274.654
2025-09-12 00:29:53,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [704.93286, 725.85583, 444.2603, 878.50433, 545.2414, 819.96173, 1217.7307, 952.7305, 163.98828, 756.8464]
2025-09-12 00:29:53,627 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [287.0, 270.0, 185.0, 294.0, 209.0, 330.0, 406.0, 297.0, 83.0, 279.0]
2025-09-12 00:29:53,636 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 28/100 (estimated time remaining: 14 hours, 53 minutes, 1 second)
2025-09-12 00:41:02,972 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:41:02,980 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:42:23,474 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 785.41156 ± 552.527
2025-09-12 00:42:23,475 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1255.2898, 183.61502, 261.21094, 972.88745, 1491.5533, 12.91498, 90.02857, 1205.9707, 995.0524, 1385.5927]
2025-09-12 00:42:23,475 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [435.0, 115.0, 115.0, 398.0, 563.0, 16.0, 80.0, 451.0, 317.0, 524.0]
2025-09-12 00:42:23,484 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 29/100 (estimated time remaining: 14 hours, 40 minutes, 50 seconds)
2025-09-12 00:53:52,443 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:53:52,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:54:58,425 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 631.00012 ± 496.209
2025-09-12 00:54:58,425 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1518.0868, 969.2056, 78.38598, 769.7299, 671.39056, 57.387558, 1243.7344, 186.71684, 66.314384, 749.0487]
2025-09-12 00:54:58,425 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [528.0, 369.0, 56.0, 271.0, 284.0, 47.0, 433.0, 109.0, 43.0, 302.0]
2025-09-12 00:54:58,438 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 30/100 (estimated time remaining: 14 hours, 43 minutes, 50 seconds)
2025-09-12 01:05:47,831 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:05:47,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:06:32,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 400.45029 ± 468.596
2025-09-12 01:06:32,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1153.6064, 245.40616, 11.671781, 126.156006, 107.49352, 116.78795, 1241.008, 15.047325, 81.709694, 905.6159]
2025-09-12 01:06:32,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [433.0, 132.0, 16.0, 70.0, 60.0, 69.0, 464.0, 18.0, 47.0, 361.0]
2025-09-12 01:06:32,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 31/100 (estimated time remaining: 14 hours, 17 minutes, 21 seconds)
2025-09-12 01:18:18,996 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:18:18,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:19:14,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 530.85583 ± 576.145
2025-09-12 01:19:14,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1782.7549, 128.98647, 81.18772, 50.961742, 877.29944, 53.058075, 711.73145, 380.4285, 13.521723, 1228.6287]
2025-09-12 01:19:14,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [638.0, 77.0, 65.0, 61.0, 281.0, 62.0, 269.0, 154.0, 21.0, 452.0]
2025-09-12 01:19:14,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 32/100 (estimated time remaining: 14 hours, 6 minutes, 9 seconds)
2025-09-12 01:29:57,242 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:29:57,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:31:23,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 912.17072 ± 618.194
2025-09-12 01:31:23,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [137.03091, 885.05804, 2090.1309, 941.8119, 1459.2552, 338.96463, 473.1899, 1354.823, 111.71458, 1329.7284]
2025-09-12 01:31:23,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [70.0, 296.0, 705.0, 341.0, 481.0, 150.0, 200.0, 421.0, 61.0, 480.0]
2025-09-12 01:31:23,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 33/100 (estimated time remaining: 13 hours, 56 minutes, 26 seconds)
2025-09-12 01:42:29,819 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:42:29,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:44:15,033 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1077.14722 ± 951.405
2025-09-12 01:44:15,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [578.0151, 233.5119, 669.98193, 206.30225, 920.92, 1439.7386, 2786.336, 964.46533, 2854.0042, 118.197624]
2025-09-12 01:44:15,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [225.0, 110.0, 255.0, 97.0, 299.0, 490.0, 1000.0, 335.0, 1000.0, 63.0]
2025-09-12 01:44:15,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (1077.15) for latency ExtremeClogL1U23
2025-09-12 01:44:15,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 34/100 (estimated time remaining: 13 hours, 48 minutes, 54 seconds)
2025-09-12 01:55:36,167 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:55:36,169 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:56:48,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 718.96387 ± 497.365
2025-09-12 01:56:48,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1621.829, 184.72617, 687.1903, 1265.4501, 683.61414, 1046.9421, 76.139824, 1029.5426, 519.64795, 74.55593]
2025-09-12 01:56:48,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [573.0, 89.0, 284.0, 373.0, 264.0, 423.0, 64.0, 333.0, 197.0, 55.0]
2025-09-12 01:56:48,206 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 35/100 (estimated time remaining: 13 hours, 36 minutes, 8 seconds)
2025-09-12 02:08:13,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:08:13,985 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:09:18,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 654.74463 ± 701.393
2025-09-12 02:09:18,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [844.2976, 282.59232, 15.56717, 387.55292, 1401.6843, 812.66675, 2353.455, 265.84186, 14.972409, 168.81593]
2025-09-12 02:09:18,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [287.0, 143.0, 22.0, 169.0, 479.0, 310.0, 812.0, 116.0, 16.0, 87.0]
2025-09-12 02:09:18,807 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 36/100 (estimated time remaining: 13 hours, 36 minutes, 5 seconds)
2025-09-12 02:20:08,412 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:20:08,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:20:55,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 432.37485 ± 627.534
2025-09-12 02:20:55,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [753.828, 90.5446, 16.812037, 775.77454, 84.18292, 2133.447, 19.791317, 190.36003, 124.128624, 134.87941]
2025-09-12 02:20:55,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [263.0, 80.0, 17.0, 309.0, 68.0, 756.0, 23.0, 90.0, 74.0, 80.0]
2025-09-12 02:20:55,237 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 37/100 (estimated time remaining: 13 hours, 9 minutes, 28 seconds)
2025-09-12 02:32:17,823 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:32:17,825 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:33:42,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 864.08191 ± 755.180
2025-09-12 02:33:42,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [650.3331, 17.619408, 1444.1846, 15.465667, 770.5552, 899.61816, 1611.488, 2471.1907, 743.7572, 16.607777]
2025-09-12 02:33:42,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [273.0, 23.0, 529.0, 18.0, 265.0, 335.0, 529.0, 865.0, 275.0, 25.0]
2025-09-12 02:33:42,488 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 38/100 (estimated time remaining: 13 hours, 5 minutes, 6 seconds)
2025-09-12 02:45:27,046 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:45:27,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:46:56,550 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 895.87256 ± 557.295
2025-09-12 02:46:56,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [348.11594, 411.08374, 1861.943, 957.3435, 1112.8048, 721.7097, 565.3511, 1467.376, 21.126482, 1491.8713]
2025-09-12 02:46:56,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [144.0, 172.0, 668.0, 353.0, 417.0, 287.0, 218.0, 528.0, 23.0, 544.0]
2025-09-12 02:46:56,619 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 39/100 (estimated time remaining: 12 hours, 57 minutes, 23 seconds)
2025-09-12 02:57:46,837 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:57:46,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:59:48,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1323.11426 ± 738.757
2025-09-12 02:59:48,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [764.9145, 873.00354, 378.47156, 931.92694, 2005.298, 2920.07, 1718.416, 795.40295, 959.1204, 1884.5187]
2025-09-12 02:59:48,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [292.0, 315.0, 163.0, 342.0, 673.0, 1000.0, 565.0, 298.0, 314.0, 616.0]
2025-09-12 02:59:48,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (1323.11) for latency ExtremeClogL1U23
2025-09-12 02:59:48,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 40/100 (estimated time remaining: 12 hours, 48 minutes, 37 seconds)
2025-09-12 03:11:01,486 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:11:01,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:12:14,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 781.25031 ± 579.191
2025-09-12 03:12:14,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1317.8735, 10.961717, 738.80035, 1845.7831, 204.86269, 523.26843, 1009.2781, 440.6589, 244.49628, 1476.52]
2025-09-12 03:12:14,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [441.0, 15.0, 259.0, 580.0, 97.0, 216.0, 298.0, 185.0, 123.0, 485.0]
2025-09-12 03:12:14,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 41/100 (estimated time remaining: 12 hours, 35 minutes, 10 seconds)
2025-09-12 03:23:07,488 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:23:07,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:24:34,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 917.42920 ± 923.031
2025-09-12 03:24:34,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1336.7479, 450.9972, 2348.8894, 575.14716, 1035.8424, 80.231964, 348.31186, 2804.2646, 183.05652, 10.802682]
2025-09-12 03:24:34,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [467.0, 183.0, 806.0, 205.0, 332.0, 47.0, 141.0, 975.0, 90.0, 19.0]
2025-09-12 03:24:34,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 42/100 (estimated time remaining: 12 hours, 31 minutes, 9 seconds)
2025-09-12 03:36:03,281 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:36:03,286 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:37:20,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 832.70978 ± 545.374
2025-09-12 03:37:20,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [859.5052, 1558.6763, 21.454327, 105.67339, 894.4947, 1528.3274, 869.15564, 1056.5654, 1305.5924, 127.653755]
2025-09-12 03:37:20,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [288.0, 553.0, 20.0, 79.0, 275.0, 491.0, 267.0, 377.0, 423.0, 69.0]
2025-09-12 03:37:20,322 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 43/100 (estimated time remaining: 12 hours, 18 minutes, 6 seconds)
2025-09-12 03:48:39,195 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:48:39,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:50:09,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1003.67206 ± 692.315
2025-09-12 03:50:09,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [2140.817, 88.30366, 504.51398, 1052.8273, 494.0784, 170.88301, 1179.9071, 2183.244, 1187.5193, 1034.6273]
2025-09-12 03:50:09,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [706.0, 52.0, 195.0, 367.0, 204.0, 87.0, 359.0, 683.0, 372.0, 351.0]
2025-09-12 03:50:09,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 44/100 (estimated time remaining: 12 hours, 37 seconds)
2025-09-12 04:01:07,935 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:01:07,936 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:02:02,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 575.99304 ± 456.132
2025-09-12 04:02:02,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [822.1362, 14.86729, 1060.0859, 965.6996, 17.36206, 148.1636, 16.736012, 847.6347, 632.43024, 1234.8145]
2025-09-12 04:02:02,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [297.0, 24.0, 324.0, 335.0, 24.0, 81.0, 20.0, 268.0, 238.0, 421.0]
2025-09-12 04:02:02,048 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 45/100 (estimated time remaining: 11 hours, 36 minutes, 57 seconds)
2025-09-12 04:13:07,742 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:13:07,744 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:14:17,262 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 784.57513 ± 475.681
2025-09-12 04:14:17,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [62.538837, 745.1541, 1226.5035, 1414.4458, 274.4297, 1414.9194, 115.79368, 777.3987, 827.33606, 987.23206]
2025-09-12 04:14:17,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [42.0, 240.0, 407.0, 425.0, 125.0, 471.0, 72.0, 258.0, 273.0, 294.0]
2025-09-12 04:14:17,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 46/100 (estimated time remaining: 11 hours, 22 minutes, 28 seconds)
2025-09-12 04:25:46,758 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:25:46,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:26:13,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 224.46884 ± 258.034
2025-09-12 04:26:13,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [665.5508, 70.810814, 292.68484, 42.10601, 13.139828, 14.485166, 19.049469, 657.4407, 448.158, 21.262852]
2025-09-12 04:26:13,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [251.0, 43.0, 131.0, 42.0, 16.0, 19.0, 22.0, 248.0, 200.0, 21.0]
2025-09-12 04:26:13,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 47/100 (estimated time remaining: 11 hours, 5 minutes, 42 seconds)
2025-09-12 04:37:31,596 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:37:31,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:38:42,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 793.81525 ± 435.404
2025-09-12 04:38:42,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [906.45557, 196.02956, 1171.2997, 272.74915, 1520.6052, 1004.9655, 203.31001, 1197.1696, 632.5919, 832.97656]
2025-09-12 04:38:42,682 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [302.0, 94.0, 409.0, 121.0, 441.0, 296.0, 96.0, 362.0, 232.0, 320.0]
2025-09-12 04:38:42,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 48/100 (estimated time remaining: 10 hours, 50 minutes, 33 seconds)
2025-09-12 04:49:46,401 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:49:46,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:50:48,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 674.63562 ± 442.342
2025-09-12 04:50:48,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [130.32132, 771.023, 637.7178, 13.116633, 971.4398, 575.2858, 1187.2104, 173.33145, 860.1344, 1426.7753]
2025-09-12 04:50:48,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [69.0, 287.0, 267.0, 16.0, 302.0, 222.0, 371.0, 84.0, 299.0, 407.0]
2025-09-12 04:50:48,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 49/100 (estimated time remaining: 10 hours, 30 minutes, 44 seconds)
2025-09-12 05:01:57,900 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:01:57,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:03:06,539 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 767.33075 ± 718.867
2025-09-12 05:03:06,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [984.08856, 71.785065, 20.237211, 2338.2163, 645.2344, 566.3522, 283.23154, 1299.3064, 12.700791, 1452.1552]
2025-09-12 05:03:06,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [343.0, 61.0, 22.0, 714.0, 234.0, 217.0, 128.0, 396.0, 16.0, 449.0]
2025-09-12 05:03:06,549 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 50/100 (estimated time remaining: 10 hours, 22 minutes, 57 seconds)
2025-09-12 05:14:27,248 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:14:27,255 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:15:28,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 686.05042 ± 493.277
2025-09-12 05:15:28,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1058.3632, 1010.9408, 1293.8627, 1159.2117, 1185.0178, 14.47733, 718.2277, 159.51352, 136.99915, 123.89063]
2025-09-12 05:15:28,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [331.0, 327.0, 388.0, 370.0, 363.0, 16.0, 259.0, 90.0, 70.0, 63.0]
2025-09-12 05:15:28,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 51/100 (estimated time remaining: 10 hours, 11 minutes, 48 seconds)
2025-09-12 05:26:47,460 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:26:47,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:28:03,482 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 852.21545 ± 657.799
2025-09-12 05:28:03,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [127.76891, 685.1682, 9.588172, 1526.6617, 1947.7885, 1416.7915, 249.54558, 993.1267, 192.24268, 1373.4723]
2025-09-12 05:28:03,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [87.0, 250.0, 13.0, 482.0, 602.0, 446.0, 113.0, 353.0, 96.0, 413.0]
2025-09-12 05:28:03,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 6 minutes, 1 second)
2025-09-12 05:39:04,842 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:39:04,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:40:07,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 688.58728 ± 407.078
2025-09-12 05:40:07,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [76.55806, 1188.1753, 971.4277, 419.68173, 453.06067, 961.7638, 1109.0272, 645.23773, 19.63725, 1041.3038]
2025-09-12 05:40:07,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [44.0, 389.0, 324.0, 182.0, 219.0, 298.0, 331.0, 224.0, 22.0, 321.0]
2025-09-12 05:40:07,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 53/100 (estimated time remaining: 9 hours, 49 minutes, 38 seconds)
2025-09-12 05:51:32,235 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:51:32,240 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:52:26,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 569.72253 ± 444.042
2025-09-12 05:52:26,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [235.64127, 1130.9843, 139.8744, 1018.4142, 492.75674, 59.36839, 1390.9712, 710.67645, 311.96625, 206.57243]
2025-09-12 05:52:26,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [107.0, 382.0, 73.0, 346.0, 186.0, 39.0, 419.0, 257.0, 137.0, 101.0]
2025-09-12 05:52:26,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 54/100 (estimated time remaining: 9 hours, 39 minutes, 23 seconds)
2025-09-12 06:03:32,173 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:03:32,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:04:43,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 754.19403 ± 495.197
2025-09-12 06:04:43,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1606.308, 620.70105, 751.2799, 1052.7832, 12.975848, 894.7352, 1248.4846, 169.26492, 1046.211, 139.19699]
2025-09-12 06:04:43,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [502.0, 229.0, 281.0, 326.0, 17.0, 328.0, 425.0, 99.0, 368.0, 75.0]
2025-09-12 06:04:43,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 55/100 (estimated time remaining: 9 hours, 26 minutes, 52 seconds)
2025-09-12 06:15:54,344 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:15:54,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:17:53,002 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1307.38635 ± 1103.029
2025-09-12 06:17:53,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [712.1458, 2325.634, 185.25543, 77.1879, 446.33646, 2064.5725, 1082.3041, 2984.4985, 245.91951, 2950.008]
2025-09-12 06:17:53,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [262.0, 722.0, 89.0, 49.0, 196.0, 692.0, 393.0, 927.0, 110.0, 1000.0]
2025-09-12 06:17:53,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 56/100 (estimated time remaining: 9 hours, 21 minutes, 44 seconds)
2025-09-12 06:28:52,678 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:28:52,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:30:09,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 860.90900 ± 556.231
2025-09-12 06:30:09,405 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [672.2777, 1602.5305, 1292.5114, 451.25327, 1414.5332, 135.83351, 313.04767, 88.23142, 1272.8823, 1365.989]
2025-09-12 06:30:09,405 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [264.0, 486.0, 410.0, 189.0, 457.0, 86.0, 135.0, 50.0, 390.0, 404.0]
2025-09-12 06:30:09,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 6 minutes, 28 seconds)
2025-09-12 06:41:46,744 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:41:46,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:42:59,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 835.46454 ± 538.424
2025-09-12 06:42:59,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1294.9327, 962.129, 1904.5427, 557.7244, 9.815234, 666.40027, 860.57043, 1243.083, 79.72052, 775.7263]
2025-09-12 06:42:59,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [389.0, 294.0, 596.0, 203.0, 13.0, 246.0, 264.0, 356.0, 46.0, 276.0]
2025-09-12 06:42:59,087 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 31 seconds)
2025-09-12 06:54:03,441 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:03,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:55:02,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 670.55188 ± 536.666
2025-09-12 06:55:02,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1255.3124, 597.24005, 51.0097, 1179.2205, 1372.7075, 12.282108, 15.8473625, 162.52563, 1127.6251, 931.7478]
2025-09-12 06:55:02,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [386.0, 216.0, 38.0, 381.0, 421.0, 15.0, 18.0, 83.0, 357.0, 314.0]
2025-09-12 06:55:02,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 59/100 (estimated time remaining: 8 hours, 45 minutes, 51 seconds)
2025-09-12 07:06:02,231 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:06:02,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:07:01,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 600.65479 ± 542.098
2025-09-12 07:07:01,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [180.39723, 984.58093, 235.52406, 10.877225, 1468.342, 535.5934, 630.65076, 1607.4293, 143.16522, 209.98795]
2025-09-12 07:07:01,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [104.0, 341.0, 136.0, 15.0, 452.0, 202.0, 243.0, 538.0, 75.0, 109.0]
2025-09-12 07:07:01,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 60/100 (estimated time remaining: 8 hours, 30 minutes, 54 seconds)
2025-09-12 07:18:35,972 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:18:35,985 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:20:36,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1330.54871 ± 895.655
2025-09-12 07:20:36,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [262.61664, 493.49695, 1311.3547, 1153.6914, 2656.01, 172.40411, 2795.9368, 1632.8596, 2045.427, 781.6905]
2025-09-12 07:20:36,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [117.0, 198.0, 383.0, 405.0, 894.0, 95.0, 919.0, 528.0, 686.0, 280.0]
2025-09-12 07:20:36,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1226 [INFO]: New best (1330.55) for latency ExtremeClogL1U23
2025-09-12 07:20:36,614 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 61/100 (estimated time remaining: 8 hours, 21 minutes, 48 seconds)
2025-09-12 07:31:51,678 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:31:51,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:32:39,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 490.89105 ± 385.502
2025-09-12 07:32:39,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [318.75052, 776.28265, 413.2035, 138.24457, 633.5987, 203.2578, 383.3642, 1481.0532, 455.36, 105.796135]
2025-09-12 07:32:39,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [131.0, 245.0, 160.0, 80.0, 219.0, 94.0, 152.0, 451.0, 175.0, 72.0]
2025-09-12 07:32:39,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 62/100 (estimated time remaining: 8 hours, 7 minutes, 32 seconds)
2025-09-12 07:43:35,621 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:43:35,624 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:44:29,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 582.66565 ± 399.079
2025-09-12 07:44:29,386 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [825.3021, 918.69244, 66.50044, 456.18845, 222.13548, 430.72522, 1185.193, 1127.2883, 582.67664, 11.953737]
2025-09-12 07:44:29,386 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [269.0, 284.0, 39.0, 175.0, 122.0, 172.0, 388.0, 369.0, 208.0, 18.0]
2025-09-12 07:44:29,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 63/100 (estimated time remaining: 7 hours, 47 minutes, 26 seconds)
2025-09-12 07:55:57,733 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:55:57,739 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:56:35,324 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 365.23361 ± 341.359
2025-09-12 07:56:35,324 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [200.42392, 428.25546, 1027.3326, 194.58856, 12.954521, 310.26535, 17.503887, 589.5513, 859.0109, 12.449658]
2025-09-12 07:56:35,324 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [95.0, 181.0, 378.0, 104.0, 15.0, 132.0, 20.0, 216.0, 279.0, 20.0]
2025-09-12 07:56:35,351 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 64/100 (estimated time remaining: 7 hours, 35 minutes, 26 seconds)
2025-09-12 08:07:26,404 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:07:26,408 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:08:56,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 995.85370 ± 952.321
2025-09-12 08:08:56,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1513.4327, 3048.3296, 818.8588, 157.59666, 220.27678, 204.20917, 2248.411, 699.6177, 1035.8839, 11.921308]
2025-09-12 08:08:56,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [457.0, 1000.0, 299.0, 78.0, 110.0, 98.0, 729.0, 257.0, 374.0, 17.0]
2025-09-12 08:08:56,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 65/100 (estimated time remaining: 7 hours, 25 minutes, 42 seconds)
2025-09-12 08:19:39,547 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:19:39,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:20:43,231 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 720.41199 ± 635.335
2025-09-12 08:20:43,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [572.2764, 143.18228, 1434.1646, 299.9912, 58.64929, 270.5008, 2007.8339, 980.3664, 1279.7308, 157.42426]
2025-09-12 08:20:43,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [212.0, 83.0, 415.0, 126.0, 56.0, 137.0, 622.0, 283.0, 413.0, 93.0]
2025-09-12 08:20:43,261 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 66/100 (estimated time remaining: 7 hours, 46 seconds)
2025-09-12 08:31:34,436 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:31:34,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:32:49,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 841.44592 ± 813.279
2025-09-12 08:32:49,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [2059.3838, 94.65767, 16.282564, 2026.4352, 14.735266, 250.7888, 39.372505, 1598.6293, 1015.95105, 1298.2223]
2025-09-12 08:32:49,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [639.0, 54.0, 18.0, 670.0, 17.0, 113.0, 29.0, 503.0, 367.0, 448.0]
2025-09-12 08:32:49,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 67/100 (estimated time remaining: 6 hours, 49 minutes, 7 seconds)
2025-09-12 08:43:39,071 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:43:39,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:45:15,642 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1187.35718 ± 618.280
2025-09-12 08:45:15,643 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [497.19388, 968.5394, 1047.8545, 611.94244, 1360.3422, 1269.863, 2080.4832, 198.40157, 2097.6582, 1741.2931]
2025-09-12 08:45:15,643 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [177.0, 284.0, 314.0, 206.0, 431.0, 378.0, 609.0, 92.0, 634.0, 525.0]
2025-09-12 08:45:15,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 68/100 (estimated time remaining: 6 hours, 41 minutes, 5 seconds)
2025-09-12 08:56:26,570 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:56:26,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:58:03,872 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1097.00684 ± 998.090
2025-09-12 08:58:03,874 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [543.2494, 1668.3873, 2422.18, 69.360466, 3027.873, 45.57057, 326.01923, 1114.9312, 1580.0264, 172.4718]
2025-09-12 08:58:03,874 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [218.0, 550.0, 749.0, 41.0, 1000.0, 32.0, 139.0, 389.0, 530.0, 87.0]
2025-09-12 08:58:03,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 69/100 (estimated time remaining: 6 hours, 33 minutes, 26 seconds)
2025-09-12 09:08:50,697 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:08:50,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:09:41,079 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 570.24268 ± 475.359
2025-09-12 09:09:41,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [988.75073, 608.6988, 700.63226, 182.02878, 73.53492, 1312.3724, 526.2851, 1278.2855, 11.41031, 20.428171]
2025-09-12 09:09:41,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [316.0, 217.0, 253.0, 88.0, 43.0, 391.0, 201.0, 379.0, 19.0, 22.0]
2025-09-12 09:09:41,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 70/100 (estimated time remaining: 6 hours, 16 minutes, 39 seconds)
2025-09-12 09:20:43,845 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:20:43,847 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:22:19,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1167.49280 ± 336.945
2025-09-12 09:22:19,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1457.6117, 966.28955, 1067.687, 1331.9747, 1243.4104, 1344.4382, 1072.0481, 364.8725, 1128.316, 1698.2794]
2025-09-12 09:22:19,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [437.0, 299.0, 374.0, 404.0, 376.0, 416.0, 325.0, 145.0, 338.0, 510.0]
2025-09-12 09:22:19,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 71/100 (estimated time remaining: 6 hours, 9 minutes, 39 seconds)
2025-09-12 09:33:20,928 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:33:20,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:34:36,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 887.91553 ± 913.713
2025-09-12 09:34:36,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [841.49664, 16.749653, 21.622051, 161.6939, 17.860415, 1195.3138, 1194.9945, 2163.4854, 2778.983, 486.95572]
2025-09-12 09:34:36,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [303.0, 22.0, 23.0, 83.0, 21.0, 404.0, 380.0, 652.0, 829.0, 189.0]
2025-09-12 09:34:36,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 72/100 (estimated time remaining: 5 hours, 58 minutes, 19 seconds)
2025-09-12 09:45:39,393 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:45:39,396 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:46:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 605.59143 ± 718.169
2025-09-12 09:46:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [40.49382, 19.750387, 443.33093, 246.15912, 1573.7914, 19.805433, 690.44116, 2242.7449, 16.630335, 762.7667]
2025-09-12 09:46:34,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [39.0, 24.0, 174.0, 119.0, 475.0, 22.0, 230.0, 727.0, 18.0, 256.0]
2025-09-12 09:46:34,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 73/100 (estimated time remaining: 5 hours, 43 minutes, 20 seconds)
2025-09-12 09:57:40,821 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:57:40,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:58:46,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 721.47327 ± 698.558
2025-09-12 09:58:46,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [209.04962, 56.52587, 812.5817, 15.7480545, 240.95242, 1296.2496, 724.0212, 1959.0112, 1807.986, 92.60741]
2025-09-12 09:58:46,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [98.0, 34.0, 281.0, 21.0, 106.0, 428.0, 271.0, 654.0, 558.0, 54.0]
2025-09-12 09:58:46,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 74/100 (estimated time remaining: 5 hours, 27 minutes, 48 seconds)
2025-09-12 10:10:03,175 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:10:03,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:11:13,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 779.59485 ± 598.125
2025-09-12 10:11:13,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [855.11676, 12.411202, 1508.5206, 202.41025, 239.59328, 1122.1195, 72.085594, 1464.2344, 1634.8893, 684.5674]
2025-09-12 10:11:13,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [303.0, 22.0, 501.0, 95.0, 114.0, 359.0, 42.0, 450.0, 552.0, 248.0]
2025-09-12 10:11:13,870 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 75/100 (estimated time remaining: 5 hours, 20 minutes, 2 seconds)
2025-09-12 10:21:30,119 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:21:30,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:22:20,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 542.25287 ± 456.350
2025-09-12 10:22:20,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [508.4391, 222.4469, 191.06548, 838.4471, 1108.7168, 86.628365, 1002.6793, 1309.8496, 23.65598, 130.60057]
2025-09-12 10:22:20,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [197.0, 107.0, 96.0, 298.0, 378.0, 58.0, 309.0, 402.0, 21.0, 68.0]
2025-09-12 10:22:20,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 76/100 (estimated time remaining: 5 hours, 4 seconds)
2025-09-12 10:33:16,090 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:33:16,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:34:20,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 756.37720 ± 536.306
2025-09-12 10:34:20,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1709.4391, 169.28175, 217.76761, 1101.2964, 1337.188, 120.536995, 1088.0566, 164.17882, 800.1477, 855.8794]
2025-09-12 10:34:20,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [516.0, 87.0, 110.0, 339.0, 397.0, 64.0, 348.0, 81.0, 275.0, 273.0]
2025-09-12 10:34:20,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 77/100 (estimated time remaining: 4 hours, 46 minutes, 45 seconds)
2025-09-12 10:45:56,693 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:45:56,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:46:48,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 509.06235 ± 278.750
2025-09-12 10:46:48,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [977.1549, 713.17914, 342.35733, 667.0163, 176.74126, 920.4802, 353.84, 354.46725, 147.47617, 437.91104]
2025-09-12 10:46:48,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [406.0, 240.0, 142.0, 238.0, 90.0, 315.0, 165.0, 147.0, 73.0, 173.0]
2025-09-12 10:46:48,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 78/100 (estimated time remaining: 4 hours, 37 minutes, 4 seconds)
2025-09-12 10:57:24,113 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:57:24,118 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:58:11,401 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 460.56183 ± 345.539
2025-09-12 10:58:11,401 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1031.2378, 340.23492, 287.54678, 207.42896, 1054.8142, 783.74304, 395.35104, 186.31085, 303.45383, 15.497173]
2025-09-12 10:58:11,401 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [344.0, 167.0, 126.0, 117.0, 334.0, 279.0, 190.0, 93.0, 126.0, 24.0]
2025-09-12 10:58:11,411 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 79/100 (estimated time remaining: 4 hours, 21 minutes, 27 seconds)
2025-09-12 11:09:11,035 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:09:11,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:10:26,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 860.03400 ± 468.530
2025-09-12 11:10:26,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [865.9345, 1041.7155, 957.1093, 1403.9592, 1098.8116, 9.866714, 1366.3645, 700.37476, 14.0767975, 1142.1272]
2025-09-12 11:10:26,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [295.0, 364.0, 302.0, 429.0, 381.0, 15.0, 425.0, 254.0, 17.0, 392.0]
2025-09-12 11:10:26,129 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 80/100 (estimated time remaining: 4 hours, 8 minutes, 39 seconds)
2025-09-12 11:21:16,316 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:21:16,318 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:22:21,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 681.24078 ± 533.248
2025-09-12 11:22:21,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [56.69251, 700.2032, 664.5535, 688.3403, 432.84604, 1672.2048, 780.83966, 209.12047, 1560.8439, 46.76338]
2025-09-12 11:22:21,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [57.0, 259.0, 243.0, 248.0, 190.0, 506.0, 285.0, 102.0, 544.0, 44.0]
2025-09-12 11:22:21,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 81/100 (estimated time remaining: 4 hours, 4 seconds)
2025-09-12 11:33:47,974 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:33:47,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:34:39,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 556.30902 ± 383.124
2025-09-12 11:34:39,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [336.7949, 74.50092, 604.84015, 800.8846, 11.097837, 831.60785, 908.6177, 521.8346, 206.3232, 1266.589]
2025-09-12 11:34:39,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [138.0, 69.0, 231.0, 275.0, 15.0, 301.0, 274.0, 187.0, 114.0, 388.0]
2025-09-12 11:34:39,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 82/100 (estimated time remaining: 3 hours, 49 minutes, 12 seconds)
2025-09-12 11:45:08,514 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:45:08,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:46:29,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 957.68298 ± 622.941
2025-09-12 11:46:29,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1534.6068, 966.3589, 1325.3994, 126.03676, 985.82434, 910.559, 852.2999, 42.949814, 2267.6016, 565.19385]
2025-09-12 11:46:29,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [471.0, 346.0, 460.0, 67.0, 327.0, 264.0, 256.0, 31.0, 686.0, 206.0]
2025-09-12 11:46:29,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 83/100 (estimated time remaining: 3 hours, 34 minutes, 50 seconds)
2025-09-12 11:57:44,100 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:57:44,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:58:58,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 878.22528 ± 801.566
2025-09-12 11:58:58,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [2593.5369, 1221.3091, 1441.7151, 1678.9452, 76.74419, 412.99066, 836.22125, 351.11606, 152.14212, 17.53204]
2025-09-12 11:58:58,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [801.0, 349.0, 434.0, 516.0, 56.0, 167.0, 293.0, 145.0, 74.0, 23.0]
2025-09-12 11:58:58,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 84/100 (estimated time remaining: 3 hours, 26 minutes, 40 seconds)
2025-09-12 12:09:41,903 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:09:41,904 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:10:42,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 754.94214 ± 365.687
2025-09-12 12:10:42,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1068.8534, 102.02312, 1167.8761, 861.8499, 848.30194, 1005.547, 129.76419, 459.24054, 919.5243, 986.44073]
2025-09-12 12:10:42,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [312.0, 54.0, 349.0, 259.0, 257.0, 306.0, 69.0, 170.0, 268.0, 292.0]
2025-09-12 12:10:42,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 85/100 (estimated time remaining: 3 hours, 12 minutes, 53 seconds)
2025-09-12 12:21:56,570 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:21:56,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:22:53,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 682.23102 ± 542.162
2025-09-12 12:22:53,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1077.8726, 15.604809, 1341.5203, 907.70215, 1564.1166, 15.349019, 460.12787, 304.16714, 113.98386, 1021.8658]
2025-09-12 12:22:53,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [335.0, 19.0, 377.0, 272.0, 474.0, 19.0, 194.0, 144.0, 60.0, 296.0]
2025-09-12 12:22:53,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 1 minute, 36 seconds)
2025-09-12 12:33:35,853 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:33:35,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:34:32,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 671.47003 ± 480.392
2025-09-12 12:34:32,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1027.7881, 771.3513, 91.62916, 174.1844, 995.2307, 18.617472, 143.3259, 1011.61584, 1156.5142, 1324.4426]
2025-09-12 12:34:32,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [334.0, 237.0, 53.0, 84.0, 309.0, 24.0, 73.0, 290.0, 363.0, 369.0]
2025-09-12 12:34:32,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 87/100 (estimated time remaining: 2 hours, 47 minutes, 38 seconds)
2025-09-12 12:45:37,673 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:45:37,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:46:50,105 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 848.48938 ± 423.567
2025-09-12 12:46:50,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1469.1725, 915.1383, 73.20627, 737.636, 223.41772, 1180.4221, 1059.6548, 1151.9915, 1113.8849, 560.3698]
2025-09-12 12:46:50,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [448.0, 312.0, 42.0, 278.0, 101.0, 335.0, 316.0, 386.0, 342.0, 203.0]
2025-09-12 12:46:50,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 88/100 (estimated time remaining: 2 hours, 36 minutes, 54 seconds)
2025-09-12 12:58:04,183 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:58:04,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:59:25,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 984.86218 ± 638.972
2025-09-12 12:59:25,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [814.0903, 811.75226, 759.0377, 1638.5793, 62.078007, 1091.2876, 548.95636, 2175.8147, 1706.2744, 240.75143]
2025-09-12 12:59:25,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [248.0, 246.0, 239.0, 483.0, 40.0, 340.0, 198.0, 672.0, 536.0, 104.0]
2025-09-12 12:59:25,376 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 89/100 (estimated time remaining: 2 hours, 25 minutes, 4 seconds)
2025-09-12 13:10:38,240 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:10:38,243 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:11:41,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 754.78290 ± 504.916
2025-09-12 13:11:41,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1173.8772, 268.8484, 1003.52466, 778.98914, 59.046352, 18.500467, 1309.4125, 392.07468, 1077.7827, 1465.7732]
2025-09-12 13:11:41,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [364.0, 112.0, 309.0, 250.0, 41.0, 24.0, 418.0, 185.0, 312.0, 411.0]
2025-09-12 13:11:41,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 14 minutes, 9 seconds)
2025-09-12 13:22:29,848 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:22:29,851 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:23:29,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 680.05994 ± 513.660
2025-09-12 13:23:29,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [293.41464, 1052.701, 422.61087, 1166.13, 1438.3622, 350.29224, 1468.6423, 358.8129, 230.05089, 19.582422]
2025-09-12 13:23:29,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [123.0, 312.0, 165.0, 346.0, 401.0, 145.0, 498.0, 149.0, 113.0, 22.0]
2025-09-12 13:23:29,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 1 minute, 10 seconds)
2025-09-12 13:34:29,903 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:34:29,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:35:43,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 852.46082 ± 769.002
2025-09-12 13:35:43,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [582.6497, 577.96967, 98.05507, 1265.0398, 1037.1238, 1953.4031, 227.6888, 2415.3271, 345.75113, 21.59971]
2025-09-12 13:35:43,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [223.0, 208.0, 53.0, 393.0, 313.0, 589.0, 103.0, 741.0, 143.0, 23.0]
2025-09-12 13:35:43,045 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 92/100 (estimated time remaining: 1 hour, 50 minutes, 7 seconds)
2025-09-12 13:46:17,396 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:46:17,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:47:02,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 466.98907 ± 427.555
2025-09-12 13:47:02,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [17.714127, 955.1447, 75.823784, 990.5492, 16.031275, 166.12149, 307.92627, 942.5502, 162.80295, 1035.2268]
2025-09-12 13:47:02,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [21.0, 384.0, 61.0, 345.0, 23.0, 80.0, 128.0, 294.0, 86.0, 324.0]
2025-09-12 13:47:02,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 36 minutes, 20 seconds)
2025-09-12 13:58:24,327 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:58:24,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:59:44,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 874.55127 ± 659.668
2025-09-12 13:59:44,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [84.32742, 2043.6968, 1715.2185, 912.0364, 166.50674, 467.23822, 1499.6838, 120.878365, 981.9903, 753.93634]
2025-09-12 13:59:44,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [47.0, 686.0, 587.0, 321.0, 82.0, 179.0, 510.0, 66.0, 356.0, 259.0]
2025-09-12 13:59:44,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 24 minutes, 26 seconds)
2025-09-12 14:10:47,322 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:10:47,324 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:12:10,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 978.31726 ± 1008.256
2025-09-12 14:12:10,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [309.7762, 1992.7241, 3236.3726, 336.38794, 10.954466, 1054.3658, 531.4191, 43.018494, 387.63498, 1880.5194]
2025-09-12 14:12:10,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [131.0, 624.0, 1000.0, 135.0, 15.0, 330.0, 191.0, 36.0, 151.0, 540.0]
2025-09-12 14:12:10,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 12 minutes, 34 seconds)
2025-09-12 14:22:45,389 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:22:45,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:23:46,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 687.79529 ± 323.172
2025-09-12 14:23:46,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [1004.3154, 921.2407, 967.55084, 433.90054, 826.47766, 149.23521, 80.002716, 855.261, 768.44727, 871.5215]
2025-09-12 14:23:46,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [297.0, 269.0, 287.0, 171.0, 274.0, 102.0, 63.0, 299.0, 272.0, 288.0]
2025-09-12 14:23:46,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 17 seconds)
2025-09-12 14:35:15,703 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:35:15,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:36:26,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 809.61914 ± 569.368
2025-09-12 14:36:26,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [885.8225, 969.8759, 2018.664, 249.8452, 1142.3192, 998.8186, 874.81305, 15.608456, 10.001895, 930.4228]
2025-09-12 14:36:26,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [298.0, 294.0, 667.0, 133.0, 395.0, 286.0, 311.0, 24.0, 12.0, 290.0]
2025-09-12 14:36:26,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 97/100 (estimated time remaining: 48 minutes, 34 seconds)
2025-09-12 14:47:16,551 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:47:16,554 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:48:51,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1101.38843 ± 905.358
2025-09-12 14:48:51,697 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [848.4096, 490.73114, 219.56671, 1863.2594, 1474.045, 62.108356, 1163.1838, 1482.6066, 237.84833, 3172.1248]
2025-09-12 14:48:51,697 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [272.0, 187.0, 122.0, 553.0, 482.0, 38.0, 391.0, 491.0, 104.0, 997.0]
2025-09-12 14:48:51,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 98/100 (estimated time remaining: 37 minutes, 5 seconds)
2025-09-12 14:59:34,350 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:59:34,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:00:26,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 559.39758 ± 831.260
2025-09-12 15:00:26,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [223.22432, 15.692356, 28.85738, 95.76341, 19.399025, 2869.8894, 851.22504, 786.2987, 627.5689, 76.05763]
2025-09-12 15:00:26,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [99.0, 18.0, 25.0, 52.0, 24.0, 916.0, 293.0, 262.0, 253.0, 54.0]
2025-09-12 15:00:26,963 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 99/100 (estimated time remaining: 24 minutes, 16 seconds)
2025-09-12 15:11:34,614 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:11:34,616 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:13:04,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 1116.73438 ± 664.316
2025-09-12 15:13:04,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [2606.1895, 947.815, 123.95346, 752.0726, 1834.3059, 1203.2305, 752.6905, 1031.2424, 548.7257, 1367.1177]
2025-09-12 15:13:04,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [807.0, 281.0, 66.0, 239.0, 520.0, 358.0, 242.0, 322.0, 196.0, 374.0]
2025-09-12 15:13:04,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1199 [INFO]: Iteration 100/100 (estimated time remaining: 12 minutes, 10 seconds)
2025-09-12 15:24:01,030 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:24:01,033 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:24:42,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1221 [DEBUG]: Total Reward: 415.45978 ± 312.312
2025-09-12 15:24:42,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1222 [DEBUG]: All rewards: [318.19598, 12.63821, 716.5815, 533.0512, 23.900198, 99.32387, 387.59097, 765.7598, 975.53235, 322.0233]
2025-09-12 15:24:42,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1223 [DEBUG]: All trajectory lengths: [131.0, 16.0, 259.0, 200.0, 23.0, 57.0, 160.0, 269.0, 333.0, 140.0]
2025-09-12 15:24:42,513 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-hopper):1251 [DEBUG]: Training session finished
