2025-09-11 18:08:03,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc10-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:08:03,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc10-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:08:03,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x150d1c9f4190>}
2025-09-11 18:08:03,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1111 [DEBUG]: using device: cuda
2025-09-11 18:08:03,807 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1133 [INFO]: Creating new trainer
2025-09-11 18:08:03,826 baseline-mbpac-noiseperc10-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 18:08:03,827 baseline-mbpac-noiseperc10-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:08:03,838 baseline-mbpac-noiseperc10-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 18:08:04,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1194 [DEBUG]: Starting training session...
2025-09-11 18:08:04,744 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:20:00,439 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:20:00,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:21:26,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: -178.59496 ± 227.752
2025-09-11 18:21:26,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [-66.13821, -44.72732, -565.35754, -395.48105, -17.11312, -14.361462, -10.29378, -592.6309, -51.272205, -28.573929]
2025-09-11 18:21:26,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [87.0, 73.0, 1000.0, 677.0, 15.0, 62.0, 10.0, 1000.0, 63.0, 38.0]
2025-09-11 18:21:26,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (-178.59) for latency ExtremeClogL1U23
2025-09-11 18:21:27,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 22 hours, 3 minutes, 47 seconds)
2025-09-11 18:34:31,885 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:34:31,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:36:27,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: -14.99938 ± 45.597
2025-09-11 18:36:27,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [-28.282782, -21.308474, -93.914696, 11.717193, 37.453506, -50.810825, 52.063766, -11.069178, -72.726685, 26.8844]
2025-09-11 18:36:27,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [101.0, 274.0, 1000.0, 108.0, 473.0, 162.0, 418.0, 140.0, 1000.0, 301.0]
2025-09-11 18:36:27,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (-15.00) for latency ExtremeClogL1U23
2025-09-11 18:36:27,879 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 23 hours, 10 minutes, 53 seconds)
2025-09-11 18:49:16,160 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:49:16,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:51:22,265 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 87.98910 ± 53.610
2025-09-11 18:51:22,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [176.80725, 74.600204, 93.38724, 190.88684, 85.83983, 82.67808, 63.85096, 40.301365, 6.5089464, 65.03024]
2025-09-11 18:51:22,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [459.0, 180.0, 1000.0, 466.0, 153.0, 1000.0, 300.0, 62.0, 305.0, 362.0]
2025-09-11 18:51:22,271 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (87.99) for latency ExtremeClogL1U23
2025-09-11 18:51:22,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 23 hours, 19 minutes, 47 seconds)
2025-09-11 19:04:26,764 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:04:26,768 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:06:34,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 109.70872 ± 113.743
2025-09-11 19:06:34,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [35.260334, 231.55855, 211.3518, 51.272293, 25.642633, 3.9832373, 174.88345, 337.77835, -18.14454, 43.501083]
2025-09-11 19:06:34,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [318.0, 380.0, 1000.0, 120.0, 42.0, 22.0, 1000.0, 1000.0, 410.0, 83.0]
2025-09-11 19:06:34,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (109.71) for latency ExtremeClogL1U23
2025-09-11 19:06:34,200 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 23 hours, 23 minutes, 46 seconds)
2025-09-11 19:19:35,202 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:19:35,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:21:44,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 140.43442 ± 67.878
2025-09-11 19:21:44,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [108.23513, 58.91997, 78.89745, 60.20829, 216.71759, 199.19057, 104.01186, 268.1661, 134.60303, 175.39427]
2025-09-11 19:21:44,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [259.0, 211.0, 132.0, 171.0, 386.0, 1000.0, 112.0, 1000.0, 241.0, 1000.0]
2025-09-11 19:21:44,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (140.43) for latency ExtremeClogL1U23
2025-09-11 19:21:44,693 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 23 hours, 19 minutes, 39 seconds)
2025-09-11 19:34:57,770 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:34:57,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:36:56,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 200.80663 ± 168.082
2025-09-11 19:36:56,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [6.050423, 117.67134, 500.4055, 448.68396, 151.61183, 146.6323, 35.626564, 12.640205, 272.94684, 315.79724]
2025-09-11 19:36:56,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [26.0, 243.0, 1000.0, 1000.0, 198.0, 190.0, 47.0, 23.0, 383.0, 1000.0]
2025-09-11 19:36:56,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (200.81) for latency ExtremeClogL1U23
2025-09-11 19:36:56,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 23 hours, 39 minutes, 22 seconds)
2025-09-11 19:49:58,477 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:49:58,481 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:52:25,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 237.23367 ± 167.061
2025-09-11 19:52:25,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [93.5824, 411.44473, 94.00827, 337.95755, 23.733284, 448.09464, 375.47513, 423.59354, 29.739738, 134.70734]
2025-09-11 19:52:25,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [105.0, 1000.0, 196.0, 1000.0, 44.0, 1000.0, 519.0, 1000.0, 48.0, 264.0]
2025-09-11 19:52:25,691 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (237.23) for latency ExtremeClogL1U23
2025-09-11 19:52:25,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 23 hours, 32 minutes, 55 seconds)
2025-09-11 20:05:22,304 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:05:22,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:08:40,500 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 496.83124 ± 322.828
2025-09-11 20:08:40,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [129.03851, 766.0867, 96.995995, 917.73755, 33.07732, 391.80127, 464.5252, 1000.00214, 613.44025, 555.6073]
2025-09-11 20:08:40,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [159.0, 1000.0, 103.0, 1000.0, 39.0, 1000.0, 1000.0, 1000.0, 1000.0, 676.0]
2025-09-11 20:08:40,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (496.83) for latency ExtremeClogL1U23
2025-09-11 20:08:40,531 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 23 hours, 42 minutes, 23 seconds)
2025-09-11 20:21:26,168 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:21:26,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:23:29,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 337.81610 ± 244.485
2025-09-11 20:23:29,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [56.618473, 485.12234, 428.0678, 490.00983, 52.81757, 84.77967, 108.65469, 429.45923, 843.5121, 399.11975]
2025-09-11 20:23:29,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [69.0, 481.0, 432.0, 1000.0, 43.0, 85.0, 102.0, 1000.0, 700.0, 355.0]
2025-09-11 20:23:29,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 23 hours, 19 minutes, 59 seconds)
2025-09-11 20:36:22,332 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:36:22,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:40:21,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 791.13098 ± 332.983
2025-09-11 20:40:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1269.0885, 582.1293, 1152.4224, 493.2093, 866.0488, 511.0213, 352.63187, 1079.8768, 443.71033, 1161.1719]
2025-09-11 20:40:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 521.0, 1000.0, 1000.0, 358.0, 1000.0, 369.0, 1000.0]
2025-09-11 20:40:21,454 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (791.13) for latency ExtremeClogL1U23
2025-09-11 20:40:21,474 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 23 hours, 35 minutes, 2 seconds)
2025-09-11 20:53:35,710 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:53:35,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:55:49,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 428.50177 ± 333.527
2025-09-11 20:55:49,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [462.4459, 52.268093, 830.45245, 752.4176, 134.7894, 208.13193, 552.48114, 1024.4115, 23.76447, 243.85522]
2025-09-11 20:55:49,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 72.0, 1000.0, 520.0, 130.0, 251.0, 466.0, 1000.0, 34.0, 172.0]
2025-09-11 20:55:49,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 23 hours, 24 minutes, 5 seconds)
2025-09-11 21:08:45,386 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:08:45,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:10:16,976 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 359.53238 ± 383.214
2025-09-11 21:10:16,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [19.12813, 230.77579, 365.02798, 37.376575, 906.06226, 103.18143, 522.1358, 25.835234, 189.02249, 1196.7783]
2025-09-11 21:10:16,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 154.0, 301.0, 34.0, 1000.0, 93.0, 398.0, 31.0, 133.0, 1000.0]
2025-09-11 21:10:16,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 22 hours, 50 minutes, 14 seconds)
2025-09-11 21:22:58,401 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:22:58,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:25:34,054 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 435.07617 ± 239.348
2025-09-11 21:25:34,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [391.79434, 256.24893, 833.08514, 724.4086, 192.67938, 518.8126, 200.52905, 574.4552, 588.55286, 70.19536]
2025-09-11 21:25:34,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [254.0, 172.0, 1000.0, 482.0, 155.0, 1000.0, 178.0, 1000.0, 1000.0, 126.0]
2025-09-11 21:25:34,085 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 22 hours, 17 minutes, 55 seconds)
2025-09-11 21:38:27,980 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:38:27,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:41:35,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 688.18982 ± 295.987
2025-09-11 21:41:35,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [878.3902, 685.6822, 686.5002, 452.95114, 503.40695, 1298.191, 627.9822, 629.97687, 970.58356, 148.23384]
2025-09-11 21:41:35,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [709.0, 409.0, 491.0, 427.0, 404.0, 1000.0, 1000.0, 1000.0, 1000.0, 118.0]
2025-09-11 21:41:35,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 22 hours, 23 minutes, 19 seconds)
2025-09-11 21:54:44,287 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:54:44,292 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:58:42,629 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 703.58234 ± 388.542
2025-09-11 21:58:42,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [522.9577, 527.0025, 574.8342, 354.07306, 839.3566, 1493.7874, 664.4108, 22.922667, 991.74713, 1044.7313]
2025-09-11 21:58:42,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 261.0, 1000.0, 1000.0, 1000.0, 22.0, 1000.0, 873.0]
2025-09-11 21:58:42,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 22 hours, 11 minutes, 59 seconds)
2025-09-11 22:11:40,537 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:11:40,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:14:42,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 629.83984 ± 296.280
2025-09-11 22:14:42,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1017.3735, 546.6392, 837.25476, 228.18489, 683.3214, 1014.025, 98.1891, 782.38806, 696.77997, 394.24292]
2025-09-11 22:14:42,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [579.0, 336.0, 1000.0, 168.0, 1000.0, 793.0, 55.0, 408.0, 1000.0, 1000.0]
2025-09-11 22:14:42,871 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 22 hours, 5 minutes, 14 seconds)
2025-09-11 22:27:21,188 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:27:21,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:31:07,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 802.75848 ± 444.748
2025-09-11 22:31:07,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [474.09158, 1152.6656, 567.2469, 163.82835, 1810.1351, 613.13495, 502.57242, 675.6558, 1133.5198, 934.73413]
2025-09-11 22:31:07,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 589.0, 1000.0, 90.0, 949.0, 1000.0, 272.0, 1000.0, 1000.0, 1000.0]
2025-09-11 22:31:07,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (802.76) for latency ExtremeClogL1U23
2025-09-11 22:31:07,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 22 hours, 21 minutes, 53 seconds)
2025-09-11 22:44:39,127 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:44:39,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:47:09,991 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 429.09073 ± 396.262
2025-09-11 22:47:09,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [215.39052, 1454.931, 72.35384, 185.9686, 626.1568, 492.98264, 219.1002, 47.698895, 335.84958, 640.47485]
2025-09-11 22:47:09,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [110.0, 815.0, 74.0, 180.0, 1000.0, 1000.0, 123.0, 28.0, 1000.0, 1000.0]
2025-09-11 22:47:10,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 22 hours, 18 minutes, 13 seconds)
2025-09-11 22:59:18,800 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:59:18,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:01:13,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 475.90289 ± 509.423
2025-09-11 23:01:13,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [307.84592, 1005.1226, 195.70715, 60.958385, 484.39975, 1777.8967, 173.0875, 63.29023, 495.0725, 195.64789]
2025-09-11 23:01:13,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [275.0, 1000.0, 123.0, 33.0, 307.0, 1000.0, 97.0, 34.0, 1000.0, 111.0]
2025-09-11 23:01:13,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 21 hours, 30 minutes, 2 seconds)
2025-09-11 23:14:15,807 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:14:15,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:16:55,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 457.04843 ± 347.233
2025-09-11 23:16:55,262 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [514.0725, 22.348742, 203.63594, 478.1475, 609.0152, 10.398977, 511.91446, 641.8197, 305.85355, 1273.2776]
2025-09-11 23:16:55,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 16.0, 104.0, 256.0, 1000.0, 21.0, 1000.0, 1000.0, 192.0, 1000.0]
2025-09-11 23:16:55,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 20 hours, 51 minutes, 22 seconds)
2025-09-11 23:30:11,799 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:30:11,803 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:32:50,062 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 374.62372 ± 265.933
2025-09-11 23:32:50,063 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [497.5406, 148.40298, 368.70865, 80.69176, 269.8952, 558.0225, 925.8337, 271.46408, 6.4252515, 619.25226]
2025-09-11 23:32:50,063 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 95.0, 1000.0, 84.0, 202.0, 1000.0, 1000.0, 159.0, 13.0, 1000.0]
2025-09-11 23:32:50,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 20 hours, 34 minutes, 17 seconds)
2025-09-11 23:45:36,995 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:45:37,004 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:47:23,621 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 421.62997 ± 275.369
2025-09-11 23:47:23,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [332.38492, 306.6461, 388.4829, 676.263, 124.227974, 186.93384, 454.43506, 1110.8193, 195.34644, 440.7604]
2025-09-11 23:47:23,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [139.0, 160.0, 164.0, 347.0, 85.0, 105.0, 1000.0, 551.0, 132.0, 1000.0]
2025-09-11 23:47:23,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 19 hours, 49 minutes, 51 seconds)
2025-09-12 00:00:07,899 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:00:07,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:02:40,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 693.10876 ± 445.827
2025-09-12 00:02:40,313 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [450.1644, 1274.0966, 415.86703, 355.72293, 1060.0066, 96.733536, 1484.0508, 837.61664, 185.65291, 771.17615]
2025-09-12 00:02:40,313 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [260.0, 1000.0, 185.0, 189.0, 488.0, 50.0, 1000.0, 1000.0, 75.0, 1000.0]
2025-09-12 00:02:40,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 19 hours, 22 minutes, 46 seconds)
2025-09-12 00:16:16,625 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:16:16,635 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:19:39,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 749.29889 ± 471.908
2025-09-12 00:19:39,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [306.1599, 572.5968, 883.3585, 553.77875, 1921.1265, 390.6831, 430.82053, 1196.8237, 389.0034, 848.63763]
2025-09-12 00:19:39,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 254.0, 381.0, 290.0, 996.0, 1000.0, 1000.0, 1000.0, 211.0, 1000.0]
2025-09-12 00:19:39,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 19 hours, 52 minutes, 5 seconds)
2025-09-12 00:32:34,635 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:32:34,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:34:41,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 840.27893 ± 569.421
2025-09-12 00:34:41,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [255.07985, 66.60793, 625.6852, 248.23157, 338.64917, 1226.4064, 1641.4404, 1518.7291, 1441.9888, 1039.9703]
2025-09-12 00:34:41,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [134.0, 57.0, 320.0, 160.0, 165.0, 532.0, 670.0, 1000.0, 1000.0, 411.0]
2025-09-12 00:34:41,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (840.28) for latency ExtremeClogL1U23
2025-09-12 00:34:41,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 19 hours, 26 minutes, 28 seconds)
2025-09-12 00:46:42,923 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:46:42,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:50:30,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 900.79578 ± 579.810
2025-09-12 00:50:30,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1059.2108, 1332.067, 684.64716, 1906.8811, 1614.75, 218.22908, 164.41034, 1195.5929, 323.64767, 508.52112]
2025-09-12 00:50:30,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 133.0, 92.0, 649.0, 1000.0, 1000.0]
2025-09-12 00:50:30,774 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (900.80) for latency ExtremeClogL1U23
2025-09-12 00:50:30,788 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 19 hours, 9 minutes, 38 seconds)
2025-09-12 01:03:47,251 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:03:47,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:05:42,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 659.58228 ± 539.570
2025-09-12 01:05:42,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1257.632, 550.6266, 853.3339, 63.93739, 570.7368, 67.98271, 213.5925, 1884.9979, 758.4834, 374.49973]
2025-09-12 01:05:42,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [648.0, 251.0, 412.0, 68.0, 253.0, 69.0, 121.0, 907.0, 287.0, 1000.0]
2025-09-12 01:05:42,892 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 19 hours, 3 minutes, 29 seconds)
2025-09-12 01:18:38,497 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:18:38,500 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:22:13,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 974.17041 ± 573.364
2025-09-12 01:22:13,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1104.5935, 1986.8646, 442.66064, 1174.6685, 1216.7202, 1691.4053, 442.87753, 37.4372, 554.8129, 1089.6638]
2025-09-12 01:22:13,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 887.0, 1000.0, 465.0, 1000.0, 764.0, 1000.0, 38.0, 341.0, 1000.0]
2025-09-12 01:22:13,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (974.17) for latency ExtremeClogL1U23
2025-09-12 01:22:13,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 19 hours, 5 minutes, 37 seconds)
2025-09-12 01:35:00,744 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:35:00,747 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:36:30,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 518.93866 ± 531.063
2025-09-12 01:36:30,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1085.0005, 423.96457, 398.9132, 311.98935, 108.06792, 1842.7391, 639.6211, 280.04575, 50.44886, 48.596085]
2025-09-12 01:36:30,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [509.0, 185.0, 210.0, 220.0, 70.0, 794.0, 1000.0, 116.0, 34.0, 26.0]
2025-09-12 01:36:30,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 18 hours, 11 minutes, 25 seconds)
2025-09-12 01:50:15,974 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:50:15,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:52:16,463 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 498.65887 ± 337.463
2025-09-12 01:52:16,463 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [39.264645, 200.16237, 183.21686, 866.4581, 634.83795, 1011.9202, 23.780247, 638.3825, 707.2849, 681.2813]
2025-09-12 01:52:16,463 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [26.0, 71.0, 82.0, 1000.0, 1000.0, 1000.0, 21.0, 349.0, 366.0, 288.0]
2025-09-12 01:52:16,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 18 hours, 6 minutes, 14 seconds)
2025-09-12 02:04:34,592 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:04:34,594 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:06:34,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 875.34357 ± 791.785
2025-09-12 02:06:34,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [613.99915, 77.81045, 1642.8278, 112.28359, 2406.9043, 268.33636, 1172.386, 8.543931, 1770.6567, 679.68695]
2025-09-12 02:06:34,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [269.0, 50.0, 1000.0, 49.0, 1000.0, 179.0, 535.0, 14.0, 801.0, 262.0]
2025-09-12 02:06:34,124 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 17 hours, 29 minutes, 34 seconds)
2025-09-12 02:19:16,608 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:19:16,611 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:20:58,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 750.78467 ± 579.287
2025-09-12 02:20:58,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [780.8686, 1142.1348, 25.303509, 1136.6357, 288.54492, 1407.968, 577.2027, 210.05725, 114.383965, 1824.7472]
2025-09-12 02:20:58,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [357.0, 530.0, 34.0, 525.0, 122.0, 662.0, 253.0, 167.0, 102.0, 789.0]
2025-09-12 02:20:58,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 17 hours, 3 minutes, 28 seconds)
2025-09-12 02:34:52,840 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:34:52,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:37:38,502 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1186.94861 ± 874.052
2025-09-12 02:37:38,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2633.0796, 838.0546, 1767.8796, 58.75381, 390.22736, 2535.6191, 935.0202, 515.2568, 452.76334, 1742.832]
2025-09-12 02:37:38,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 421.0, 658.0, 34.0, 158.0, 1000.0, 417.0, 1000.0, 168.0, 1000.0]
2025-09-12 02:37:38,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (1186.95) for latency ExtremeClogL1U23
2025-09-12 02:37:38,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 16 hours, 50 minutes, 31 seconds)
2025-09-12 02:49:56,594 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:49:56,597 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:53:02,157 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1339.67822 ± 719.864
2025-09-12 02:53:02,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1374.066, 1759.3359, 504.64417, 104.89768, 892.1507, 1070.2734, 1257.756, 2441.9934, 2457.915, 1533.7499]
2025-09-12 02:53:02,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [516.0, 635.0, 278.0, 81.0, 1000.0, 463.0, 491.0, 1000.0, 989.0, 1000.0]
2025-09-12 02:53:02,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (1339.68) for latency ExtremeClogL1U23
2025-09-12 02:53:02,230 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 16 hours, 50 minutes, 6 seconds)
2025-09-12 03:05:47,569 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:05:47,578 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:08:13,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 970.94763 ± 634.962
2025-09-12 03:08:13,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2188.871, 661.6502, 380.77695, 1069.5892, 387.50583, 829.4233, 401.6214, 1877.3036, 414.02902, 1498.7064]
2025-09-12 03:08:13,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 277.0, 1000.0, 382.0, 191.0, 391.0, 178.0, 819.0, 151.0, 727.0]
2025-09-12 03:08:13,312 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 16 hours, 27 minutes, 18 seconds)
2025-09-12 03:21:50,491 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:21:50,494 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:25:04,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 925.16174 ± 669.441
2025-09-12 03:25:04,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1307.649, 299.17978, 1404.2847, 376.85938, 1044.813, 2431.4834, 226.37672, 242.96802, 1253.8491, 664.15436]
2025-09-12 03:25:04,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [503.0, 1000.0, 611.0, 152.0, 417.0, 1000.0, 109.0, 1000.0, 1000.0, 1000.0]
2025-09-12 03:25:04,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 16 hours, 44 minutes, 53 seconds)
2025-09-12 03:37:49,849 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:37:49,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:39:42,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 681.16626 ± 465.986
2025-09-12 03:39:42,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1359.2911, 1229.1876, 1308.9419, 843.64056, 524.4875, 614.3779, 51.123764, 531.3653, 53.844326, 295.40213]
2025-09-12 03:39:42,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 457.0, 498.0, 326.0, 200.0, 270.0, 41.0, 1000.0, 36.0, 129.0]
2025-09-12 03:39:42,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 16 hours, 32 minutes, 4 seconds)
2025-09-12 03:52:20,288 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:52:20,293 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:56:12,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1762.58630 ± 721.690
2025-09-12 03:56:12,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2164.055, 846.082, 1435.4338, 1632.7112, 2353.0925, 283.33975, 2539.5044, 2638.9873, 2143.1362, 1589.5206]
2025-09-12 03:56:12,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 398.0, 1000.0, 1000.0, 1000.0, 124.0, 1000.0, 1000.0, 883.0, 689.0]
2025-09-12 03:56:12,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (1762.59) for latency ExtremeClogL1U23
2025-09-12 03:56:12,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 16 hours, 14 minutes, 16 seconds)
2025-09-12 04:09:03,015 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:09:03,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:11:33,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1104.47424 ± 994.627
2025-09-12 04:11:33,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [146.70638, 532.2872, 11.186676, 2172.4004, 1143.2123, 1850.2373, 2360.5845, 2591.1746, 223.89963, 13.053549]
2025-09-12 04:11:33,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [92.0, 1000.0, 23.0, 1000.0, 426.0, 731.0, 955.0, 1000.0, 89.0, 24.0]
2025-09-12 04:11:33,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 15 hours, 58 minutes, 1 second)
2025-09-12 04:25:00,246 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:25:00,250 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:28:23,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1421.03528 ± 686.063
2025-09-12 04:28:23,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1357.8324, 431.8991, 913.9473, 1244.8872, 2325.2302, 306.1239, 2224.315, 1896.1023, 2130.281, 1379.734]
2025-09-12 04:28:23,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [620.0, 1000.0, 343.0, 505.0, 1000.0, 154.0, 1000.0, 723.0, 844.0, 1000.0]
2025-09-12 04:28:23,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 16 hours, 2 minutes, 4 seconds)
2025-09-12 04:40:10,617 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:40:10,621 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:42:32,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 662.30920 ± 442.393
2025-09-12 04:42:32,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [664.8401, 362.618, 426.59995, 1285.3693, 329.77185, 593.69275, 1707.3203, 272.0703, 441.39664, 539.4132]
2025-09-12 04:42:32,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [246.0, 1000.0, 174.0, 461.0, 150.0, 1000.0, 660.0, 116.0, 159.0, 1000.0]
2025-09-12 04:42:32,642 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 15 hours, 14 minutes, 7 seconds)
2025-09-12 04:56:15,039 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:56:15,044 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:58:56,456 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1019.42596 ± 687.009
2025-09-12 04:58:56,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2340.7178, 1863.5525, 794.0672, 186.39816, 1264.6382, 697.6625, 259.8903, 497.09192, 1630.7853, 659.45605]
2025-09-12 04:58:56,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 749.0, 1000.0, 103.0, 436.0, 300.0, 1000.0, 180.0, 586.0, 346.0]
2025-09-12 04:58:56,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 15 hours, 19 minutes, 7 seconds)
2025-09-12 05:11:22,816 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:11:22,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:14:15,014 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1256.14417 ± 779.455
2025-09-12 05:14:15,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1227.1858, 446.73138, 1879.519, 1131.699, 379.20502, 2305.7854, 269.52365, 956.2594, 1266.752, 2698.7817]
2025-09-12 05:14:15,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 195.0, 1000.0, 457.0, 138.0, 755.0, 133.0, 1000.0, 394.0, 1000.0]
2025-09-12 05:14:15,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 14 hours, 49 minutes, 38 seconds)
2025-09-12 05:27:02,155 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:27:02,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:30:28,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1690.73730 ± 777.125
2025-09-12 05:30:28,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1391.3187, 2539.4253, 2729.036, 2305.9373, 1804.265, 692.7186, 2566.9214, 1083.3352, 1326.0671, 468.35056]
2025-09-12 05:30:28,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [551.0, 1000.0, 1000.0, 1000.0, 754.0, 274.0, 1000.0, 420.0, 1000.0, 197.0]
2025-09-12 05:30:28,601 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 14 hours, 43 minutes, 49 seconds)
2025-09-12 05:43:33,913 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:43:33,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:45:58,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1003.51477 ± 775.518
2025-09-12 05:45:58,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1333.3926, 257.2215, 423.26724, 1189.3925, 1565.1511, 903.52515, 2831.7732, 1129.8196, 249.11504, 152.48943]
2025-09-12 05:45:58,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [496.0, 98.0, 182.0, 458.0, 614.0, 1000.0, 1000.0, 1000.0, 126.0, 86.0]
2025-09-12 05:45:58,910 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 14 hours, 13 minutes, 27 seconds)
2025-09-12 05:58:35,817 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:58:35,819 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:00:28,741 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 751.56287 ± 743.668
2025-09-12 06:00:28,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [43.231663, 144.80948, 638.54803, 896.2834, 130.6744, 1097.2109, 383.0985, 211.70961, 1414.1744, 2555.889]
2025-09-12 06:00:28,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 69.0, 1000.0, 448.0, 59.0, 345.0, 144.0, 84.0, 1000.0, 791.0]
2025-09-12 06:00:28,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 14 hours, 1 minute, 42 seconds)
2025-09-12 06:13:41,555 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:13:41,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:17:21,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1776.27380 ± 833.026
2025-09-12 06:17:21,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2494.1692, 1607.0702, 2094.384, 1361.5934, 2436.5466, 2593.9902, 708.8202, 139.69292, 1492.3372, 2834.133]
2025-09-12 06:17:21,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 726.0, 873.0, 487.0, 1000.0, 1000.0, 1000.0, 55.0, 662.0, 1000.0]
2025-09-12 06:17:21,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (1776.27) for latency ExtremeClogL1U23
2025-09-12 06:17:21,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 13 hours, 51 minutes, 14 seconds)
2025-09-12 06:29:58,532 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:29:58,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:32:49,054 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1272.49927 ± 843.745
2025-09-12 06:32:49,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [330.71344, 2552.4265, 2902.6882, 751.48206, 366.09586, 1251.0815, 1403.003, 882.6791, 576.7065, 1708.1174]
2025-09-12 06:32:49,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 242.0, 142.0, 486.0, 1000.0, 350.0, 204.0, 624.0]
2025-09-12 06:32:49,063 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 13 hours, 37 minutes, 5 seconds)
2025-09-12 06:45:27,439 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:45:27,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:47:35,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 974.78625 ± 711.821
2025-09-12 06:47:35,470 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [972.61224, 527.2753, 436.25308, 1120.1752, 226.5264, 1985.4083, 1412.5737, 2353.5217, 114.1441, 599.3725]
2025-09-12 06:47:35,470 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [358.0, 1000.0, 247.0, 401.0, 87.0, 717.0, 515.0, 1000.0, 46.0, 274.0]
2025-09-12 06:47:35,516 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 13 hours, 6 minutes, 34 seconds)
2025-09-12 07:00:14,159 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:00:14,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:04:17,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2132.90723 ± 531.932
2025-09-12 07:04:17,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1163.5177, 2607.8108, 2429.47, 2202.322, 1277.1713, 2799.6619, 2434.022, 2355.4644, 1688.3513, 2371.2793]
2025-09-12 07:04:17,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [471.0, 1000.0, 1000.0, 924.0, 517.0, 1000.0, 1000.0, 1000.0, 801.0, 1000.0]
2025-09-12 07:04:17,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (2132.91) for latency ExtremeClogL1U23
2025-09-12 07:04:17,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 13 hours, 3 minutes, 3 seconds)
2025-09-12 07:17:13,408 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:17:13,411 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:20:24,530 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1525.56616 ± 912.438
2025-09-12 07:20:24,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2500.2295, 1492.8016, 2575.5886, 681.8888, 2112.8613, 596.0374, 937.37274, 3083.774, 638.35474, 636.7534]
2025-09-12 07:20:24,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 939.0, 270.0, 797.0, 284.0, 1000.0, 1000.0, 267.0, 300.0]
2025-09-12 07:20:24,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 13 hours, 3 minutes, 18 seconds)
2025-09-12 07:33:05,238 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:33:05,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:35:10,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 867.22705 ± 544.250
2025-09-12 07:35:10,751 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [682.4538, 1066.0562, 275.7312, 1417.827, 1284.3906, 318.49796, 1867.6544, 290.67612, 1177.0652, 291.91788]
2025-09-12 07:35:10,751 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [307.0, 325.0, 108.0, 1000.0, 1000.0, 124.0, 1000.0, 96.0, 434.0, 182.0]
2025-09-12 07:35:10,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 12 hours, 27 minutes, 3 seconds)
2025-09-12 07:47:26,530 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:47:26,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:51:04,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1482.89270 ± 1024.173
2025-09-12 07:51:04,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2754.881, 1485.3475, 507.0484, 279.35507, 2834.0383, 2406.1367, 1482.5941, 393.06253, 2475.894, 210.5692]
2025-09-12 07:51:04,410 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 114.0, 1000.0, 898.0, 628.0, 158.0, 1000.0, 1000.0]
2025-09-12 07:51:04,431 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 12 hours, 15 minutes, 36 seconds)
2025-09-12 08:04:37,853 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:04:37,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:07:26,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1528.35156 ± 1075.338
2025-09-12 08:07:26,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2967.9893, 756.7782, 544.40356, 541.15173, 201.58188, 1935.162, 2953.4822, 1025.6948, 3152.7288, 1204.5428]
2025-09-12 08:07:26,473 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 276.0, 190.0, 1000.0, 89.0, 629.0, 1000.0, 431.0, 1000.0, 475.0]
2025-09-12 08:07:26,526 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 12 hours, 14 minutes, 37 seconds)
2025-09-12 08:20:06,353 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:20:06,356 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:22:42,571 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1413.83130 ± 906.400
2025-09-12 08:22:42,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2222.5, 1428.1561, 567.76294, 805.2087, 2983.401, 2764.2397, 1186.0236, 1415.5579, 312.6673, 452.79605]
2025-09-12 08:22:42,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 254.0, 247.0, 1000.0, 966.0, 447.0, 476.0, 107.0, 182.0]
2025-09-12 08:22:42,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 11 hours, 45 minutes, 48 seconds)
2025-09-12 08:34:35,688 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:34:35,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:37:46,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1413.67847 ± 1056.111
2025-09-12 08:37:46,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2133.2207, 1195.3079, 2835.9827, 2645.897, 554.8345, 2837.7822, 56.97041, 1179.8698, 107.020584, 589.8985]
2025-09-12 08:37:46,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [737.0, 438.0, 1000.0, 1000.0, 1000.0, 1000.0, 62.0, 531.0, 54.0, 1000.0]
2025-09-12 08:37:46,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 20 minutes, 46 seconds)
2025-09-12 08:50:40,508 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:50:40,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:53:39,463 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1326.04065 ± 753.079
2025-09-12 08:53:39,469 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2859.5867, 587.4051, 586.65533, 561.05444, 1223.7837, 2138.2673, 769.17725, 1009.5614, 1461.8799, 2063.0361]
2025-09-12 08:53:39,469 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [984.0, 229.0, 194.0, 220.0, 401.0, 733.0, 1000.0, 1000.0, 1000.0, 648.0]
2025-09-12 08:53:39,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 11 hours, 14 minutes, 54 seconds)
2025-09-12 09:07:02,521 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:07:02,525 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:09:57,206 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1281.55481 ± 956.275
2025-09-12 09:09:57,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [95.33057, 632.9815, 2785.6655, 1592.9304, 1759.4844, 2729.9758, 212.14615, 1626.0844, 1269.3188, 111.631584]
2025-09-12 09:09:57,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [48.0, 1000.0, 984.0, 646.0, 569.0, 1000.0, 1000.0, 531.0, 493.0, 54.0]
2025-09-12 09:09:57,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 11 hours, 2 minutes, 35 seconds)
2025-09-12 09:21:34,838 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:21:34,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:23:31,965 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1310.89941 ± 1251.404
2025-09-12 09:23:31,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [113.206764, 2997.2996, 58.03462, 1646.5127, 88.919495, 2395.5725, 3270.5483, 370.94818, 34.471966, 2133.481]
2025-09-12 09:23:31,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [48.0, 1000.0, 30.0, 574.0, 48.0, 771.0, 1000.0, 129.0, 25.0, 634.0]
2025-09-12 09:23:32,016 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 10 hours, 23 minutes, 57 seconds)
2025-09-12 09:36:21,807 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:36:21,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:39:38,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1286.62500 ± 1049.348
2025-09-12 09:39:38,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [849.92255, 3016.0273, 1210.5325, 259.38763, 1751.0919, 230.76117, 2275.3135, 20.497871, 480.32892, 2772.3867]
2025-09-12 09:39:38,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [314.0, 1000.0, 1000.0, 1000.0, 742.0, 97.0, 830.0, 23.0, 1000.0, 1000.0]
2025-09-12 09:39:38,476 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 15 minutes, 26 seconds)
2025-09-12 09:52:08,575 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:52:08,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:54:23,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1401.92798 ± 1044.079
2025-09-12 09:54:23,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [473.1109, 2500.828, 2020.0862, 1039.0906, 244.67654, 123.03926, 2753.722, 965.97845, 782.0324, 3116.716]
2025-09-12 09:54:23,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [168.0, 878.0, 730.0, 415.0, 81.0, 54.0, 1000.0, 303.0, 261.0, 1000.0]
2025-09-12 09:54:23,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 9 hours, 57 minutes, 41 seconds)
2025-09-12 10:07:05,327 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:07:05,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:08:42,432 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 996.44708 ± 858.658
2025-09-12 10:08:42,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2285.9824, 199.98695, 743.5337, 1135.0001, 389.8837, 852.04944, 721.79425, 4.975663, 2865.1985, 766.06494]
2025-09-12 10:08:42,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [776.0, 72.0, 311.0, 362.0, 146.0, 291.0, 246.0, 21.0, 1000.0, 255.0]
2025-09-12 10:08:42,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 9 hours, 30 minutes, 22 seconds)
2025-09-12 10:21:42,251 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:21:42,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:24:01,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 939.04211 ± 818.326
2025-09-12 10:24:01,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1431.589, 165.76587, 77.75307, 1757.1104, 2793.785, 350.86942, 960.0156, 1085.7957, 485.2082, 282.52902]
2025-09-12 10:24:01,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 85.0, 1000.0, 703.0, 1000.0, 131.0, 362.0, 323.0, 234.0, 112.0]
2025-09-12 10:24:01,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 8 minutes, 6 seconds)
2025-09-12 10:37:22,143 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:37:22,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:40:51,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1756.20154 ± 1185.248
2025-09-12 10:40:51,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [228.67476, 2725.4294, 118.57883, 1035.191, 2916.6191, 3143.242, 1468.3466, 2796.8708, 2802.262, 326.80035]
2025-09-12 10:40:51,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [91.0, 1000.0, 1000.0, 363.0, 1000.0, 980.0, 1000.0, 1000.0, 1000.0, 122.0]
2025-09-12 10:40:51,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 16 minutes, 43 seconds)
2025-09-12 10:53:18,511 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:53:18,515 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:55:58,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1658.98596 ± 1071.254
2025-09-12 10:55:58,830 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [231.29996, 2970.3052, 2633.6023, 1342.2076, 1359.4236, 2056.5854, 164.3765, 323.20212, 2724.568, 2784.2888]
2025-09-12 10:55:58,830 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [99.0, 1000.0, 945.0, 446.0, 425.0, 613.0, 73.0, 134.0, 1000.0, 1000.0]
2025-09-12 10:55:58,841 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 8 hours, 54 minutes, 22 seconds)
2025-09-12 11:09:00,109 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:09:00,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:12:04,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1626.64429 ± 781.846
2025-09-12 11:12:04,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2931.572, 912.1544, 1977.4203, 1601.4419, 2399.1306, 1426.6168, 807.8427, 844.30536, 721.1476, 2644.8118]
2025-09-12 11:12:04,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 316.0, 645.0, 652.0, 1000.0, 482.0, 243.0, 1000.0, 253.0, 1000.0]
2025-09-12 11:12:04,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 8 hours, 48 minutes, 13 seconds)
2025-09-12 11:23:35,152 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:23:35,154 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:27:03,571 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1434.84644 ± 988.564
2025-09-12 11:27:03,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2114.4849, 1029.438, 2699.4517, 1516.0629, 119.6785, 724.1503, 2722.9746, 2643.2634, 372.0378, 406.9216]
2025-09-12 11:27:03,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [733.0, 416.0, 1000.0, 483.0, 44.0, 1000.0, 1000.0, 892.0, 1000.0, 1000.0]
2025-09-12 11:27:03,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 37 minutes, 7 seconds)
2025-09-12 11:39:48,766 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:39:48,769 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:43:09,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2097.29810 ± 996.450
2025-09-12 11:43:09,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2849.6348, 1179.5353, 39.48102, 2312.0405, 2909.6665, 3323.9902, 2042.5205, 3254.968, 1318.6556, 1742.4875]
2025-09-12 11:43:09,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 427.0, 26.0, 819.0, 1000.0, 1000.0, 835.0, 1000.0, 414.0, 649.0]
2025-09-12 11:43:09,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 26 minutes, 28 seconds)
2025-09-12 11:56:03,220 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:56:03,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:58:30,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1053.51636 ± 745.330
2025-09-12 11:58:30,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1280.3318, 886.4949, 1757.7219, 651.9311, 892.8431, 384.2666, 819.59717, 954.2152, 46.795403, 2860.9678]
2025-09-12 11:58:30,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [505.0, 382.0, 576.0, 1000.0, 1000.0, 138.0, 323.0, 325.0, 30.0, 1000.0]
2025-09-12 11:58:30,589 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 8 hours, 1 minute, 27 seconds)
2025-09-12 12:11:36,312 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:11:36,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:14:06,621 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1290.83618 ± 1171.242
2025-09-12 12:14:06,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [331.37805, 415.6354, 286.29526, 2676.9478, 2154.3633, 64.056526, 1095.5898, 73.12149, 3145.6165, 2665.3572]
2025-09-12 12:14:06,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [106.0, 147.0, 1000.0, 1000.0, 734.0, 43.0, 379.0, 41.0, 1000.0, 924.0]
2025-09-12 12:14:06,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 48 minutes, 46 seconds)
2025-09-12 12:26:44,619 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:26:44,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:29:13,506 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1326.99878 ± 959.677
2025-09-12 12:29:13,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1282.2324, 34.194828, 942.23975, 2762.151, 2968.102, 709.68445, 325.535, 787.5901, 2293.5508, 1164.7068]
2025-09-12 12:29:13,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [411.0, 28.0, 306.0, 877.0, 1000.0, 257.0, 112.0, 1000.0, 1000.0, 391.0]
2025-09-12 12:29:13,514 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 7 hours, 27 minutes, 26 seconds)
2025-09-12 12:41:45,702 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:41:45,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:43:18,583 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1029.70227 ± 1008.217
2025-09-12 12:43:18,584 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1679.5039, 989.5422, 2439.1062, 82.42928, 363.06058, 246.59814, 581.7907, 3196.9104, 321.92734, 396.15335]
2025-09-12 12:43:18,584 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [559.0, 307.0, 709.0, 47.0, 163.0, 162.0, 212.0, 1000.0, 113.0, 121.0]
2025-09-12 12:43:18,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 7 minutes)
2025-09-12 12:55:40,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:55:40,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:58:42,740 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1272.10767 ± 1024.261
2025-09-12 12:58:42,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [792.5473, 416.76492, 343.03638, 2771.8523, 238.45296, 1416.0681, 2795.6792, 2456.9094, 21.357164, 1468.4081]
2025-09-12 12:58:42,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 154.0, 225.0, 933.0, 157.0, 1000.0, 1000.0, 1000.0, 29.0, 1000.0]
2025-09-12 12:58:42,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 47 minutes, 59 seconds)
2025-09-12 13:11:36,962 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:11:36,965 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:15:01,573 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1840.12915 ± 1039.993
2025-09-12 13:15:01,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1122.1641, 2996.984, 517.7446, 1358.4235, 1401.3824, 3080.9028, 3100.8567, 2947.1858, 303.0071, 1572.6383]
2025-09-12 13:15:01,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [346.0, 1000.0, 1000.0, 1000.0, 524.0, 1000.0, 1000.0, 1000.0, 102.0, 493.0]
2025-09-12 13:15:01,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 37 minutes, 53 seconds)
2025-09-12 13:28:16,295 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:28:16,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:30:28,554 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1343.07483 ± 1066.142
2025-09-12 13:30:28,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [185.79243, 20.967651, 1837.6757, 3212.5781, 991.8459, 2289.323, 2073.9338, 2195.538, 568.4105, 54.68401]
2025-09-12 13:30:28,556 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [79.0, 24.0, 661.0, 1000.0, 337.0, 761.0, 764.0, 811.0, 278.0, 26.0]
2025-09-12 13:30:28,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 21 minutes, 49 seconds)
2025-09-12 13:42:51,814 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:42:51,818 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:45:35,550 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1207.50854 ± 945.216
2025-09-12 13:45:35,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [265.89062, 1014.9187, 2759.5117, 709.65265, 1394.7357, 540.6335, 1718.571, -4.41206, 753.86676, 2921.717]
2025-09-12 13:45:35,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [86.0, 358.0, 1000.0, 237.0, 434.0, 185.0, 551.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:45:35,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 6 minutes, 33 seconds)
2025-09-12 13:57:54,747 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:57:54,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:00:46,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1878.85095 ± 1093.915
2025-09-12 14:00:46,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [807.36176, 2678.8188, 2556.515, 1637.5154, 3024.7185, 130.23729, 3108.3086, 527.1678, 1211.6128, 3106.2544]
2025-09-12 14:00:46,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [243.0, 827.0, 852.0, 551.0, 1000.0, 56.0, 1000.0, 166.0, 452.0, 1000.0]
2025-09-12 14:00:46,842 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 56 minutes, 21 seconds)
2025-09-12 14:14:15,909 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:14:15,913 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:18:22,929 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2780.25146 ± 761.753
2025-09-12 14:18:22,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [3449.1636, 3501.1702, 2484.4248, 3103.034, 3131.7542, 1014.3294, 2943.9177, 1798.4882, 2990.099, 3386.135]
2025-09-12 14:18:22,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 381.0, 1000.0, 611.0, 1000.0, 1000.0]
2025-09-12 14:18:22,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1226 [INFO]: New best (2780.25) for latency ExtremeClogL1U23
2025-09-12 14:18:22,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 50 minutes, 32 seconds)
2025-09-12 14:30:12,789 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:30:12,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:32:51,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1276.08203 ± 931.770
2025-09-12 14:32:51,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1331.7758, 828.8258, 1062.1964, 3090.4985, 463.61435, 559.2761, 2810.544, 10.09429, 1237.1827, 1366.8126]
2025-09-12 14:32:51,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [520.0, 240.0, 329.0, 1000.0, 1000.0, 210.0, 1000.0, 14.0, 1000.0, 413.0]
2025-09-12 14:32:51,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 26 minutes, 52 seconds)
2025-09-12 14:45:29,572 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:45:29,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:47:43,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1102.57153 ± 1049.967
2025-09-12 14:47:43,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [40.356674, 764.24005, 1341.9097, 2758.0928, -27.39218, 142.19937, 1973.0802, 2915.351, 832.19666, 285.68057]
2025-09-12 14:47:43,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [27.0, 268.0, 395.0, 1000.0, 1000.0, 67.0, 599.0, 1000.0, 288.0, 133.0]
2025-09-12 14:47:43,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 8 minutes, 59 seconds)
2025-09-12 15:01:28,278 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:01:28,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:04:16,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 986.75421 ± 908.740
2025-09-12 15:04:16,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [192.21791, 259.48953, 755.1647, 39.795967, 2856.1094, 2521.922, 638.14325, 1088.3256, 931.3413, 585.03217]
2025-09-12 15:04:16,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 330.0, 39.0, 1000.0, 878.0, 211.0, 1000.0, 305.0, 213.0]
2025-09-12 15:04:16,475 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 58 minutes, 59 seconds)
2025-09-12 15:15:48,320 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:15:48,324 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:19:27,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2340.56299 ± 837.016
2025-09-12 15:19:27,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [3154.3499, 2891.09, 3050.4602, 2556.2483, 2817.5498, 2099.6064, 2275.3418, 2893.2622, 1254.9092, 412.81537]
2025-09-12 15:19:27,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [983.0, 1000.0, 1000.0, 1000.0, 1000.0, 693.0, 696.0, 1000.0, 384.0, 141.0]
2025-09-12 15:19:27,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 43 minutes, 13 seconds)
2025-09-12 15:32:20,801 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:32:20,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:35:21,914 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1653.46509 ± 848.875
2025-09-12 15:35:21,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2329.3167, 2974.0894, 1646.3813, 1581.069, 1204.8302, 1993.0485, 550.2218, 2476.665, 1789.8301, -10.801953]
2025-09-12 15:35:21,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [801.0, 1000.0, 582.0, 510.0, 384.0, 599.0, 187.0, 888.0, 564.0, 1000.0]
2025-09-12 15:35:21,932 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 21 minutes, 44 seconds)
2025-09-12 15:48:08,093 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:48:08,097 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:50:11,413 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 992.39539 ± 940.215
2025-09-12 15:50:11,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1722.2719, 76.663925, 97.83677, 2752.8857, 191.24767, 485.21304, 1562.7931, 166.29347, 2231.757, 636.99133]
2025-09-12 15:50:11,414 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [556.0, 65.0, 55.0, 992.0, 76.0, 182.0, 529.0, 1000.0, 739.0, 198.0]
2025-09-12 15:50:11,424 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 7 minutes, 28 seconds)
2025-09-12 16:03:30,719 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:03:30,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:06:34,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1755.02930 ± 1110.758
2025-09-12 16:06:34,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [601.2472, 466.5619, 55.35206, 2500.529, 3144.4338, 1172.9982, 1911.7053, 1527.1407, 3178.9536, 2991.3713]
2025-09-12 16:06:34,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 209.0, 30.0, 801.0, 1000.0, 444.0, 607.0, 514.0, 1000.0, 1000.0]
2025-09-12 16:06:34,045 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 56 minutes, 31 seconds)
2025-09-12 16:18:38,044 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:18:38,048 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:21:33,937 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1482.93152 ± 1090.741
2025-09-12 16:21:33,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [345.77652, 3040.2935, 667.62335, 430.4668, 216.94008, 2884.4836, 605.01184, 1797.3016, 2745.318, 2096.0996]
2025-09-12 16:21:33,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [148.0, 1000.0, 250.0, 216.0, 90.0, 1000.0, 1000.0, 1000.0, 902.0, 718.0]
2025-09-12 16:21:33,960 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 36 minutes, 24 seconds)
2025-09-12 16:34:23,479 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:34:23,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:38:30,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2588.28564 ± 724.942
2025-09-12 16:38:30,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2519.6511, 2712.2297, 2411.0454, 2690.8916, 2896.2861, 2971.4058, 2855.0547, 533.7821, 3010.3398, 3282.1687]
2025-09-12 16:38:30,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 754.0, 1000.0, 1000.0, 1000.0, 1000.0, 225.0, 1000.0, 1000.0]
2025-09-12 16:38:30,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 25 minutes, 31 seconds)
2025-09-12 16:50:42,910 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:50:42,915 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:53:46,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1328.36353 ± 1189.286
2025-09-12 16:53:46,072 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2927.5977, 705.42303, 346.39032, 421.6238, 2847.3232, 373.66602, 348.89584, 2056.0847, 3132.3745, 124.25665]
2025-09-12 16:53:46,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 259.0, 1000.0, 139.0, 1000.0, 1000.0, 115.0, 1000.0, 1000.0, 46.0]
2025-09-12 16:53:46,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 8 minutes, 9 seconds)
2025-09-12 17:06:47,053 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:06:47,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:09:26,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1501.68665 ± 1045.478
2025-09-12 17:09:26,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [3025.7126, 2066.0962, 1362.613, 9.95581, 1144.1068, 1314.2852, 38.529816, 2554.6262, 2870.3262, 630.61444]
2025-09-12 17:09:26,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 655.0, 427.0, 18.0, 324.0, 430.0, 31.0, 803.0, 1000.0, 1000.0]
2025-09-12 17:09:26,421 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 54 minutes, 20 seconds)
2025-09-12 17:21:53,534 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:21:53,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:25:20,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2050.10181 ± 1056.227
2025-09-12 17:25:20,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1857.4479, 2900.9192, 579.0333, 1718.3065, 2919.391, 31.563478, 3342.5876, 2830.749, 1458.1754, 2862.843]
2025-09-12 17:25:20,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [633.0, 1000.0, 218.0, 1000.0, 1000.0, 25.0, 1000.0, 998.0, 549.0, 1000.0]
2025-09-12 17:25:20,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 37 minutes, 32 seconds)
2025-09-12 17:38:03,059 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:38:03,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:41:22,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2038.07812 ± 976.460
2025-09-12 17:41:22,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2400.2627, 409.7173, 229.05696, 2767.8704, 2691.4294, 2340.3948, 2329.447, 3085.6816, 2852.4229, 1274.498]
2025-09-12 17:41:22,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 142.0, 91.0, 1000.0, 1000.0, 711.0, 742.0, 1000.0, 1000.0, 413.0]
2025-09-12 17:41:22,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 23 minutes, 39 seconds)
2025-09-12 17:54:43,749 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:54:43,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:57:38,270 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1596.97571 ± 1101.204
2025-09-12 17:57:38,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [1590.888, 3006.4106, 81.9289, 929.82764, 3031.772, 2439.5376, 285.78683, 2448.7468, 170.13132, 1984.7275]
2025-09-12 17:57:38,277 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [481.0, 1000.0, 1000.0, 443.0, 1000.0, 856.0, 97.0, 769.0, 66.0, 643.0]
2025-09-12 17:57:38,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 6 minutes, 36 seconds)
2025-09-12 18:09:22,876 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:09:22,880 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:12:58,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2007.19373 ± 1090.996
2025-09-12 18:12:58,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [3121.0444, 2251.464, 3447.8306, 1591.5963, 2894.4556, 633.37915, 2874.6812, 115.84314, 874.7407, 2266.903]
2025-09-12 18:12:58,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 607.0, 1000.0, 277.0, 902.0, 1000.0, 343.0, 728.0]
2025-09-12 18:12:58,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 50 minutes, 53 seconds)
2025-09-12 18:26:23,723 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:26:23,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:30:19,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2460.64014 ± 740.038
2025-09-12 18:30:19,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2376.7534, 2249.504, 2956.123, 1779.1793, 604.3039, 2883.252, 2921.9375, 2969.4207, 3226.765, 2639.1628]
2025-09-12 18:30:19,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [743.0, 831.0, 1000.0, 653.0, 201.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:30:19,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 37 minutes, 4 seconds)
2025-09-12 18:43:11,182 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:43:11,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:47:19,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2337.31372 ± 1048.628
2025-09-12 18:47:19,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [3226.8875, 2790.4983, 2609.926, 52.76435, 3162.5415, 2920.9226, 2562.2283, 3075.922, 2390.671, 580.77405]
2025-09-12 18:47:19,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [974.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 909.0, 1000.0, 773.0, 245.0]
2025-09-12 18:47:19,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 21 minutes, 59 seconds)
2025-09-12 18:59:35,491 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:59:35,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:03:35,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 1952.73083 ± 1204.017
2025-09-12 19:03:35,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2872.5493, 3129.677, 1912.6401, 132.11253, 317.66476, 3200.0557, 367.81006, 1650.754, 3111.1367, 2832.9092]
2025-09-12 19:03:35,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 128.0, 464.0, 1000.0, 1000.0]
2025-09-12 19:03:35,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 5 minutes, 46 seconds)
2025-09-12 19:16:02,825 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:16:02,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:19:49,849 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2309.58472 ± 928.103
2025-09-12 19:19:49,850 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2847.3765, 2966.1714, 764.1247, 2159.0366, 3079.0276, 3001.4426, 793.11206, 1378.5415, 2790.6152, 3316.4001]
2025-09-12 19:19:49,850 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 959.0, 258.0, 698.0, 1000.0, 942.0, 1000.0, 457.0, 1000.0, 1000.0]
2025-09-12 19:19:49,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 49 minutes, 18 seconds)
2025-09-12 19:32:48,991 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:32:48,996 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:36:05,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 2123.88574 ± 1179.904
2025-09-12 19:36:05,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2839.106, 3369.8677, 765.99115, 3169.2175, 3020.181, 1556.7526, 139.1642, 2876.63, 2978.2224, 523.72394]
2025-09-12 19:36:05,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 207.0, 1000.0, 1000.0, 595.0, 85.0, 1000.0, 1000.0, 215.0]
2025-09-12 19:36:05,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 33 minutes, 14 seconds)
2025-09-12 19:48:15,600 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:48:15,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:50:42,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 976.53259 ± 795.148
2025-09-12 19:50:42,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [2172.9683, 289.0207, 1190.8464, 1460.6805, 304.3051, 199.14732, 756.22955, 369.02365, 488.8537, 2534.2515]
2025-09-12 19:50:42,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 107.0, 363.0, 501.0, 121.0, 1000.0, 241.0, 99.0, 1000.0, 830.0]
2025-09-12 19:50:42,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 16 minutes, 4 seconds)
2025-09-12 20:03:18,406 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:03:18,408 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:04:52,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1221 [DEBUG]: Total Reward: 729.34473 ± 995.813
2025-09-12 20:04:52,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1222 [DEBUG]: All rewards: [610.54736, 485.9749, 39.90275, 2355.2795, 32.36582, 37.627773, 168.9499, 245.43733, 2978.5864, 338.77536]
2025-09-12 20:04:52,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1223 [DEBUG]: All trajectory lengths: [201.0, 198.0, 28.0, 700.0, 27.0, 32.0, 56.0, 1000.0, 1000.0, 132.0]
2025-09-12 20:04:52,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc10-ant):1251 [DEBUG]: Training session finished
