2025-09-11 18:14:43,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc15-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:14:43,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc15-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:14:43,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x14b5b411e1d0>}
2025-09-11 18:14:43,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1111 [DEBUG]: using device: cuda
2025-09-11 18:14:43,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1133 [INFO]: Creating new trainer
2025-09-11 18:14:43,989 baseline-mbpac-noiseperc15-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 18:14:43,989 baseline-mbpac-noiseperc15-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:14:43,999 baseline-mbpac-noiseperc15-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 18:14:44,909 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1194 [DEBUG]: Starting training session...
2025-09-11 18:14:44,909 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:27:06,415 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:27:06,420 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:28:18,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: -119.05550 ± 197.673
2025-09-11 18:28:18,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [-5.090441, -26.442152, 3.4523664, -26.834337, -531.79114, -31.761364, -19.470991, -10.444204, -494.42075, -47.752106]
2025-09-11 18:28:18,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [59.0, 20.0, 81.0, 100.0, 1000.0, 102.0, 79.0, 32.0, 1000.0, 41.0]
2025-09-11 18:28:18,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (-119.06) for latency ExtremeClogL1U23
2025-09-11 18:28:18,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 22 hours, 21 minutes, 57 seconds)
2025-09-11 18:41:05,175 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:41:05,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:41:56,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: -17.63563 ± 56.654
2025-09-11 18:41:56,020 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [17.026442, -179.4349, -1.9362022, -18.446802, -0.40631703, -10.690914, 7.6685953, 15.355571, -33.985626, 28.493887]
2025-09-11 18:41:56,020 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [42.0, 1000.0, 48.0, 46.0, 84.0, 26.0, 46.0, 30.0, 188.0, 258.0]
2025-09-11 18:41:56,020 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (-17.64) for latency ExtremeClogL1U23
2025-09-11 18:41:56,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 22 hours, 12 minutes, 4 seconds)
2025-09-11 18:54:06,957 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:54:06,961 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:55:10,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 0.77911 ± 41.251
2025-09-11 18:55:10,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [-16.30295, -25.69537, 26.920923, 12.956072, 15.386446, 25.925596, 54.412586, -105.28043, 4.23301, 15.23521]
2025-09-11 18:55:10,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [135.0, 202.0, 295.0, 29.0, 118.0, 141.0, 275.0, 1000.0, 16.0, 22.0]
2025-09-11 18:55:10,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (0.78) for latency ExtremeClogL1U23
2025-09-11 18:55:10,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 21 hours, 46 minutes, 54 seconds)
2025-09-11 19:07:29,727 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:07:29,730 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:09:28,006 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: -7.87195 ± 95.919
2025-09-11 19:09:28,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [-11.273768, -121.55286, 50.140186, -133.72946, 7.6085725, -128.94069, 23.8811, 198.45683, 15.874555, 20.816034]
2025-09-11 19:09:28,007 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [122.0, 1000.0, 73.0, 1000.0, 342.0, 1000.0, 32.0, 471.0, 47.0, 21.0]
2025-09-11 19:09:28,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 21 hours, 53 minutes, 14 seconds)
2025-09-11 19:22:19,456 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:22:19,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:22:54,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 15.65860 ± 40.139
2025-09-11 19:22:54,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [16.584448, 20.10373, 32.75404, -1.7247999, 34.858482, 92.63636, 31.455723, 0.80535305, 6.445814, -77.33312]
2025-09-11 19:22:54,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [44.0, 38.0, 67.0, 51.0, 343.0, 230.0, 116.0, 54.0, 57.0, 253.0]
2025-09-11 19:22:54,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (15.66) for latency ExtremeClogL1U23
2025-09-11 19:22:54,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 21 hours, 35 minutes, 8 seconds)
2025-09-11 19:35:18,893 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:35:18,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:37:00,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: -24.21029 ± 50.975
2025-09-11 19:37:00,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [10.117564, -9.848286, -167.58391, 20.762463, 5.990273, -13.830767, -48.078007, -9.62117, -10.012603, -19.99839]
2025-09-11 19:37:00,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [23.0, 1000.0, 256.0, 59.0, 37.0, 1000.0, 149.0, 30.0, 33.0, 1000.0]
2025-09-11 19:37:00,837 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 21 hours, 31 minutes, 45 seconds)
2025-09-11 19:50:38,493 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:50:38,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:51:19,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 12.02117 ± 22.179
2025-09-11 19:51:19,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [5.036488, 5.32546, -32.280037, 32.495113, 5.8431997, -3.098376, 55.65906, 24.042616, 5.7511044, 21.43706]
2025-09-11 19:51:19,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [16.0, 1000.0, 64.0, 70.0, 17.0, 12.0, 139.0, 35.0, 12.0, 100.0]
2025-09-11 19:51:19,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 21 hours, 30 minutes, 43 seconds)
2025-09-11 20:03:50,026 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:03:50,029 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:04:40,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 32.08885 ± 37.541
2025-09-11 20:04:40,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [43.126526, -7.2946954, 11.688887, 20.037622, 26.12887, 24.870779, 72.388, 123.82673, -1.7353448, 7.851179]
2025-09-11 20:04:40,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [165.0, 143.0, 23.0, 22.0, 42.0, 23.0, 1000.0, 340.0, 33.0, 38.0]
2025-09-11 20:04:40,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (32.09) for latency ExtremeClogL1U23
2025-09-11 20:04:40,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 21 hours, 19 minutes, 3 seconds)
2025-09-11 20:16:52,187 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:16:52,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:17:36,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 37.43942 ± 32.799
2025-09-11 20:17:36,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [25.721256, 110.253075, 12.83229, 63.8678, 59.275078, 7.6344113, 56.301434, -1.4409134, 29.246683, 10.703096]
2025-09-11 20:17:36,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [30.0, 258.0, 15.0, 1000.0, 129.0, 23.0, 45.0, 12.0, 56.0, 24.0]
2025-09-11 20:17:36,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (37.44) for latency ExtremeClogL1U23
2025-09-11 20:17:36,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 20 hours, 40 minutes, 2 seconds)
2025-09-11 20:30:36,772 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:30:36,775 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:32:42,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 86.84639 ± 69.855
2025-09-11 20:32:42,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [159.88004, 202.15096, 6.634032, 30.36565, 34.57886, 138.30539, 175.91997, 56.69442, 19.782467, 44.15212]
2025-09-11 20:32:42,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 21.0, 72.0, 151.0, 1000.0, 1000.0, 100.0, 140.0, 46.0]
2025-09-11 20:32:42,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (86.85) for latency ExtremeClogL1U23
2025-09-11 20:32:42,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 20 hours, 56 minutes, 22 seconds)
2025-09-11 20:44:48,545 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:44:48,549 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:45:54,724 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 60.14690 ± 102.977
2025-09-11 20:45:54,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [-5.1827602, 6.719105, 263.08798, 9.13878, 263.25952, -9.051326, 22.329193, 54.250885, -4.204599, 1.1221703]
2025-09-11 20:45:54,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [17.0, 50.0, 1000.0, 41.0, 1000.0, 33.0, 67.0, 63.0, 40.0, 47.0]
2025-09-11 20:45:54,768 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 20 hours, 26 minutes, 23 seconds)
2025-09-11 20:58:17,680 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:58:17,685 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:00:29,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 143.69455 ± 132.268
2025-09-11 21:00:29,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [286.8227, -8.46244, 12.847848, 35.019802, 260.48618, 26.540071, 68.290436, 102.20964, 345.32068, 307.87057]
2025-09-11 21:00:29,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 237.0, 31.0, 75.0, 1000.0, 148.0, 111.0, 93.0, 1000.0, 1000.0]
2025-09-11 21:00:29,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (143.69) for latency ExtremeClogL1U23
2025-09-11 21:00:29,611 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 20 hours, 17 minutes, 18 seconds)
2025-09-11 21:13:08,882 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:13:08,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:15:17,712 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 155.97641 ± 145.575
2025-09-11 21:15:17,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [30.353863, -2.1332207, 334.0657, 18.178057, 113.781204, 322.69208, 323.6156, 24.359123, 342.341, 52.510605]
2025-09-11 21:15:17,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 33.0, 1000.0, 194.0, 217.0, 1000.0, 1000.0, 29.0, 1000.0, 72.0]
2025-09-11 21:15:17,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (155.98) for latency ExtremeClogL1U23
2025-09-11 21:15:17,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 20 hours, 28 minutes, 40 seconds)
2025-09-11 21:27:41,307 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:27:41,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:29:36,265 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 158.73093 ± 138.679
2025-09-11 21:29:36,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [199.13277, 382.82944, 33.16507, 348.10788, 95.80363, 58.339977, -2.3109717, 40.890392, 335.02832, 96.32262]
2025-09-11 21:29:36,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [303.0, 1000.0, 483.0, 1000.0, 90.0, 63.0, 9.0, 41.0, 1000.0, 83.0]
2025-09-11 21:29:36,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (158.73) for latency ExtremeClogL1U23
2025-09-11 21:29:36,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 20 hours, 38 minutes, 27 seconds)
2025-09-11 21:42:56,653 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:42:56,662 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:44:32,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 132.28574 ± 171.690
2025-09-11 21:44:32,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [380.32693, 396.68954, 404.3271, 31.915514, 17.62148, 8.38714, 8.626426, 49.875214, 11.645774, 13.442444]
2025-09-11 21:44:32,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 111.0, 33.0, 65.0, 126.0, 65.0, 27.0, 28.0]
2025-09-11 21:44:32,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 20 hours, 20 minutes, 59 seconds)
2025-09-11 21:56:02,297 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:56:02,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:57:30,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 161.27507 ± 155.663
2025-09-11 21:57:30,082 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [89.773445, 387.23294, 45.187378, 13.724414, 160.86362, 59.730595, 359.13705, 35.254402, 36.179985, 425.66696]
2025-09-11 21:57:30,082 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [155.0, 1000.0, 61.0, 48.0, 158.0, 94.0, 518.0, 41.0, 93.0, 1000.0]
2025-09-11 21:57:30,082 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (161.28) for latency ExtremeClogL1U23
2025-09-11 21:57:30,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 20 hours, 2 minutes, 41 seconds)
2025-09-11 22:10:46,575 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:10:46,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:11:33,094 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 94.68522 ± 117.972
2025-09-11 22:11:33,094 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [51.615757, 24.518589, 142.97449, 7.6658535, 120.535934, 6.401662, 59.62427, 21.709435, 422.47455, 89.331696]
2025-09-11 22:11:33,094 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [45.0, 13.0, 126.0, 21.0, 140.0, 60.0, 84.0, 40.0, 1000.0, 134.0]
2025-09-11 22:11:33,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 19 hours, 39 minutes, 33 seconds)
2025-09-11 22:23:40,089 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:23:40,094 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:25:25,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 193.04424 ± 187.422
2025-09-11 22:25:25,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [16.108234, 26.342325, -6.2548227, 64.252205, 100.43245, 91.88502, 423.99777, 451.07776, 266.7219, 495.8794]
2025-09-11 22:25:25,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [18.0, 40.0, 25.0, 184.0, 119.0, 83.0, 1000.0, 1000.0, 274.0, 1000.0]
2025-09-11 22:25:25,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (193.04) for latency ExtremeClogL1U23
2025-09-11 22:25:25,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 19 hours, 10 minutes, 2 seconds)
2025-09-11 22:38:05,956 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:38:05,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:39:07,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 162.33829 ± 95.440
2025-09-11 22:39:07,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [198.12032, 162.59651, 168.64273, 144.90146, 115.8578, 78.20501, 112.852356, 427.5844, 133.43228, 81.19004]
2025-09-11 22:39:07,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [261.0, 174.0, 261.0, 142.0, 146.0, 144.0, 195.0, 549.0, 227.0, 129.0]
2025-09-11 22:39:07,587 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 18 hours, 46 minutes, 15 seconds)
2025-09-11 22:51:31,915 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:51:31,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:53:38,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 236.60481 ± 202.717
2025-09-11 22:53:38,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [4.3737407, 599.5313, 239.10214, 203.45134, 37.249195, 38.695866, 482.04916, 377.6886, 366.8042, 17.102427]
2025-09-11 22:53:38,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 1000.0, 230.0, 141.0, 47.0, 82.0, 1000.0, 1000.0, 1000.0, 16.0]
2025-09-11 22:53:38,392 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (236.60) for latency ExtremeClogL1U23
2025-09-11 22:53:38,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 18 hours, 25 minutes, 40 seconds)
2025-09-11 23:06:37,265 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:06:37,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:08:25,378 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 182.81693 ± 144.308
2025-09-11 23:08:25,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [66.89856, 179.39743, 344.10074, 411.73026, 42.54148, 147.55573, 45.075897, 143.57248, 412.87936, 34.41725]
2025-09-11 23:08:25,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [59.0, 205.0, 1000.0, 1000.0, 38.0, 308.0, 44.0, 166.0, 1000.0, 57.0]
2025-09-11 23:08:25,387 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 18 hours, 40 minutes, 33 seconds)
2025-09-11 23:20:35,245 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:20:35,250 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:21:55,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 182.74219 ± 175.303
2025-09-11 23:21:55,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [6.1390758, 100.37993, 36.75574, 97.22343, 117.92866, 21.47844, 429.6111, 127.90059, 377.93817, 512.06665]
2025-09-11 23:21:55,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [15.0, 66.0, 35.0, 125.0, 104.0, 35.0, 1000.0, 170.0, 1000.0, 345.0]
2025-09-11 23:21:55,891 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 18 hours, 17 minutes, 55 seconds)
2025-09-11 23:35:25,333 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:35:25,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:36:39,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 157.93842 ± 160.672
2025-09-11 23:36:39,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [97.226875, 92.40642, 42.743866, 490.8278, 49.702587, 42.392994, 373.80817, 315.41858, 54.422905, 20.433939]
2025-09-11 23:36:39,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [98.0, 108.0, 26.0, 1000.0, 40.0, 66.0, 1000.0, 288.0, 74.0, 21.0]
2025-09-11 23:36:39,659 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 18 hours, 17 minutes, 6 seconds)
2025-09-11 23:48:58,537 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:48:58,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:50:46,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 209.89261 ± 157.105
2025-09-11 23:50:46,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [330.6465, 60.137474, -0.15623455, 404.7938, 286.4435, 85.58234, 74.1165, 382.95886, 405.2997, 69.10353]
2025-09-11 23:50:46,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 87.0, 24.0, 273.0, 257.0, 83.0, 104.0, 1000.0, 1000.0, 109.0]
2025-09-11 23:50:46,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 18 hours, 8 minutes, 56 seconds)
2025-09-12 00:03:14,500 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:03:14,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:06:21,665 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 305.76184 ± 161.406
2025-09-12 00:06:21,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [100.398186, 380.13507, 512.17975, 329.66904, 417.96146, 68.14879, 339.2593, 432.9597, 431.09448, 45.812393]
2025-09-12 00:06:21,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [185.0, 1000.0, 1000.0, 277.0, 1000.0, 141.0, 1000.0, 1000.0, 1000.0, 49.0]
2025-09-12 00:06:21,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (305.76) for latency ExtremeClogL1U23
2025-09-12 00:06:21,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 18 hours, 10 minutes, 49 seconds)
2025-09-12 00:18:10,494 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:18:10,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:19:32,231 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 173.69156 ± 127.779
2025-09-12 00:19:32,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [141.93779, 329.29782, 7.3566713, 95.2307, 254.28075, 101.5456, 445.0409, 106.38776, 199.25815, 56.579453]
2025-09-12 00:19:32,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [143.0, 1000.0, 40.0, 104.0, 192.0, 114.0, 1000.0, 76.0, 234.0, 34.0]
2025-09-12 00:19:32,251 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 17 hours, 32 minutes, 29 seconds)
2025-09-12 00:32:07,634 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:32:07,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:34:45,470 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 272.84244 ± 159.663
2025-09-12 00:34:45,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [562.8423, 113.30949, 320.47205, 363.6352, 185.44691, 171.55107, 25.78541, 154.22762, 439.12054, 392.03372]
2025-09-12 00:34:45,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 108.0, 1000.0, 1000.0, 171.0, 183.0, 25.0, 183.0, 1000.0, 1000.0]
2025-09-12 00:34:45,480 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 17 hours, 43 minutes, 15 seconds)
2025-09-12 00:47:07,711 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:47:07,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:50:04,312 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 394.32831 ± 193.716
2025-09-12 00:50:04,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [438.89066, 349.4139, 458.05734, 839.2151, 102.76419, 349.27982, 203.42189, 528.16125, 233.8132, 440.26535]
2025-09-12 00:50:04,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 130.0, 367.0, 168.0, 488.0, 179.0, 1000.0]
2025-09-12 00:50:04,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (394.33) for latency ExtremeClogL1U23
2025-09-12 00:50:04,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 17 hours, 37 minutes, 7 seconds)
2025-09-12 01:03:33,635 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:03:33,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:06:10,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 263.05145 ± 161.693
2025-09-12 01:06:10,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [347.10925, 157.91203, 409.33014, 325.34763, 256.4562, 343.95813, 160.00215, 568.30426, 33.22414, 28.870777]
2025-09-12 01:06:10,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 112.0, 1000.0, 1000.0, 259.0, 1000.0, 143.0, 1000.0, 36.0, 44.0]
2025-09-12 01:06:10,953 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 17 hours, 50 minutes, 53 seconds)
2025-09-12 01:18:15,535 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:18:15,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:20:37,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 316.08588 ± 227.491
2025-09-12 01:20:37,693 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [77.27989, 294.90146, 698.23456, 21.083488, 336.63065, 20.099775, 511.80045, 461.7496, 561.52264, 177.55637]
2025-09-12 01:20:37,693 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [65.0, 318.0, 1000.0, 37.0, 1000.0, 39.0, 1000.0, 375.0, 1000.0, 205.0]
2025-09-12 01:20:37,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 17 hours, 19 minutes, 44 seconds)
2025-09-12 01:33:54,273 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:33:54,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:36:41,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 346.48672 ± 235.464
2025-09-12 01:36:41,482 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [364.00232, 2.9740388, 2.868435, 503.83258, 420.28473, 678.4235, 558.1897, 54.84323, 316.42838, 563.02014]
2025-09-12 01:36:41,482 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 15.0, 11.0, 1000.0, 1000.0, 1000.0, 408.0, 63.0, 430.0, 1000.0]
2025-09-12 01:36:41,491 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 17 hours, 44 minutes, 43 seconds)
2025-09-12 01:48:22,874 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:48:22,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:49:15,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 137.38733 ± 130.549
2025-09-12 01:49:15,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [310.96133, 97.25792, 41.88677, 69.0265, 28.859343, 427.1113, 151.18648, 37.95215, 12.670602, 196.96095]
2025-09-12 01:49:15,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 73.0, 46.0, 41.0, 53.0, 337.0, 132.0, 48.0, 15.0, 138.0]
2025-09-12 01:49:15,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 16 hours, 53 minutes, 17 seconds)
2025-09-12 02:02:05,759 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:02:05,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:04:13,030 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 331.69073 ± 199.887
2025-09-12 02:04:13,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [566.03577, 698.65063, 461.91275, 58.059105, 120.92501, 207.2344, 375.1445, 123.648476, 430.69, 274.60703]
2025-09-12 02:04:13,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [450.0, 502.0, 1000.0, 61.0, 95.0, 123.0, 1000.0, 100.0, 1000.0, 199.0]
2025-09-12 02:04:13,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 16 hours, 33 minutes, 32 seconds)
2025-09-12 02:17:05,302 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:17:05,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:19:06,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 293.81720 ± 146.861
2025-09-12 02:19:06,788 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [197.52255, 63.627537, 527.95685, 286.07516, 413.0033, 360.25598, 196.28781, 488.36453, 291.6901, 113.388245]
2025-09-12 02:19:06,788 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [164.0, 83.0, 1000.0, 240.0, 1000.0, 1000.0, 306.0, 315.0, 200.0, 60.0]
2025-09-12 02:19:06,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 16 hours, 2 minutes, 41 seconds)
2025-09-12 02:31:34,581 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:31:34,585 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:33:59,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 293.29541 ± 173.187
2025-09-12 02:33:59,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [435.84366, 304.03076, 132.27217, 142.93889, 563.54266, 155.05493, 16.44362, 538.3125, 307.02826, 337.48648]
2025-09-12 02:33:59,658 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 292.0, 121.0, 121.0, 401.0, 213.0, 17.0, 1000.0, 1000.0, 1000.0]
2025-09-12 02:33:59,663 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 15 hours, 53 minutes, 45 seconds)
2025-09-12 02:46:14,126 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:46:14,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:49:26,645 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 367.29337 ± 140.116
2025-09-12 02:49:26,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [420.82993, 185.80194, 56.136936, 543.94604, 395.52713, 356.11597, 417.37524, 536.7593, 377.7459, 382.6953]
2025-09-12 02:49:26,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 119.0, 81.0, 1000.0, 1000.0, 291.0, 1000.0, 373.0, 1000.0, 1000.0]
2025-09-12 02:49:26,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 15 hours, 31 minutes, 14 seconds)
2025-09-12 03:02:28,102 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:02:28,105 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:05:05,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 377.89551 ± 223.429
2025-09-12 03:05:05,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [339.20834, 320.33466, 291.3807, 258.50684, 487.46555, 916.78534, 277.07877, 358.42468, 523.8844, 5.885694]
2025-09-12 03:05:05,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 137.0, 163.0, 456.0, 643.0, 188.0, 1000.0, 1000.0, 27.0]
2025-09-12 03:05:05,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 15 hours, 55 minutes, 31 seconds)
2025-09-12 03:17:42,429 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:17:42,432 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:19:01,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 243.39136 ± 174.915
2025-09-12 03:19:01,943 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [0.29516724, 227.42555, 603.19495, 334.79675, 315.43103, 1.5052152, 133.0444, 152.96992, 266.2508, 398.9997]
2025-09-12 03:19:01,943 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 173.0, 1000.0, 206.0, 171.0, 15.0, 84.0, 96.0, 163.0, 1000.0]
2025-09-12 03:19:01,953 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 15 hours, 27 minutes, 42 seconds)
2025-09-12 03:31:19,878 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:31:19,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:33:07,827 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 253.94174 ± 245.245
2025-09-12 03:33:07,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [297.6527, 776.7536, 29.050564, 58.43389, 38.183395, 450.26184, 54.027966, 57.274548, 525.3658, 252.41331]
2025-09-12 03:33:07,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 53.0, 63.0, 73.0, 1000.0, 45.0, 52.0, 423.0, 212.0]
2025-09-12 03:33:07,836 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 15 hours, 3 minutes)
2025-09-12 03:45:46,257 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:45:46,261 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:46:58,489 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 243.28667 ± 182.598
2025-09-12 03:46:58,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [84.375984, 353.5752, 681.042, 30.950993, 245.4468, 143.39616, 325.71887, 304.2197, 48.23063, 215.91032]
2025-09-12 03:46:58,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [79.0, 1000.0, 479.0, 41.0, 185.0, 108.0, 261.0, 170.0, 67.0, 190.0]
2025-09-12 03:46:58,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 14 hours, 35 minutes, 46 seconds)
2025-09-12 03:58:56,293 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:58:56,295 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:00:58,803 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 327.71866 ± 448.608
2025-09-12 04:00:58,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [10.6845455, 498.6034, 33.87015, 30.061806, 1550.4535, 298.7294, 117.462875, 17.156042, 174.12823, 546.0366]
2025-09-12 04:00:58,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [37.0, 1000.0, 49.0, 30.0, 1000.0, 1000.0, 101.0, 23.0, 105.0, 1000.0]
2025-09-12 04:00:58,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 14 hours, 4 minutes, 7 seconds)
2025-09-12 04:14:35,886 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:14:35,890 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:16:07,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 287.82291 ± 246.268
2025-09-12 04:16:07,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [441.71484, 28.679691, 109.64069, 759.9784, 356.85843, 638.59766, 38.455677, 135.84026, 69.32521, 299.1381]
2025-09-12 04:16:07,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [250.0, 26.0, 73.0, 1000.0, 1000.0, 533.0, 43.0, 102.0, 37.0, 228.0]
2025-09-12 04:16:07,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 13 hours, 43 minutes, 49 seconds)
2025-09-12 04:28:27,325 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:28:27,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:29:24,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 223.65720 ± 246.068
2025-09-12 04:29:24,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [400.02036, -10.525651, 137.73155, 21.62467, 49.254894, 62.055714, 839.41626, 255.6959, 371.13495, 110.16338]
2025-09-12 04:29:24,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [254.0, 12.0, 81.0, 29.0, 87.0, 44.0, 1000.0, 166.0, 241.0, 155.0]
2025-09-12 04:29:24,948 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 13 hours, 22 minutes, 22 seconds)
2025-09-12 04:41:06,396 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:41:06,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:43:28,966 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 325.11047 ± 148.528
2025-09-12 04:43:28,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [224.62717, 388.16412, 271.56198, 12.339612, 164.06497, 397.6148, 430.06503, 359.0077, 505.26108, 498.39835]
2025-09-12 04:43:28,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [244.0, 1000.0, 231.0, 24.0, 86.0, 218.0, 1000.0, 270.0, 1000.0, 1000.0]
2025-09-12 04:43:28,976 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 13 hours, 7 minutes, 56 seconds)
2025-09-12 04:56:49,898 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:56:49,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:58:33,559 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 296.03607 ± 325.747
2025-09-12 04:58:33,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [-10.45883, 313.57773, 117.32166, 85.27998, 235.52057, 32.033314, 523.4076, 1165.1951, 196.18123, 302.30264]
2025-09-12 04:58:33,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [34.0, 237.0, 64.0, 90.0, 192.0, 23.0, 1000.0, 1000.0, 87.0, 1000.0]
2025-09-12 04:58:33,590 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 13 hours, 7 minutes, 26 seconds)
2025-09-12 05:10:47,721 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:10:47,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:12:13,985 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 298.53629 ± 157.356
2025-09-12 05:12:13,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [577.26855, 344.65436, 478.4784, 497.14395, 102.02507, 188.32906, 220.78268, 142.55618, 244.9359, 189.18864]
2025-09-12 05:12:13,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [316.0, 358.0, 432.0, 274.0, 58.0, 183.0, 1000.0, 100.0, 225.0, 129.0]
2025-09-12 05:12:13,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 12 hours, 49 minutes, 31 seconds)
2025-09-12 05:24:34,213 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:24:34,218 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:27:09,490 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 405.95074 ± 201.531
2025-09-12 05:27:09,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [27.468616, 515.7464, 266.72003, 295.93863, 234.50066, 584.0629, 293.94794, 513.245, 640.84607, 687.03107]
2025-09-12 05:27:09,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [23.0, 1000.0, 1000.0, 167.0, 1000.0, 359.0, 279.0, 353.0, 413.0, 1000.0]
2025-09-12 05:27:09,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (405.95) for latency ExtremeClogL1U23
2025-09-12 05:27:09,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 12 hours, 33 minutes, 2 seconds)
2025-09-12 05:39:49,507 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:39:49,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:43:13,210 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 342.82050 ± 208.806
2025-09-12 05:43:13,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [105.323616, 674.8864, 286.52737, 490.40427, 219.69327, 16.2511, 267.82053, 435.82025, 268.15735, 663.3211]
2025-09-12 05:43:13,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [66.0, 1000.0, 1000.0, 1000.0, 140.0, 31.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 05:43:13,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 12 hours, 47 minutes, 34 seconds)
2025-09-12 05:56:39,035 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:56:39,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:59:15,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 333.54523 ± 226.912
2025-09-12 05:59:15,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [370.6767, 679.2552, 217.12706, 449.96112, 263.37576, 613.8166, 102.09325, 563.47314, 5.4996395, 70.1739]
2025-09-12 05:59:15,722 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [247.0, 1000.0, 1000.0, 257.0, 1000.0, 1000.0, 53.0, 1000.0, 26.0, 66.0]
2025-09-12 05:59:15,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 12 hours, 52 minutes, 56 seconds)
2025-09-12 06:11:38,416 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:11:38,420 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:14:56,909 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 524.22931 ± 232.384
2025-09-12 06:14:56,910 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [867.4242, 462.81485, 790.55334, 841.907, 301.4215, 503.73022, 168.9291, 296.18115, 410.9939, 598.338]
2025-09-12 06:14:56,910 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 474.0, 1000.0, 315.0, 120.0, 1000.0, 234.0, 1000.0]
2025-09-12 06:14:56,910 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (524.23) for latency ExtremeClogL1U23
2025-09-12 06:14:56,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 12 hours, 43 minutes, 53 seconds)
2025-09-12 06:26:54,814 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:26:54,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:29:03,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 402.31076 ± 179.195
2025-09-12 06:29:03,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [168.5168, 758.7083, 358.06668, 349.14713, 481.27078, 487.36987, 385.0667, 432.3919, 78.169365, 524.40015]
2025-09-12 06:29:03,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [115.0, 1000.0, 1000.0, 220.0, 243.0, 1000.0, 272.0, 351.0, 78.0, 304.0]
2025-09-12 06:29:03,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 12 hours, 32 minutes, 51 seconds)
2025-09-12 06:41:30,577 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:41:30,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:44:12,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 557.54718 ± 360.241
2025-09-12 06:44:12,689 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [69.673195, 215.97179, 312.4009, 1426.7299, 649.8727, 688.76794, 498.63593, 670.9539, 724.6457, 317.81995]
2025-09-12 06:44:12,689 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [145.0, 126.0, 254.0, 1000.0, 1000.0, 1000.0, 336.0, 436.0, 516.0, 1000.0]
2025-09-12 06:44:12,689 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (557.55) for latency ExtremeClogL1U23
2025-09-12 06:44:12,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 12 hours, 19 minutes, 42 seconds)
2025-09-12 06:56:37,485 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:56:37,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:58:43,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 305.61200 ± 159.744
2025-09-12 06:58:43,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [403.70184, 166.74239, 349.3826, 49.975185, 439.06723, 626.5038, 189.60953, 271.2238, 385.48596, 174.42749]
2025-09-12 06:58:43,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 108.0, 220.0, 121.0, 262.0, 356.0, 1000.0, 334.0, 1000.0, 142.0]
2025-09-12 06:58:43,532 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 11 hours, 49 minutes, 44 seconds)
2025-09-12 07:12:10,778 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:12:10,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:13:47,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 366.96796 ± 413.601
2025-09-12 07:13:47,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [23.632715, 236.47798, 227.36603, 105.874115, 565.03375, 59.49109, 1461.1543, 227.51295, 119.645935, 643.4907]
2025-09-12 07:13:47,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [25.0, 181.0, 179.0, 75.0, 1000.0, 63.0, 743.0, 139.0, 77.0, 1000.0]
2025-09-12 07:13:47,218 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 11 hours, 25 minutes, 37 seconds)
2025-09-12 07:25:53,793 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:25:53,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:28:33,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 315.56403 ± 206.644
2025-09-12 07:28:33,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [471.62708, 178.06596, 249.30261, 274.281, 654.7022, 610.5656, 266.7641, 13.978754, 396.1066, 40.24602]
2025-09-12 07:28:33,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 123.0, 1000.0, 289.0, 326.0, 1000.0, 1000.0, 13.0, 1000.0, 46.0]
2025-09-12 07:28:33,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 11 hours, 2 minutes, 27 seconds)
2025-09-12 07:41:27,880 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:41:27,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:44:08,232 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 494.07242 ± 463.153
2025-09-12 07:44:08,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [465.88004, 482.5544, 142.93219, 395.16077, 1808.6812, 325.26584, 591.12885, 300.11734, 374.25513, 54.7483]
2025-09-12 07:44:08,234 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 118.0, 246.0, 1000.0, 1000.0, 1000.0, 152.0, 232.0, 44.0]
2025-09-12 07:44:08,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 43 seconds)
2025-09-12 07:56:28,068 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:56:28,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:59:32,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 495.98080 ± 319.886
2025-09-12 07:59:32,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [471.29865, 562.2776, 53.603745, 360.5262, 382.5666, 590.05133, 1315.9681, 259.35013, 646.01624, 318.1501]
2025-09-12 07:59:32,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 27.0, 1000.0, 203.0, 1000.0, 619.0, 1000.0, 482.0, 291.0]
2025-09-12 07:59:32,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 10 hours, 47 minutes, 47 seconds)
2025-09-12 08:12:27,761 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:12:27,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:14:57,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 340.89050 ± 243.905
2025-09-12 08:14:57,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [52.408894, 270.61615, 133.40744, 404.29663, 628.01965, 245.01653, 510.40427, 308.23395, 830.1016, 26.400242]
2025-09-12 08:14:57,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [59.0, 1000.0, 118.0, 1000.0, 407.0, 1000.0, 367.0, 1000.0, 438.0, 18.0]
2025-09-12 08:14:57,814 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 10 hours, 40 minutes, 23 seconds)
2025-09-12 08:26:38,893 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:26:38,897 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:29:33,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 357.89417 ± 229.420
2025-09-12 08:29:33,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [606.6656, 142.18114, 72.602585, 87.38807, 536.0855, 592.153, 586.7858, 485.48108, 431.96198, 37.63702]
2025-09-12 08:29:33,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 76.0, 53.0, 74.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 25.0]
2025-09-12 08:29:34,006 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 10 hours, 21 minutes, 23 seconds)
2025-09-12 08:42:43,588 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:42:43,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:44:58,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 514.63910 ± 299.278
2025-09-12 08:44:58,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [213.68944, 915.83673, 375.92642, 484.23434, 656.4865, 292.584, 315.84048, 351.78265, 346.8729, 1193.1377]
2025-09-12 08:44:58,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [155.0, 488.0, 1000.0, 246.0, 1000.0, 159.0, 151.0, 1000.0, 154.0, 525.0]
2025-09-12 08:44:58,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 11 minutes, 23 seconds)
2025-09-12 08:57:45,394 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:57:45,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:02:14,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 511.47272 ± 263.904
2025-09-12 09:02:14,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [564.3097, 473.99792, 1061.2805, 380.99783, 216.18004, 251.82108, 307.329, 370.03, 589.2395, 899.5414]
2025-09-12 09:02:14,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 593.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:02:14,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 10 hours, 9 minutes, 13 seconds)
2025-09-12 09:14:04,714 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:14:04,718 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:16:17,335 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 327.79114 ± 295.816
2025-09-12 09:16:17,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [352.51324, 6.98596, 430.9287, 337.88132, 108.841606, 14.833945, 14.291026, 768.92944, 347.78506, 894.9209]
2025-09-12 09:16:17,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [269.0, 20.0, 1000.0, 286.0, 90.0, 35.0, 15.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:16:17,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 9 hours, 43 minutes, 19 seconds)
2025-09-12 09:28:52,531 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:28:52,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:31:00,923 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 357.92093 ± 324.799
2025-09-12 09:31:00,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [237.97696, 1013.5431, 477.23526, 906.8637, 75.67327, 303.17996, 206.39244, 84.1521, 44.784966, 229.40717]
2025-09-12 09:31:00,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [127.0, 1000.0, 251.0, 1000.0, 57.0, 1000.0, 117.0, 68.0, 30.0, 1000.0]
2025-09-12 09:31:00,936 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 22 minutes, 47 seconds)
2025-09-12 09:43:28,254 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:43:28,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:47:30,210 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 846.15790 ± 457.590
2025-09-12 09:47:30,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [610.43176, 1135.6978, 1496.2068, 899.99274, 678.91364, 254.16711, 370.3018, 1757.0177, 612.9158, 645.93353]
2025-09-12 09:47:30,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 573.0, 1000.0, 155.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:47:30,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (846.16) for latency ExtremeClogL1U23
2025-09-12 09:47:30,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 21 minutes, 8 seconds)
2025-09-12 10:00:15,661 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:00:15,670 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:01:48,927 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 418.99878 ± 468.095
2025-09-12 10:01:48,928 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [538.26587, 838.2832, 1577.2098, 537.0196, 39.714153, 378.7116, 123.27987, -4.229779, 30.830362, 130.903]
2025-09-12 10:01:48,928 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 502.0, 1000.0, 307.0, 33.0, 255.0, 104.0, 16.0, 38.0, 92.0]
2025-09-12 10:01:48,938 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 8 hours, 57 minutes, 50 seconds)
2025-09-12 10:14:10,412 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:14:10,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:16:08,312 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 312.99744 ± 242.396
2025-09-12 10:16:08,313 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [188.62544, 429.25406, 397.98648, 425.24747, 586.56635, 30.938482, 43.87627, 791.57056, 176.31424, 59.59497]
2025-09-12 10:16:08,313 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 293.0, 174.0, 1000.0, 1000.0, 28.0, 54.0, 399.0, 205.0, 54.0]
2025-09-12 10:16:08,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 8 hours, 22 minutes, 29 seconds)
2025-09-12 10:28:45,359 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:28:45,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:31:18,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 692.44592 ± 483.706
2025-09-12 10:31:18,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1352.7438, 475.94638, 564.1011, 246.77641, 921.26935, 950.1826, 36.917652, 1630.1066, 444.68045, 301.73416]
2025-09-12 10:31:18,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [674.0, 214.0, 309.0, 1000.0, 325.0, 568.0, 93.0, 1000.0, 265.0, 1000.0]
2025-09-12 10:31:18,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 15 minutes, 5 seconds)
2025-09-12 10:44:00,697 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:44:00,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:46:13,508 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 398.50201 ± 346.945
2025-09-12 10:46:13,509 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1143.6467, 156.55972, 60.532585, 861.24207, 106.702866, 616.6263, 281.6362, 206.59692, 431.02914, 120.44789]
2025-09-12 10:46:13,509 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 68.0, 1000.0, 82.0, 1000.0, 169.0, 119.0, 235.0, 60.0]
2025-09-12 10:46:13,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 1 minute, 20 seconds)
2025-09-12 10:59:04,408 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:59:04,412 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:00:16,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 290.82697 ± 208.837
2025-09-12 11:00:16,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [246.25069, 220.65143, 150.28754, 545.5844, 34.46688, 577.1187, 213.81738, 662.29224, 112.7473, 145.05324]
2025-09-12 11:00:16,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 137.0, 107.0, 355.0, 24.0, 304.0, 137.0, 335.0, 95.0, 114.0]
2025-09-12 11:00:16,278 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 7 hours, 31 minutes, 9 seconds)
2025-09-12 11:12:25,880 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:12:25,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:14:01,030 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 391.62732 ± 302.649
2025-09-12 11:14:01,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [163.33878, 114.0485, 952.90497, 216.77141, 510.70218, 800.6834, 1.5966803, 302.67136, 647.20636, 206.34967]
2025-09-12 11:14:01,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [78.0, 70.0, 1000.0, 111.0, 268.0, 434.0, 39.0, 1000.0, 292.0, 100.0]
2025-09-12 11:14:01,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 13 minutes, 12 seconds)
2025-09-12 11:26:26,425 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:26:26,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:27:15,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 313.63998 ± 405.995
2025-09-12 11:27:15,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [11.929941, 427.3891, 144.90059, 104.770775, 21.869867, 786.0328, 1326.2734, 39.870785, 158.66602, 114.69676]
2025-09-12 11:27:15,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 222.0, 137.0, 89.0, 21.0, 336.0, 688.0, 52.0, 95.0, 86.0]
2025-09-12 11:27:15,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 52 minutes, 27 seconds)
2025-09-12 11:40:20,283 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:40:20,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:43:03,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 667.77301 ± 621.330
2025-09-12 11:43:03,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [126.06746, 1630.2434, 900.0915, 50.851204, 1055.4171, 320.27673, 463.12924, 171.80702, 127.75183, 1832.0941]
2025-09-12 11:43:03,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [70.0, 820.0, 1000.0, 56.0, 662.0, 160.0, 1000.0, 104.0, 1000.0, 1000.0]
2025-09-12 11:43:03,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 6 hours, 41 minutes, 51 seconds)
2025-09-12 11:55:21,858 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:55:21,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:56:53,517 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 502.82465 ± 583.257
2025-09-12 11:56:53,517 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [6.8227873, 54.108665, 1566.7343, 188.38614, 210.84811, 252.3465, 380.89816, 28.082682, 698.18475, 1641.8346]
2025-09-12 11:56:53,517 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [15.0, 27.0, 655.0, 104.0, 1000.0, 87.0, 275.0, 38.0, 330.0, 792.0]
2025-09-12 11:56:53,524 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 21 minutes, 36 seconds)
2025-09-12 12:09:31,671 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:09:31,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:11:42,470 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 600.14264 ± 604.220
2025-09-12 12:11:42,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [548.63525, 312.67773, 319.90234, 193.17589, 1715.288, 46.98651, 624.4633, 1785.448, 17.104445, 437.74512]
2025-09-12 12:11:42,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [325.0, 190.0, 146.0, 97.0, 874.0, 46.0, 1000.0, 988.0, 40.0, 1000.0]
2025-09-12 12:11:42,479 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 11 minutes, 28 seconds)
2025-09-12 12:24:59,048 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:24:59,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:26:06,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 429.37436 ± 491.397
2025-09-12 12:26:06,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [232.54205, 64.37735, 24.986156, 388.19705, 176.59738, 280.55035, 215.57574, 1706.0901, 960.1007, 244.72708]
2025-09-12 12:26:06,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [173.0, 140.0, 21.0, 204.0, 96.0, 143.0, 188.0, 951.0, 440.0, 106.0]
2025-09-12 12:26:06,871 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 29 seconds)
2025-09-12 12:37:31,600 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:37:31,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:40:20,383 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 622.19427 ± 559.785
2025-09-12 12:40:20,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [341.9853, 1722.5691, 132.78015, 15.857129, 40.138237, 435.64423, 1129.3284, 700.9546, 353.8082, 1348.8779]
2025-09-12 12:40:20,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 14.0, 36.0, 234.0, 612.0, 1000.0, 174.0, 1000.0]
2025-09-12 12:40:20,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 5 hours, 50 minutes, 49 seconds)
2025-09-12 12:53:46,467 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:53:46,470 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:55:51,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 528.97235 ± 246.139
2025-09-12 12:55:51,250 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [792.3813, 693.02893, 294.1497, 510.68594, 752.2934, 232.21211, 809.0603, 608.4567, 549.06604, 48.389267]
2025-09-12 12:55:51,250 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [366.0, 275.0, 1000.0, 338.0, 350.0, 1000.0, 398.0, 330.0, 421.0, 36.0]
2025-09-12 12:55:51,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 34 minutes, 50 seconds)
2025-09-12 13:08:15,922 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:08:15,925 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:10:51,881 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 745.02753 ± 610.788
2025-09-12 13:10:51,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [274.8514, 1699.7455, 309.17996, 209.0234, 508.09198, 1702.0206, 597.3843, 374.60263, 1576.5691, 198.80684]
2025-09-12 13:10:51,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 862.0, 168.0, 1000.0, 251.0, 1000.0, 318.0, 170.0, 726.0, 132.0]
2025-09-12 13:10:51,893 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 25 minutes, 28 seconds)
2025-09-12 13:23:07,824 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:23:07,830 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:26:06,131 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 559.90753 ± 526.572
2025-09-12 13:26:06,132 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [592.0225, 61.180035, 375.40607, 365.34998, 16.099823, 1001.9168, 55.331245, 556.4901, 719.2086, 1856.0703]
2025-09-12 13:26:06,132 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 58.0, 1000.0, 1000.0, 19.0, 1000.0, 69.0, 1000.0, 315.0, 875.0]
2025-09-12 13:26:06,141 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 12 minutes, 27 seconds)
2025-09-12 13:38:30,255 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:38:30,274 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:41:10,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 306.31946 ± 224.328
2025-09-12 13:41:10,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [543.63385, 291.29678, 16.85219, 169.41492, 264.46942, 437.05148, 201.08746, 809.3212, 61.824997, 268.2422]
2025-09-12 13:41:10,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [268.0, 132.0, 42.0, 1000.0, 1000.0, 219.0, 1000.0, 1000.0, 40.0, 1000.0]
2025-09-12 13:41:10,686 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 15 seconds)
2025-09-12 13:53:42,352 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:53:42,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:55:14,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 511.34122 ± 279.034
2025-09-12 13:55:14,107 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [156.615, 542.9234, 847.87537, 337.3643, 30.555979, 407.5908, 630.662, 476.23126, 969.1946, 714.3996]
2025-09-12 13:55:14,107 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 246.0, 323.0, 158.0, 33.0, 170.0, 304.0, 282.0, 482.0, 309.0]
2025-09-12 13:55:14,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 44 minutes, 36 seconds)
2025-09-12 14:07:42,429 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:07:42,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:10:46,850 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 660.92725 ± 525.262
2025-09-12 14:10:46,851 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1180.1918, 152.03519, 260.68915, 1787.7301, 1060.2037, 77.026085, 446.92184, 175.65356, 670.9062, 797.9152]
2025-09-12 14:10:46,851 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [578.0, 1000.0, 1000.0, 1000.0, 579.0, 46.0, 1000.0, 68.0, 1000.0, 354.0]
2025-09-12 14:10:46,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 29 minutes, 44 seconds)
2025-09-12 14:23:16,800 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:23:16,803 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:26:21,195 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 565.74011 ± 448.440
2025-09-12 14:26:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [635.16614, 1014.7421, 1600.9706, 306.30792, 889.16864, 149.95563, 344.2321, 349.43857, 170.91986, 196.49994]
2025-09-12 14:26:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 105.0, 173.0, 1000.0, 138.0, 111.0]
2025-09-12 14:26:21,203 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 16 minutes, 39 seconds)
2025-09-12 14:39:14,176 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:39:14,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:42:58,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 875.59753 ± 264.001
2025-09-12 14:42:58,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [797.49866, 1260.2217, 387.84964, 664.11554, 846.5816, 1119.8422, 713.10535, 1234.6691, 717.2295, 1014.8621]
2025-09-12 14:42:58,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [426.0, 1000.0, 1000.0, 1000.0, 390.0, 574.0, 1000.0, 1000.0, 1000.0, 501.0]
2025-09-12 14:42:58,031 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (875.60) for latency ExtremeClogL1U23
2025-09-12 14:42:58,040 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 5 minutes, 58 seconds)
2025-09-12 14:55:29,954 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:55:29,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:57:49,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 566.71722 ± 617.852
2025-09-12 14:57:49,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [12.1404705, 337.98148, 92.595375, 315.98438, 1192.776, 227.15388, 9.697233, 233.30016, 1549.6903, 1695.853]
2025-09-12 14:57:49,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [19.0, 170.0, 54.0, 149.0, 619.0, 1000.0, 34.0, 1000.0, 928.0, 1000.0]
2025-09-12 14:57:49,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 49 minutes, 57 seconds)
2025-09-12 15:10:33,269 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:10:33,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:13:09,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 196.94783 ± 149.392
2025-09-12 15:13:09,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [339.46167, 198.89336, 209.42542, 554.02094, 111.165825, 17.208235, 29.853382, 216.07962, 113.48976, 179.88016]
2025-09-12 15:13:09,055 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 343.0, 91.0, 22.0, 40.0, 1000.0, 55.0, 1000.0]
2025-09-12 15:13:09,063 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 38 minutes, 9 seconds)
2025-09-12 15:26:04,262 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:26:04,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:28:44,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 542.05316 ± 323.807
2025-09-12 15:28:44,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [439.52255, 756.46783, 936.46436, 46.44207, 354.96713, 662.5116, 1078.7473, 719.772, 262.35767, 163.27972]
2025-09-12 15:28:44,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 452.0, 49.0, 241.0, 342.0, 569.0, 1000.0, 1000.0, 131.0]
2025-09-12 15:28:44,511 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 22 minutes, 41 seconds)
2025-09-12 15:40:44,377 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:40:44,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:42:57,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 454.29004 ± 342.467
2025-09-12 15:42:57,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1220.4879, 426.2066, 575.25525, 42.223545, 234.33658, 354.86948, 411.97928, 435.139, 832.2652, 10.137732]
2025-09-12 15:42:57,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [630.0, 1000.0, 1000.0, 29.0, 1000.0, 169.0, 179.0, 243.0, 405.0, 96.0]
2025-09-12 15:42:57,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 3 minutes, 50 seconds)
2025-09-12 15:55:41,661 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:55:41,671 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:56:57,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 484.64154 ± 517.491
2025-09-12 15:56:57,096 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [58.50655, 210.26663, 483.55038, 191.38777, 753.26135, 962.18726, 34.332104, 371.51233, 1753.7051, 27.706163]
2025-09-12 15:56:57,096 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [132.0, 124.0, 252.0, 100.0, 373.0, 480.0, 40.0, 290.0, 908.0, 36.0]
2025-09-12 15:56:57,106 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 42 minutes, 45 seconds)
2025-09-12 16:09:40,693 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:09:40,697 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:12:04,203 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 873.55780 ± 700.280
2025-09-12 16:12:04,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [168.23996, 465.94318, 522.1825, 0.7310666, 1589.5619, 1843.5437, 771.7382, 2180.7617, 716.28986, 476.58554]
2025-09-12 16:12:04,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [106.0, 270.0, 250.0, 15.0, 1000.0, 900.0, 1000.0, 1000.0, 361.0, 313.0]
2025-09-12 16:12:04,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 28 minutes, 28 seconds)
2025-09-12 16:24:29,752 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:24:29,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:25:51,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 666.57086 ± 471.663
2025-09-12 16:25:51,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [609.51825, 192.3385, 1558.1937, 1055.4918, 230.42516, 612.0868, 1240.3578, 817.37946, 279.1877, 70.72937]
2025-09-12 16:25:51,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [261.0, 71.0, 570.0, 530.0, 133.0, 325.0, 521.0, 343.0, 175.0, 55.0]
2025-09-12 16:25:51,825 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 10 minutes, 52 seconds)
2025-09-12 16:38:46,800 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:38:46,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:41:14,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 738.61853 ± 707.091
2025-09-12 16:41:14,058 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [2142.6514, 595.79663, 145.09256, 506.5481, 1904.2374, 359.7685, 161.34874, 344.93967, 74.423836, 1151.3784]
2025-09-12 16:41:14,058 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 351.0, 103.0, 225.0, 1000.0, 1000.0, 79.0, 1000.0, 60.0, 532.0]
2025-09-12 16:41:14,066 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 55 minutes, 59 seconds)
2025-09-12 16:54:00,483 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:54:00,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:56:39,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 599.09717 ± 479.357
2025-09-12 16:56:39,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1385.5347, 132.81032, 314.7979, 930.1708, 497.58832, 210.44144, 251.50308, 1359.2441, 884.3424, 24.538404]
2025-09-12 16:56:39,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [644.0, 58.0, 145.0, 1000.0, 244.0, 1000.0, 1000.0, 640.0, 1000.0, 24.0]
2025-09-12 16:56:39,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 43 minutes, 10 seconds)
2025-09-12 17:07:58,346 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:07:58,360 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:10:47,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 860.50311 ± 637.262
2025-09-12 17:10:47,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [461.00558, 1120.9169, 566.603, 1865.3413, 583.7319, 475.86465, 48.36375, 2160.6294, 865.8804, 456.6939]
2025-09-12 17:10:47,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [191.0, 568.0, 261.0, 795.0, 275.0, 1000.0, 39.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:10:47,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 28 minutes, 36 seconds)
2025-09-12 17:23:24,725 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:23:24,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:26:02,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 838.25061 ± 641.483
2025-09-12 17:26:02,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1241.7788, 659.96967, 242.95847, 484.83856, 715.0216, 156.61087, 458.4992, 1654.928, 502.32086, 2265.5798]
2025-09-12 17:26:02,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 294.0, 1000.0, 226.0, 440.0, 115.0, 308.0, 1000.0, 304.0, 1000.0]
2025-09-12 17:26:02,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 13 minutes, 58 seconds)
2025-09-12 17:38:57,650 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:38:57,654 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:40:52,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 605.77679 ± 607.768
2025-09-12 17:40:52,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [839.75195, 38.9128, 79.072334, 190.65228, 1841.352, 1328.8103, 363.75388, 1132.8746, 201.02924, 41.558216]
2025-09-12 17:40:52,543 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [405.0, 1000.0, 57.0, 80.0, 849.0, 776.0, 232.0, 577.0, 116.0, 76.0]
2025-09-12 17:40:52,554 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour)
2025-09-12 17:53:03,212 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:53:03,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:55:31,724 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 841.11993 ± 586.882
2025-09-12 17:55:31,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1220.9823, 868.36914, 252.20065, 468.89243, 670.7113, 974.6423, 879.0511, 2352.457, 256.5126, 467.38074]
2025-09-12 17:55:31,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [604.0, 364.0, 113.0, 206.0, 359.0, 1000.0, 457.0, 1000.0, 1000.0, 258.0]
2025-09-12 17:55:31,747 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 44 minutes, 34 seconds)
2025-09-12 18:07:42,821 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:07:42,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:10:01,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 993.65918 ± 641.740
2025-09-12 18:10:01,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [1134.9249, 874.4771, 1459.9938, 2153.4463, 1908.5282, 837.85205, 85.78064, 426.90463, 727.27155, 327.41302]
2025-09-12 18:10:01,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [459.0, 424.0, 785.0, 1000.0, 1000.0, 500.0, 36.0, 226.0, 393.0, 180.0]
2025-09-12 18:10:01,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (993.66) for latency ExtremeClogL1U23
2025-09-12 18:10:01,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 29 minutes, 21 seconds)
2025-09-12 18:23:11,379 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:23:11,383 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:25:15,022 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 725.68091 ± 689.760
2025-09-12 18:25:15,024 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [23.946165, 93.011086, 1064.4324, 1627.2614, 964.19604, 473.9557, 48.350025, 2121.2205, 755.033, 85.40261]
2025-09-12 18:25:15,024 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [38.0, 52.0, 547.0, 1000.0, 1000.0, 252.0, 23.0, 1000.0, 399.0, 180.0]
2025-09-12 18:25:15,035 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 14 minutes, 53 seconds)
2025-09-12 18:37:55,392 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:37:55,397 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:41:03,479 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1221 [DEBUG]: Total Reward: 1209.93762 ± 832.611
2025-09-12 18:41:03,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1222 [DEBUG]: All rewards: [202.65135, 1861.0449, 733.27997, 2222.3916, 2041.3116, 752.7569, 8.390005, 2137.7996, 1776.0994, 363.65]
2025-09-12 18:41:03,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1223 [DEBUG]: All trajectory lengths: [107.0, 980.0, 313.0, 1000.0, 1000.0, 376.0, 18.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:41:03,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1226 [INFO]: New best (1209.94) for latency ExtremeClogL1U23
2025-09-12 18:41:03,521 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc15-ant):1251 [DEBUG]: Training session finished
