2025-09-11 18:16:29,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc25-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:16:29,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc25-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:16:29,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x1545ac8b01d0>}
2025-09-11 18:16:29,483 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1111 [DEBUG]: using device: cuda
2025-09-11 18:16:29,489 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1133 [INFO]: Creating new trainer
2025-09-11 18:16:29,505 baseline-mbpac-noiseperc25-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 18:16:29,506 baseline-mbpac-noiseperc25-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:16:29,516 baseline-mbpac-noiseperc25-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 18:16:30,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1194 [DEBUG]: Starting training session...
2025-09-11 18:16:30,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:28:29,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:28:29,991 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:29:47,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -243.44621 ± 330.443
2025-09-11 18:29:47,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-33.280544, -175.80417, -297.6124, -97.36608, -11.313049, -915.6269, -845.342, -12.891097, -12.004484, -33.221333]
2025-09-11 18:29:47,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [60.0, 203.0, 296.0, 71.0, 17.0, 1000.0, 1000.0, 14.0, 16.0, 27.0]
2025-09-11 18:29:47,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-243.45) for latency ExtremeClogL1U23
2025-09-11 18:29:47,612 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 21 hours, 55 minutes, 9 seconds)
2025-09-11 18:42:27,769 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:42:27,779 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:43:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -120.47715 ± 181.357
2025-09-11 18:43:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-5.787028, -4.1669497, -81.06367, -117.21142, -5.850709, -37.977676, -56.956036, -79.479324, -174.15384, -642.12494]
2025-09-11 18:43:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [16.0, 21.0, 127.0, 214.0, 22.0, 32.0, 119.0, 104.0, 212.0, 1000.0]
2025-09-11 18:43:21,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-120.48) for latency ExtremeClogL1U23
2025-09-11 18:43:21,202 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 21 hours, 55 minutes, 21 seconds)
2025-09-11 18:56:20,340 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:56:20,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:56:35,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -20.68482 ± 24.075
2025-09-11 18:56:35,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [3.3409975, -21.266273, -22.574059, -22.63626, -36.582123, -14.847746, -81.93672, -8.932869, 10.156088, -11.569203]
2025-09-11 18:56:35,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [39.0, 54.0, 47.0, 32.0, 41.0, 24.0, 157.0, 34.0, 38.0, 65.0]
2025-09-11 18:56:35,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-20.68) for latency ExtremeClogL1U23
2025-09-11 18:56:35,609 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 21 hours, 36 minutes, 3 seconds)
2025-09-11 19:09:31,060 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:09:31,062 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:10:41,759 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -153.51657 ± 269.495
2025-09-11 19:10:41,760 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-7.914783, 0.12861854, -11.4776325, -124.39874, 4.5748534, -22.311045, -2.2535543, 2.6876156, -659.3939, -714.80707]
2025-09-11 19:10:41,760 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [61.0, 57.0, 16.0, 147.0, 52.0, 72.0, 27.0, 39.0, 1000.0, 1000.0]
2025-09-11 19:10:41,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 21 hours, 40 minutes, 29 seconds)
2025-09-11 19:24:02,357 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:24:02,370 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:24:43,035 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -75.96362 ± 180.708
2025-09-11 19:24:43,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-21.873463, -30.521788, -17.494385, -35.398827, -29.977333, -3.1049197, -7.9343505, -17.249937, -616.13214, 20.05099]
2025-09-11 19:24:43,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [18.0, 27.0, 129.0, 43.0, 88.0, 14.0, 27.0, 12.0, 1000.0, 45.0]
2025-09-11 19:24:43,045 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 21 hours, 35 minutes, 57 seconds)
2025-09-11 19:37:17,737 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:37:17,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:38:23,709 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -136.35172 ± 235.224
2025-09-11 19:38:23,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [2.4991584, -603.5667, -29.828588, -8.21685, -2.9820168, -35.06104, -7.8344584, -5.007083, -66.737564, -606.7819]
2025-09-11 19:38:23,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [32.0, 1000.0, 36.0, 57.0, 13.0, 50.0, 36.0, 14.0, 46.0, 1000.0]
2025-09-11 19:38:23,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 21 hours, 29 minutes, 42 seconds)
2025-09-11 19:51:54,386 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:51:54,389 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:52:02,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -8.34619 ± 12.560
2025-09-11 19:52:02,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-21.61952, -11.511791, -8.236133, 13.906629, -10.790863, -31.873272, -9.980221, -3.1039891, -9.385057, 9.1322775]
2025-09-11 19:52:02,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [26.0, 12.0, 11.0, 31.0, 47.0, 37.0, 18.0, 35.0, 9.0, 47.0]
2025-09-11 19:52:02,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-8.35) for latency ExtremeClogL1U23
2025-09-11 19:52:02,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 21 hours, 17 minutes, 28 seconds)
2025-09-11 20:04:21,842 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:04:21,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:05:00,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -90.38364 ± 209.740
2025-09-11 20:05:00,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-46.85222, 6.4648256, -5.4314423, -51.185837, -4.2889667, -17.52538, -716.89233, -14.659691, -6.5954685, -46.869984]
2025-09-11 20:05:00,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [45.0, 77.0, 27.0, 71.0, 19.0, 29.0, 1000.0, 15.0, 23.0, 39.0]
2025-09-11 20:05:00,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 20 hours, 58 minutes, 49 seconds)
2025-09-11 20:17:51,032 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:17:51,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:18:01,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -14.02943 ± 11.981
2025-09-11 20:18:01,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-4.8280144, -18.815414, -19.524357, -19.179426, -43.659054, -1.73293, -4.2624116, -16.875603, -5.404711, -6.012396]
2025-09-11 20:18:01,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [10.0, 34.0, 25.0, 89.0, 31.0, 28.0, 23.0, 29.0, 74.0, 17.0]
2025-09-11 20:18:01,046 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 20 hours, 25 minutes, 14 seconds)
2025-09-11 20:30:47,665 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:30:47,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:30:57,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -11.01201 ± 17.061
2025-09-11 20:30:57,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-39.26816, -7.859828, 1.005875, 2.0653963, -11.240233, 15.614491, -2.275164, -2.8883889, -32.172314, -33.101772]
2025-09-11 20:30:57,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [56.0, 17.0, 38.0, 36.0, 20.0, 21.0, 12.0, 58.0, 40.0, 38.0]
2025-09-11 20:30:57,198 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 19 hours, 52 minutes, 14 seconds)
2025-09-11 20:43:50,581 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:43:50,585 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:43:55,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -9.62740 ± 6.514
2025-09-11 20:43:55,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-10.220131, -10.946545, -3.5346923, -5.808256, -23.490824, -13.848365, -12.170877, -11.890878, -6.5467563, 2.1833472]
2025-09-11 20:43:55,839 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [10.0, 28.0, 14.0, 17.0, 19.0, 23.0, 13.0, 14.0, 23.0, 25.0]
2025-09-11 20:43:55,848 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 19 hours, 26 minutes, 31 seconds)
2025-09-11 20:57:23,304 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:57:23,309 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:57:30,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -5.01872 ± 9.593
2025-09-11 20:57:30,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-6.267606, 0.06473563, 0.38023138, -14.732053, -5.0992517, -12.354906, -9.788294, -1.610196, 17.476133, -18.255949]
2025-09-11 20:57:30,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 23.0, 16.0, 32.0, 32.0, 15.0, 64.0, 10.0, 43.0, 27.0]
2025-09-11 20:57:30,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-5.02) for latency ExtremeClogL1U23
2025-09-11 20:57:30,976 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 19 hours, 12 minutes, 28 seconds)
2025-09-11 21:09:42,521 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:09:42,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:09:49,016 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -1.04468 ± 11.771
2025-09-11 21:09:49,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-2.2561214, -2.2348664, 18.160099, 12.167179, -6.634962, 13.047743, -14.582944, -6.380867, -0.7490356, -20.983063]
2025-09-11 21:09:49,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 14.0, 30.0, 28.0, 15.0, 33.0, 49.0, 12.0, 12.0, 26.0]
2025-09-11 21:09:49,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (-1.04) for latency ExtremeClogL1U23
2025-09-11 21:09:49,022 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 18 hours, 47 minutes, 40 seconds)
2025-09-11 21:22:52,266 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:22:52,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:22:59,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -14.54508 ± 17.923
2025-09-11 21:22:59,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-1.6521617, -21.050436, -10.260765, -19.920925, -14.687053, 9.039529, -49.84834, -13.471341, -35.65022, 12.050916]
2025-09-11 21:22:59,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 24.0, 12.0, 22.0, 17.0, 29.0, 69.0, 15.0, 35.0, 16.0]
2025-09-11 21:22:59,403 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 18 hours, 37 minutes, 31 seconds)
2025-09-11 21:35:33,549 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:35:33,552 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:36:08,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -67.03476 ± 204.346
2025-09-11 21:36:08,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-9.82001, -4.680603, 5.4222794, -1.2768571, -12.074587, 8.046013, -679.4041, -8.012229, 16.114065, 15.338388]
2025-09-11 21:36:08,606 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [24.0, 10.0, 40.0, 9.0, 20.0, 28.0, 1000.0, 17.0, 28.0, 39.0]
2025-09-11 21:36:08,616 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 18 hours, 28 minutes, 14 seconds)
2025-09-11 21:49:04,149 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:49:04,155 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:49:10,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -4.66623 ± 12.063
2025-09-11 21:49:10,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-6.987483, -35.371548, -12.45172, 3.4229038, -7.7449055, 4.5516753, -4.285138, -0.16317624, 2.2732828, 10.093818]
2025-09-11 21:49:10,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [43.0, 42.0, 12.0, 9.0, 13.0, 12.0, 12.0, 10.0, 10.0, 48.0]
2025-09-11 21:49:10,076 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 18 hours, 15 minutes, 59 seconds)
2025-09-11 22:02:43,677 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:02:43,685 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:02:54,441 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -1.55637 ± 17.123
2025-09-11 22:02:54,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [4.066855, 16.872103, -3.7117906, -28.529467, -4.5970287, -21.215054, 35.446213, -8.890108, -4.5009756, -0.5044546]
2025-09-11 22:02:54,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 39.0, 38.0, 99.0, 15.0, 28.0, 39.0, 43.0, 50.0, 15.0]
2025-09-11 22:02:54,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 18 hours, 5 minutes, 29 seconds)
2025-09-11 22:14:48,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:14:48,988 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:14:52,773 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -7.00433 ± 4.487
2025-09-11 22:14:52,773 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-4.8583016, -8.035764, -4.43831, -5.175433, -5.906246, 0.3174074, -8.212507, -5.3602023, -10.755955, -17.618004]
2025-09-11 22:14:52,773 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [9.0, 13.0, 9.0, 16.0, 18.0, 10.0, 9.0, 10.0, 20.0, 21.0]
2025-09-11 22:14:52,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 17 hours, 47 minutes, 1 second)
2025-09-11 22:28:07,454 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:28:07,459 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:28:11,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -5.41443 ± 8.296
2025-09-11 22:28:11,951 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-1.3590744, -24.840033, -6.8113647, -2.2720728, 10.355884, -7.702751, -8.525818, -6.074934, -1.1238191, -5.790364]
2025-09-11 22:28:11,951 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [13.0, 43.0, 11.0, 16.0, 20.0, 21.0, 11.0, 7.0, 8.0, 11.0]
2025-09-11 22:28:11,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 17 hours, 36 minutes, 23 seconds)
2025-09-11 22:40:35,343 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:40:35,345 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:41:09,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -42.18421 ± 120.178
2025-09-11 22:41:09,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [1.8389426, -9.588143, 6.180761, -401.7178, 0.033924975, 1.9669408, -10.565086, 3.1660924, 9.227662, -22.385387]
2025-09-11 22:41:09,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [9.0, 15.0, 10.0, 1000.0, 58.0, 14.0, 13.0, 12.0, 41.0, 25.0]
2025-09-11 22:41:09,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 17 hours, 20 minutes, 11 seconds)
2025-09-11 22:53:49,451 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:53:49,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:54:22,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -38.17661 ± 119.603
2025-09-11 22:54:22,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-13.2779045, 10.975243, 2.31688, 7.2211823, 2.2884219, 3.5844772, -2.9347966, -4.558816, -396.40836, 9.0275955]
2025-09-11 22:54:22,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [45.0, 13.0, 12.0, 10.0, 13.0, 17.0, 15.0, 9.0, 1000.0, 32.0]
2025-09-11 22:54:22,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 17 hours, 10 minutes, 22 seconds)
2025-09-11 23:07:09,891 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:07:09,892 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:07:16,147 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -5.78181 ± 6.637
2025-09-11 23:07:16,147 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-8.662882, -14.370812, 7.4745, -0.09625908, -5.3176856, -15.493185, -2.3863804, -1.1985579, -10.320295, -7.446577]
2025-09-11 23:07:16,148 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 26.0, 33.0, 30.0, 14.0, 21.0, 18.0, 20.0, 11.0, 35.0]
2025-09-11 23:07:16,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 16 hours, 44 minutes, 3 seconds)
2025-09-11 23:20:01,950 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:20:01,953 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:20:08,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 1.33153 ± 14.822
2025-09-11 23:20:08,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [14.047237, -0.35619804, -1.6794066, -4.8923087, 9.201866, 4.4838905, -3.3124657, 11.637536, 20.501307, -36.316124]
2025-09-11 23:20:08,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [24.0, 10.0, 13.0, 21.0, 17.0, 17.0, 10.0, 26.0, 21.0, 58.0]
2025-09-11 23:20:08,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (1.33) for latency ExtremeClogL1U23
2025-09-11 23:20:08,047 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 16 hours, 44 minutes, 55 seconds)
2025-09-11 23:32:57,297 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:32:57,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:33:04,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -11.07099 ± 10.032
2025-09-11 23:33:04,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [5.3225846, -16.44095, -2.331208, -8.864907, -6.7879496, -19.740793, -28.054852, -23.03395, -1.0322969, -9.745555]
2025-09-11 23:33:04,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 20.0, 12.0, 11.0, 11.0, 31.0, 41.0, 70.0, 18.0, 24.0]
2025-09-11 23:33:04,215 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 16 hours, 26 minutes, 2 seconds)
2025-09-11 23:45:53,127 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:45:53,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:45:59,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 0.86677 ± 10.768
2025-09-11 23:45:59,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-0.8469991, 29.51723, -0.16197993, -12.192284, -0.40536195, -5.9480834, -5.572878, 4.3233223, -5.547767, 5.5024743]
2025-09-11 23:45:59,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [8.0, 55.0, 8.0, 13.0, 14.0, 14.0, 22.0, 25.0, 42.0, 25.0]
2025-09-11 23:45:59,475 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 16 hours, 12 minutes, 32 seconds)
2025-09-11 23:58:50,346 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:58:50,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:58:57,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 2.24393 ± 16.202
2025-09-11 23:58:57,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [13.970541, -30.248388, 26.22318, 9.784919, -9.3908, 25.183207, -4.6223764, -0.51549923, -5.742215, -2.203265]
2025-09-11 23:58:57,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [20.0, 43.0, 34.0, 17.0, 12.0, 32.0, 9.0, 16.0, 13.0, 56.0]
2025-09-11 23:58:57,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (2.24) for latency ExtremeClogL1U23
2025-09-11 23:58:57,431 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 15 hours, 55 minutes, 42 seconds)
2025-09-12 00:11:41,882 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:11:41,887 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:11:47,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 0.95318 ± 9.818
2025-09-12 00:11:47,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [1.6535529, -4.853785, -10.032372, -5.546772, 18.978737, 4.836585, -6.450292, -10.372464, 15.917257, 5.401316]
2025-09-12 00:11:47,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 9.0, 25.0, 39.0, 22.0, 17.0, 12.0, 24.0, 23.0, 16.0]
2025-09-12 00:11:47,518 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 15 hours, 42 minutes, 1 second)
2025-09-12 00:24:40,952 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:24:40,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:24:46,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 0.17998 ± 10.465
2025-09-12 00:24:46,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [6.4576178, -9.310183, -10.523561, -1.7975382, -7.9058843, 27.408997, 0.88072443, 2.9768617, -1.0392287, -5.3479614]
2025-09-12 00:24:46,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [17.0, 11.0, 15.0, 13.0, 38.0, 46.0, 18.0, 21.0, 8.0, 9.0]
2025-09-12 00:24:46,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 15 hours, 30 minutes, 48 seconds)
2025-09-12 00:37:26,693 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:37:26,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:38:01,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -23.10773 ± 66.718
2025-09-12 00:38:01,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-219.60022, 7.6498146, -27.747194, -16.279318, -5.1106253, 11.544032, 2.1220698, -0.19991864, 19.083378, -2.5393536]
2025-09-12 00:38:01,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 26.0, 43.0, 29.0, 30.0, 31.0, 9.0, 9.0, 28.0, 15.0]
2025-09-12 00:38:01,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 15 hours, 22 minutes, 21 seconds)
2025-09-12 00:50:49,916 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:50:49,917 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:50:55,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -0.87925 ± 8.975
2025-09-12 00:50:55,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [7.258792, 12.710365, -10.685442, 5.5699253, -1.1185368, -0.011992413, -8.315012, -17.498362, 7.6401787, -4.3424363]
2025-09-12 00:50:55,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [23.0, 31.0, 32.0, 13.0, 10.0, 18.0, 16.0, 21.0, 16.0, 25.0]
2025-09-12 00:50:55,595 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 15 hours, 9 minutes, 5 seconds)
2025-09-12 01:03:47,944 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:03:47,955 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:04:22,740 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -9.28721 ± 68.131
2025-09-12 01:04:22,740 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-211.73424, 8.742115, 4.9079113, 15.565894, 26.375492, 2.0921962, 23.111755, 27.064346, 11.792064, -0.789668]
2025-09-12 01:04:22,740 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 13.0, 23.0, 21.0, 56.0, 10.0, 23.0, 28.0, 11.0, 21.0]
2025-09-12 01:04:22,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 15 hours, 2 minutes, 49 seconds)
2025-09-12 01:17:07,988 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:17:07,990 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:17:16,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 8.34159 ± 23.264
2025-09-12 01:17:16,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [52.50609, -0.7243973, -5.784321, -6.7395067, -19.82004, 39.830635, -8.490551, 11.1775055, 31.52874, -10.068289]
2025-09-12 01:17:16,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [48.0, 11.0, 29.0, 22.0, 37.0, 52.0, 17.0, 23.0, 40.0, 13.0]
2025-09-12 01:17:16,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (8.34) for latency ExtremeClogL1U23
2025-09-12 01:17:16,312 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 14 hours, 50 minutes, 31 seconds)
2025-09-12 01:30:01,801 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:30:01,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:30:09,089 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 12.00822 ± 16.008
2025-09-12 01:30:09,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [5.073617, 38.892754, -0.60168546, 14.480051, -3.1777852, 17.821648, 11.894831, -8.010423, 2.265713, 41.44353]
2025-09-12 01:30:09,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [21.0, 25.0, 20.0, 25.0, 10.0, 24.0, 32.0, 59.0, 15.0, 30.0]
2025-09-12 01:30:09,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (12.01) for latency ExtremeClogL1U23
2025-09-12 01:30:09,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 14 hours, 36 minutes, 4 seconds)
2025-09-12 01:43:11,917 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:43:11,922 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:43:21,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 18.28492 ± 25.018
2025-09-12 01:43:21,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-0.54927117, 4.731187, 57.163414, 65.957886, 32.332783, 14.982269, -16.742771, 1.0215226, 4.111039, 19.84111]
2025-09-12 01:43:21,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [18.0, 21.0, 47.0, 53.0, 78.0, 28.0, 24.0, 18.0, 14.0, 32.0]
2025-09-12 01:43:21,291 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (18.28) for latency ExtremeClogL1U23
2025-09-12 01:43:21,299 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 14 hours, 22 minutes, 20 seconds)
2025-09-12 01:56:04,155 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:56:04,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:56:11,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 14.16800 ± 23.972
2025-09-12 01:56:11,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [10.362245, -7.5047655, 51.65097, -2.9184783, 2.8833296, 5.319623, 48.314716, 1.8311731, 47.77548, -16.034334]
2025-09-12 01:56:11,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [15.0, 19.0, 62.0, 23.0, 16.0, 14.0, 46.0, 9.0, 32.0, 21.0]
2025-09-12 01:56:11,361 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 14 hours, 8 minutes, 24 seconds)
2025-09-12 02:09:14,549 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:09:14,553 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:09:22,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 9.63972 ± 12.254
2025-09-12 02:09:22,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [24.48834, 1.6911359, -0.1340772, 9.19701, 9.298292, 15.044066, -0.17860413, 9.042576, -7.9186554, 35.86716]
2025-09-12 02:09:22,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 14.0, 14.0, 50.0, 31.0, 24.0, 19.0, 30.0, 8.0, 30.0]
2025-09-12 02:09:22,059 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 13 hours, 51 minutes, 50 seconds)
2025-09-12 02:23:00,086 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:23:00,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:23:07,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 11.15079 ± 18.487
2025-09-12 02:23:07,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [46.217197, -11.094393, 3.970274, -11.602971, 25.09848, 4.4491167, 36.473663, 16.330498, 2.1365833, -0.47057495]
2025-09-12 02:23:07,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [51.0, 18.0, 19.0, 23.0, 36.0, 15.0, 38.0, 18.0, 14.0, 18.0]
2025-09-12 02:23:07,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 13 hours, 49 minutes, 40 seconds)
2025-09-12 02:35:34,856 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:35:34,860 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:35:41,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 3.29174 ± 22.655
2025-09-12 02:35:41,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [28.448692, -6.7029014, -6.662437, -10.6768875, -8.1589, -1.473274, 62.66974, -4.789447, -7.1831326, -12.554065]
2025-09-12 02:35:41,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [57.0, 15.0, 9.0, 12.0, 15.0, 18.0, 49.0, 23.0, 15.0, 20.0]
2025-09-12 02:35:41,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 13 hours, 32 minutes, 39 seconds)
2025-09-12 02:47:44,768 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:47:44,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:48:23,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -24.52940 ± 80.144
2025-09-12 02:48:23,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [5.989042, 9.140807, 0.55774003, -4.0340476, 44.865974, -81.04653, -245.3974, -16.727468, 11.313047, 30.044819]
2025-09-12 02:48:23,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [9.0, 24.0, 10.0, 26.0, 62.0, 129.0, 1000.0, 26.0, 36.0, 35.0]
2025-09-12 02:48:23,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 13 hours, 13 minutes, 31 seconds)
2025-09-12 03:01:14,322 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:01:14,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:01:23,125 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 7.84382 ± 22.151
2025-09-12 03:01:23,125 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-28.716576, 10.379941, 35.68962, 12.80397, 14.983562, -7.24795, 6.3738756, -12.835308, 51.472057, -4.4650173]
2025-09-12 03:01:23,125 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 15.0, 67.0, 29.0, 20.0, 28.0, 16.0, 18.0, 65.0, 14.0]
2025-09-12 03:01:23,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 13 hours, 2 minutes, 21 seconds)
2025-09-12 03:14:14,520 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:14:14,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:14:26,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 21.95460 ± 32.352
2025-09-12 03:14:26,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [81.54246, 5.87071, -19.907001, 0.8509302, -0.94422215, 12.040236, -0.9891894, 17.649046, 69.76181, 53.671234]
2025-09-12 03:14:26,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 13.0, 40.0, 68.0, 16.0, 28.0, 17.0, 42.0, 101.0, 44.0]
2025-09-12 03:14:26,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (21.95) for latency ExtremeClogL1U23
2025-09-12 03:14:26,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 12 hours, 47 minutes, 49 seconds)
2025-09-12 03:27:25,593 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:27:25,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:27:34,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 13.52749 ± 19.332
2025-09-12 03:27:34,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [23.667776, 17.068773, -4.22023, 6.244756, 6.783568, -8.164157, 16.494959, 63.076675, -2.6594813, 16.982222]
2025-09-12 03:27:34,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [51.0, 46.0, 16.0, 10.0, 34.0, 16.0, 24.0, 39.0, 15.0, 81.0]
2025-09-12 03:27:34,849 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 12 hours, 27 minutes, 45 seconds)
2025-09-12 03:40:10,022 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:40:10,023 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:40:21,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 13.98914 ± 27.169
2025-09-12 03:40:21,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-18.31836, 70.48957, 16.639936, -0.15563616, -10.932205, 11.256567, 57.13601, 2.9772403, -4.0633984, 14.861662]
2025-09-12 03:40:21,520 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [25.0, 73.0, 22.0, 23.0, 70.0, 18.0, 80.0, 39.0, 29.0, 28.0]
2025-09-12 03:40:21,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 12 hours, 17 minutes, 13 seconds)
2025-09-12 03:53:07,595 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:53:07,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:53:15,480 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 14.52382 ± 13.660
2025-09-12 03:53:15,481 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [18.358782, 4.024646, 32.275288, 6.440751, 42.06501, 1.9319648, 6.386856, 26.033237, 6.8034306, 0.91825503]
2025-09-12 03:53:15,481 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 30.0, 28.0, 42.0, 29.0, 11.0, 21.0, 42.0, 29.0, 44.0]
2025-09-12 03:53:15,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 12 hours, 6 minutes, 26 seconds)
2025-09-12 04:06:53,663 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:06:53,667 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:07:01,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 9.18235 ± 15.344
2025-09-12 04:07:01,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-3.0366287, -4.6967916, 11.850929, -9.097529, 2.5807462, 22.562922, 9.170878, 18.527302, -0.7529649, 44.714596]
2025-09-12 04:07:01,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [54.0, 46.0, 25.0, 14.0, 14.0, 25.0, 22.0, 17.0, 14.0, 42.0]
2025-09-12 04:07:01,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 12 hours, 1 minute, 58 seconds)
2025-09-12 04:19:34,049 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:19:34,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:19:40,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 16.76754 ± 15.543
2025-09-12 04:19:40,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [45.79296, 40.103916, 10.957826, -2.417957, 27.471148, 21.902433, 3.2726808, 5.45677, 9.409675, 5.7259746]
2025-09-12 04:19:40,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [40.0, 36.0, 15.0, 10.0, 26.0, 26.0, 18.0, 11.0, 42.0, 17.0]
2025-09-12 04:19:40,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 11 hours, 44 minutes, 37 seconds)
2025-09-12 04:32:52,496 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:32:52,501 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:33:04,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 13.26788 ± 30.241
2025-09-12 04:33:04,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [41.799248, 7.501777, 45.355537, -31.678656, 39.94521, 6.3308997, -27.702513, 2.5780222, 59.29499, -10.745717]
2025-09-12 04:33:04,043 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [35.0, 13.0, 32.0, 47.0, 99.0, 13.0, 57.0, 19.0, 51.0, 47.0]
2025-09-12 04:33:04,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 11 hours, 34 minutes, 9 seconds)
2025-09-12 04:45:06,877 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:45:06,880 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:45:12,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 5.12065 ± 15.738
2025-09-12 04:45:12,969 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [11.040179, 17.913437, 0.09704514, 15.310193, 17.890816, 6.3645606, -36.62151, 18.703985, -1.9992146, 2.5069964]
2025-09-12 04:45:12,969 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 44.0, 10.0, 18.0, 15.0, 13.0, 56.0, 16.0, 11.0, 22.0]
2025-09-12 04:45:12,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 11 hours, 14 minutes, 31 seconds)
2025-09-12 04:57:54,953 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:57:54,955 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:58:01,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 9.63243 ± 16.128
2025-09-12 04:58:01,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [0.034610007, 41.958313, 33.286617, -8.12412, -6.2541404, -0.5469419, 3.2241964, 8.419936, 2.7347646, 21.591063]
2025-09-12 04:58:01,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [37.0, 26.0, 40.0, 13.0, 20.0, 34.0, 10.0, 19.0, 16.0, 23.0]
2025-09-12 04:58:01,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 11 hours, 38 seconds)
2025-09-12 05:11:12,745 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:11:12,748 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:12:47,546 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -76.31921 ± 142.229
2025-09-12 05:12:47,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [10.995664, -230.88371, 15.783981, -288.96896, -347.4823, 41.719, -6.7007995, 18.000807, 1.9759991, 22.368267]
2025-09-12 05:12:47,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [36.0, 1000.0, 48.0, 1000.0, 1000.0, 35.0, 32.0, 26.0, 110.0, 30.0]
2025-09-12 05:12:47,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 10 hours, 57 minutes, 43 seconds)
2025-09-12 05:25:20,732 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:25:20,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:25:28,117 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 13.58890 ± 28.312
2025-09-12 05:25:28,117 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-40.33142, 45.357437, 1.4043274, 0.83813465, -2.5844755, 10.638689, 15.408629, 64.39198, -0.123642065, 40.889328]
2025-09-12 05:25:28,117 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [53.0, 40.0, 16.0, 10.0, 12.0, 15.0, 27.0, 52.0, 9.0, 31.0]
2025-09-12 05:25:28,150 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 44 minutes, 43 seconds)
2025-09-12 05:38:14,454 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:38:14,456 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:38:26,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 37.25525 ± 32.085
2025-09-12 05:38:26,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [94.76792, 5.8622427, 59.50971, 21.18314, 15.595065, 83.72055, 36.445244, -3.6835353, 49.88205, 9.270091]
2025-09-12 05:38:26,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [74.0, 20.0, 33.0, 34.0, 23.0, 79.0, 44.0, 17.0, 92.0, 25.0]
2025-09-12 05:38:26,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (37.26) for latency ExtremeClogL1U23
2025-09-12 05:38:26,891 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 10 hours, 27 minutes, 39 seconds)
2025-09-12 05:52:23,296 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:52:23,318 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:52:29,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 8.14682 ± 16.204
2025-09-12 05:52:29,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [31.533207, -0.051964123, -1.1560723, 0.52040625, 36.467525, -5.0820107, -0.15739655, -16.54872, 14.970675, 20.972559]
2025-09-12 05:52:29,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [44.0, 20.0, 9.0, 13.0, 37.0, 10.0, 14.0, 44.0, 26.0, 19.0]
2025-09-12 05:52:29,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 10 hours, 32 minutes, 26 seconds)
2025-09-12 06:04:31,107 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:04:31,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:04:39,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 23.18392 ± 21.558
2025-09-12 06:04:39,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [39.737164, 37.82111, 22.311611, 2.588318, 2.3653913, 51.664818, 15.744305, -3.0865216, 2.9645505, 59.72842]
2025-09-12 06:04:39,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 66.0, 35.0, 13.0, 14.0, 35.0, 17.0, 11.0, 18.0, 52.0]
2025-09-12 06:04:39,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 10 hours, 12 minutes, 58 seconds)
2025-09-12 06:17:14,024 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:17:14,025 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:17:25,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 14.73773 ± 49.703
2025-09-12 06:17:25,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [32.659355, 85.725586, 54.707783, -8.866037, -4.8276896, 25.288452, -107.87422, -4.099118, 49.26794, 25.395266]
2025-09-12 06:17:25,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [24.0, 143.0, 43.0, 15.0, 9.0, 20.0, 86.0, 10.0, 48.0, 23.0]
2025-09-12 06:17:25,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 9 hours, 41 minutes, 43 seconds)
2025-09-12 06:30:11,291 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:30:11,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:30:21,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 28.87139 ± 34.619
2025-09-12 06:30:21,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [36.36207, 4.673301, -8.67619, 2.3527133, 20.17374, 102.30921, 31.931484, 2.4622564, 14.804172, 82.32117]
2025-09-12 06:30:21,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [29.0, 21.0, 11.0, 15.0, 34.0, 99.0, 32.0, 24.0, 23.0, 72.0]
2025-09-12 06:30:21,571 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 31 minutes, 2 seconds)
2025-09-12 06:43:11,976 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:43:11,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:43:27,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 39.63895 ± 26.245
2025-09-12 06:43:27,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [72.028435, 40.72036, 74.09419, 31.745789, 7.8978796, 48.023186, -3.2758055, 73.86699, 32.903633, 18.384855]
2025-09-12 06:43:27,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [62.0, 37.0, 62.0, 76.0, 10.0, 56.0, 107.0, 79.0, 31.0, 23.0]
2025-09-12 06:43:27,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (39.64) for latency ExtremeClogL1U23
2025-09-12 06:43:27,356 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 19 minutes, 3 seconds)
2025-09-12 06:56:37,292 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:56:37,298 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:56:50,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 41.85310 ± 41.199
2025-09-12 06:56:50,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [7.187493, 127.650314, 34.617702, 97.260796, 45.14469, 0.9336066, 16.459019, 64.24949, -6.654205, 31.682087]
2025-09-12 06:56:50,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [49.0, 98.0, 39.0, 82.0, 60.0, 11.0, 25.0, 38.0, 27.0, 57.0]
2025-09-12 06:56:50,767 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (41.85) for latency ExtremeClogL1U23
2025-09-12 06:56:50,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 9 hours, 32 seconds)
2025-09-12 07:09:10,661 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:09:10,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:09:15,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 6.64317 ± 10.165
2025-09-12 07:09:15,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [4.110163, 15.037843, 1.7745532, 9.068376, 4.902552, 5.6397343, 5.8366337, 30.501476, 0.7539812, -11.193571]
2025-09-12 07:09:15,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [13.0, 23.0, 16.0, 18.0, 18.0, 15.0, 14.0, 36.0, 15.0, 10.0]
2025-09-12 07:09:15,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 8 hours, 49 minutes, 44 seconds)
2025-09-12 07:21:51,376 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:21:51,377 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:22:02,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 26.42093 ± 41.790
2025-09-12 07:22:02,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-2.0261252, 2.3159974, 12.206671, -0.3887464, -22.308731, 11.553494, 100.85138, 106.80505, 10.966028, 44.234325]
2025-09-12 07:22:02,402 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [16.0, 16.0, 13.0, 15.0, 70.0, 57.0, 79.0, 78.0, 13.0, 48.0]
2025-09-12 07:22:02,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 8 hours, 36 minutes, 53 seconds)
2025-09-12 07:34:33,482 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:34:33,484 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:34:42,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 23.60478 ± 25.403
2025-09-12 07:34:42,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-2.1983416, 20.116665, 16.114557, -0.88868034, 35.378426, 8.572567, 38.352192, 88.12282, 4.702013, 27.775612]
2025-09-12 07:34:42,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [21.0, 39.0, 37.0, 23.0, 71.0, 21.0, 27.0, 57.0, 15.0, 23.0]
2025-09-12 07:34:42,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 8 hours, 21 minutes, 57 seconds)
2025-09-12 07:47:19,624 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:47:19,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:47:30,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 24.88871 ± 36.255
2025-09-12 07:47:30,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [33.973846, 19.507618, 20.005932, 21.218401, 114.26044, -7.234061, 6.6097913, 9.631165, -24.739393, 55.65333]
2025-09-12 07:47:30,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [19.0, 35.0, 23.0, 33.0, 124.0, 30.0, 14.0, 16.0, 51.0, 45.0]
2025-09-12 07:47:30,342 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 8 hours, 6 minutes, 46 seconds)
2025-09-12 08:00:06,134 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:00:06,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:00:14,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 29.10315 ± 20.706
2025-09-12 08:00:14,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [27.655352, 4.4527793, 34.100628, 73.11507, 33.69646, 55.816326, 26.324594, 8.231805, 7.881974, 19.756477]
2025-09-12 08:00:14,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [30.0, 29.0, 29.0, 43.0, 20.0, 63.0, 39.0, 11.0, 33.0, 15.0]
2025-09-12 08:00:14,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 7 hours, 49 minutes, 8 seconds)
2025-09-12 08:12:49,601 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:12:49,621 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:13:01,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 12.19443 ± 34.166
2025-09-12 08:13:01,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-17.9126, 43.37144, -40.210613, -0.110122435, -32.22332, 68.15166, 48.860508, 4.3076344, 16.300392, 31.409332]
2025-09-12 08:13:01,673 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [26.0, 41.0, 105.0, 13.0, 65.0, 70.0, 61.0, 16.0, 27.0, 23.0]
2025-09-12 08:13:01,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 7 hours, 39 minutes, 9 seconds)
2025-09-12 08:25:38,952 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:25:38,954 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:25:47,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 18.22293 ± 30.154
2025-09-12 08:25:47,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [58.946888, -4.784304, -3.753742, 6.864999, 79.98065, -18.603472, 4.994869, -4.0496387, 38.686157, 23.946913]
2025-09-12 08:25:47,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [34.0, 11.0, 24.0, 20.0, 60.0, 46.0, 14.0, 29.0, 63.0, 27.0]
2025-09-12 08:25:47,813 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 7 hours, 26 minutes, 17 seconds)
2025-09-12 08:38:45,883 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:38:45,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:38:57,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 39.64664 ± 26.358
2025-09-12 08:38:57,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [33.77922, 54.166466, 97.34617, 3.2783852, 11.03271, 29.25212, 39.712025, 15.77845, 53.300526, 58.820274]
2025-09-12 08:38:57,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [36.0, 42.0, 116.0, 16.0, 37.0, 34.0, 37.0, 22.0, 36.0, 35.0]
2025-09-12 08:38:57,135 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 7 hours, 16 minutes, 49 seconds)
2025-09-12 08:51:40,879 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:51:40,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:51:51,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 18.97893 ± 22.447
2025-09-12 08:51:51,809 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [4.7887373, -2.7710848, 0.12571372, 23.362381, 40.064674, 29.59185, 70.073105, -9.421817, 15.106033, 18.869724]
2025-09-12 08:51:51,809 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [17.0, 13.0, 14.0, 78.0, 38.0, 27.0, 93.0, 38.0, 64.0, 21.0]
2025-09-12 08:51:51,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 7 hours, 4 minutes, 45 seconds)
2025-09-12 09:03:59,160 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:03:59,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:04:12,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 42.79266 ± 30.984
2025-09-12 09:04:12,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [112.55915, 13.216598, 73.44704, 8.448605, 35.11992, 28.65794, 27.369427, 21.435734, 37.368122, 70.304085]
2025-09-12 09:04:12,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [56.0, 71.0, 42.0, 28.0, 64.0, 20.0, 34.0, 15.0, 112.0, 59.0]
2025-09-12 09:04:12,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (42.79) for latency ExtremeClogL1U23
2025-09-12 09:04:12,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 6 hours, 49 minutes, 23 seconds)
2025-09-12 09:16:53,086 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:16:53,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:17:03,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 29.67430 ± 31.471
2025-09-12 09:17:03,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-8.3320465, 28.969938, 4.4437685, 54.5246, 70.77521, 87.740685, 23.367476, 32.81651, 16.629639, -14.192808]
2025-09-12 09:17:03,288 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [16.0, 34.0, 36.0, 52.0, 49.0, 56.0, 22.0, 26.0, 24.0, 55.0]
2025-09-12 09:17:03,295 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 6 hours, 36 minutes, 58 seconds)
2025-09-12 09:29:37,687 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:29:37,688 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:29:46,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 30.50699 ± 27.923
2025-09-12 09:29:46,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [36.99437, 11.9173765, 79.65592, 9.895105, 4.5432167, 28.147606, 4.8048515, 55.84292, 1.1978371, 72.07073]
2025-09-12 09:29:46,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [52.0, 24.0, 84.0, 16.0, 10.0, 26.0, 10.0, 63.0, 11.0, 48.0]
2025-09-12 09:29:46,997 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 6 hours, 23 minutes, 55 seconds)
2025-09-12 09:42:25,073 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:42:25,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:42:36,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 33.17503 ± 37.720
2025-09-12 09:42:36,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [122.14549, 40.86911, -1.5106925, 72.695335, 0.8404482, 46.07461, 3.0363812, 10.80491, 3.7257583, 33.068966]
2025-09-12 09:42:36,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [117.0, 39.0, 15.0, 59.0, 21.0, 58.0, 11.0, 25.0, 10.0, 41.0]
2025-09-12 09:42:36,103 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 9 minutes, 10 seconds)
2025-09-12 09:55:47,327 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:55:47,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:56:06,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 58.69349 ± 54.645
2025-09-12 09:56:06,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [13.567215, 19.508417, 63.924286, 36.735302, 111.08484, 143.4926, 1.4804288, 155.78792, 31.918676, 9.43518]
2025-09-12 09:56:06,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 39.0, 55.0, 30.0, 73.0, 165.0, 21.0, 179.0, 54.0, 32.0]
2025-09-12 09:56:06,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (58.69) for latency ExtremeClogL1U23
2025-09-12 09:56:06,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 5 hours, 59 minutes, 43 seconds)
2025-09-12 10:08:30,609 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:08:30,612 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:08:50,535 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 84.45688 ± 94.305
2025-09-12 10:08:50,535 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [69.949265, 35.273956, 80.79442, 17.99413, 238.15674, 16.245363, 290.8597, 38.726376, -7.724469, 64.29334]
2025-09-12 10:08:50,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [43.0, 31.0, 74.0, 69.0, 140.0, 33.0, 198.0, 69.0, 24.0, 38.0]
2025-09-12 10:08:50,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (84.46) for latency ExtremeClogL1U23
2025-09-12 10:08:50,542 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 5 hours, 49 minutes)
2025-09-12 10:21:07,610 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:21:07,612 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:21:22,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 60.85461 ± 61.774
2025-09-12 10:21:22,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [35.600838, 89.99668, 6.5445833, 108.2769, 13.584889, 32.42699, 31.10131, 213.27953, 79.77619, -2.0417237]
2025-09-12 10:21:22,700 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [46.0, 72.0, 15.0, 63.0, 17.0, 70.0, 30.0, 162.0, 57.0, 16.0]
2025-09-12 10:21:22,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 5 hours, 34 minutes, 28 seconds)
2025-09-12 10:34:11,654 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:34:11,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:34:26,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 51.62520 ± 45.935
2025-09-12 10:34:26,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-6.9883294, 73.47352, 144.51547, -15.080257, 48.115555, 89.30513, 34.163162, 38.946648, 21.807064, 87.994026]
2025-09-12 10:34:26,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 58.0, 122.0, 24.0, 83.0, 56.0, 31.0, 76.0, 19.0, 53.0]
2025-09-12 10:34:26,866 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 5 hours, 23 minutes, 19 seconds)
2025-09-12 10:46:57,252 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:46:57,253 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:47:13,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 73.46836 ± 71.234
2025-09-12 10:47:13,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [114.02829, 16.90902, 27.317623, 10.067596, 63.56836, 13.0605345, 259.0097, 75.3798, 105.40275, 49.939857]
2025-09-12 10:47:13,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [124.0, 24.0, 43.0, 16.0, 49.0, 17.0, 151.0, 35.0, 99.0, 33.0]
2025-09-12 10:47:13,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 5 hours, 10 minutes, 11 seconds)
2025-09-12 10:59:50,843 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:59:50,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:00:31,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: -10.43331 ± 79.701
2025-09-12 11:00:31,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [13.170241, -7.293087, 39.992126, 11.277545, 9.364699, -235.41833, -10.020234, -2.5929203, -6.4717607, 83.65863]
2025-09-12 11:00:31,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 16.0, 40.0, 86.0, 42.0, 1000.0, 38.0, 83.0, 24.0, 90.0]
2025-09-12 11:00:31,545 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 4 hours, 56 minutes, 21 seconds)
2025-09-12 11:13:23,049 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:13:23,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:13:47,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 105.27168 ± 107.141
2025-09-12 11:13:47,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [269.75336, 57.264057, 298.16614, 13.938185, 171.92284, 85.37871, -29.767145, 144.59532, -1.2714766, 42.736797]
2025-09-12 11:13:47,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [219.0, 52.0, 175.0, 27.0, 108.0, 45.0, 71.0, 97.0, 34.0, 61.0]
2025-09-12 11:13:47,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (105.27) for latency ExtremeClogL1U23
2025-09-12 11:13:47,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 4 hours, 45 minutes, 47 seconds)
2025-09-12 11:26:16,404 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:26:16,406 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:26:31,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 54.09855 ± 38.332
2025-09-12 11:26:31,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [9.440219, 62.250465, 66.38102, 81.04526, 27.655125, 130.62001, 77.64585, 8.379033, 6.4828744, 71.0857]
2025-09-12 11:26:31,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [35.0, 42.0, 71.0, 60.0, 34.0, 131.0, 99.0, 22.0, 17.0, 50.0]
2025-09-12 11:26:31,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 4 hours, 33 minutes, 37 seconds)
2025-09-12 11:39:10,293 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:39:10,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:39:27,328 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 63.57833 ± 47.590
2025-09-12 11:39:27,328 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [55.34398, 107.508385, -1.426453, 50.71349, 14.271009, 105.1599, 139.74863, 21.707777, 24.35624, 118.400314]
2025-09-12 11:39:27,328 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [51.0, 80.0, 77.0, 30.0, 35.0, 78.0, 117.0, 22.0, 58.0, 81.0]
2025-09-12 11:39:27,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 4 hours, 20 minutes, 1 second)
2025-09-12 11:52:10,451 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:52:10,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:52:55,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 40.02638 ± 125.027
2025-09-12 11:52:55,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [-1.4183087, 162.60779, 67.400055, 32.84164, 143.49785, 82.25434, 16.797266, 155.20079, 36.415108, -295.33267]
2025-09-12 11:52:55,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 156.0, 64.0, 79.0, 102.0, 58.0, 51.0, 117.0, 22.0, 1000.0]
2025-09-12 11:52:55,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 9 minutes, 40 seconds)
2025-09-12 12:05:38,647 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:05:38,654 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:05:58,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 107.72273 ± 117.544
2025-09-12 12:05:58,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [227.53107, 22.631334, 2.0647173, 26.339272, -26.35137, 114.45431, 348.83066, 241.95184, 64.10662, 55.668873]
2025-09-12 12:05:58,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [143.0, 28.0, 24.0, 30.0, 67.0, 67.0, 177.0, 100.0, 46.0, 46.0]
2025-09-12 12:05:58,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (107.72) for latency ExtremeClogL1U23
2025-09-12 12:05:59,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 3 hours, 55 minutes, 38 seconds)
2025-09-12 12:18:54,128 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:18:54,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:19:18,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 79.18226 ± 76.644
2025-09-12 12:19:18,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [10.835111, 75.15126, 54.708897, 197.09766, 242.305, 25.865759, 31.38044, 24.618942, 17.853216, 112.00637]
2025-09-12 12:19:18,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [96.0, 87.0, 54.0, 169.0, 293.0, 45.0, 26.0, 25.0, 27.0, 63.0]
2025-09-12 12:19:18,548 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 3 hours, 42 minutes, 45 seconds)
2025-09-12 12:31:41,029 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:31:41,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:31:53,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 44.05627 ± 33.714
2025-09-12 12:31:53,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [107.61079, 10.17496, -0.60057986, 51.840412, 17.971035, 76.26095, 31.874275, 11.214472, 59.46371, 74.752625]
2025-09-12 12:31:53,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [46.0, 19.0, 15.0, 46.0, 37.0, 56.0, 28.0, 55.0, 66.0, 80.0]
2025-09-12 12:31:53,376 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 3 hours, 29 minutes, 9 seconds)
2025-09-12 12:44:38,847 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:44:38,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:45:14,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 160.39371 ± 114.236
2025-09-12 12:45:14,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [388.17923, 73.11373, 171.60254, 113.43203, 247.89561, 83.4663, 295.0121, 19.33875, 31.311033, 180.58572]
2025-09-12 12:45:14,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [309.0, 76.0, 162.0, 71.0, 159.0, 132.0, 173.0, 27.0, 39.0, 164.0]
2025-09-12 12:45:14,946 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (160.39) for latency ExtremeClogL1U23
2025-09-12 12:45:14,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 17 minutes, 22 seconds)
2025-09-12 12:57:55,536 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:57:55,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:58:46,139 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 111.20715 ± 149.685
2025-09-12 12:58:46,139 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [95.404915, 459.3126, 85.80658, 47.403683, 103.309044, -117.80697, 283.61667, 9.36049, 107.18021, 38.484276]
2025-09-12 12:58:46,139 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [62.0, 255.0, 72.0, 39.0, 92.0, 1000.0, 139.0, 14.0, 84.0, 73.0]
2025-09-12 12:58:46,160 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 4 minutes, 20 seconds)
2025-09-12 13:11:24,638 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:11:24,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:11:37,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 23.36714 ± 27.014
2025-09-12 13:11:37,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [2.6952295, 11.16414, 44.561375, 18.954857, 29.926401, 12.278402, 11.898122, 94.990746, 8.2797785, -1.0776231]
2025-09-12 13:11:37,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [16.0, 98.0, 38.0, 37.0, 37.0, 71.0, 20.0, 85.0, 19.0, 26.0]
2025-09-12 13:11:37,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 2 hours, 50 minutes, 38 seconds)
2025-09-12 13:25:13,662 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:25:13,671 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:25:32,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 74.00005 ± 40.358
2025-09-12 13:25:32,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [133.07375, 10.692056, 36.314796, 84.150444, 121.73463, 32.326027, 61.53337, 54.01228, 80.66167, 125.50134]
2025-09-12 13:25:32,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [155.0, 21.0, 33.0, 69.0, 82.0, 35.0, 43.0, 44.0, 78.0, 110.0]
2025-09-12 13:25:32,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 2 hours, 38 minutes, 56 seconds)
2025-09-12 13:37:14,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:37:14,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:37:38,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 93.52441 ± 84.301
2025-09-12 13:37:38,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [21.734228, 233.47932, 5.8403378, 79.48663, 97.25585, 244.84068, 66.92873, 145.43565, -5.87637, 46.118935]
2025-09-12 13:37:38,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 142.0, 77.0, 77.0, 69.0, 149.0, 64.0, 70.0, 48.0, 143.0]
2025-09-12 13:37:38,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 24 minutes, 39 seconds)
2025-09-12 13:50:18,869 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:50:18,872 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:50:41,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 124.18901 ± 98.135
2025-09-12 13:50:41,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [296.86203, 140.30942, 55.420746, 41.149467, 293.4965, 111.01192, 30.902098, 182.99072, 67.49286, 22.254374]
2025-09-12 13:50:41,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [154.0, 100.0, 40.0, 32.0, 214.0, 55.0, 32.0, 109.0, 50.0, 19.0]
2025-09-12 13:50:41,229 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 10 minutes, 52 seconds)
2025-09-12 14:03:28,542 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:03:28,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:03:50,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 103.30135 ± 146.234
2025-09-12 14:03:50,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [45.722603, 493.0357, 34.921963, 38.924904, 60.448467, 255.02292, 4.317399, 24.549719, 23.193972, 52.87579]
2025-09-12 14:03:50,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [30.0, 295.0, 33.0, 50.0, 98.0, 186.0, 18.0, 32.0, 31.0, 49.0]
2025-09-12 14:03:50,952 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 1 hour, 57 minutes, 8 seconds)
2025-09-12 14:17:37,737 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:17:37,740 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:18:04,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 131.96584 ± 83.262
2025-09-12 14:18:04,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [251.61186, 148.42316, 121.933914, 109.78536, -8.593142, 113.420906, 149.69147, 185.50038, 247.91293, -0.028558472]
2025-09-12 14:18:04,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [154.0, 68.0, 95.0, 55.0, 33.0, 155.0, 95.0, 102.0, 155.0, 52.0]
2025-09-12 14:18:04,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 46 minutes, 19 seconds)
2025-09-12 14:29:35,040 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:29:35,042 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:30:01,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 135.19681 ± 91.405
2025-09-12 14:30:01,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [93.58492, 88.30276, 65.96505, 68.516754, 271.22626, 134.64128, 31.973694, 319.12613, 199.9586, 78.67252]
2025-09-12 14:30:01,263 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [104.0, 75.0, 41.0, 59.0, 148.0, 71.0, 39.0, 230.0, 137.0, 35.0]
2025-09-12 14:30:01,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 30 minutes, 16 seconds)
2025-09-12 14:42:42,832 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:42:42,834 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:43:27,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 61.35489 ± 110.061
2025-09-12 14:43:27,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [156.97845, 199.8399, 64.22233, 109.81605, 38.479073, 2.5648031, 219.78333, -5.053474, -5.2380366, -167.84357]
2025-09-12 14:43:27,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [78.0, 132.0, 76.0, 72.0, 23.0, 31.0, 110.0, 42.0, 13.0, 1000.0]
2025-09-12 14:43:27,283 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 18 minutes, 58 seconds)
2025-09-12 14:56:12,654 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:56:12,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:56:31,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 90.29676 ± 75.471
2025-09-12 14:56:31,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [74.487755, 113.92999, 149.18193, 173.2699, 18.69331, 242.50471, 74.36141, 0.7465337, 1.0277249, 54.764324]
2025-09-12 14:56:31,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [54.0, 85.0, 116.0, 119.0, 31.0, 122.0, 83.0, 12.0, 7.0, 48.0]
2025-09-12 14:56:31,438 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 5 minutes, 50 seconds)
2025-09-12 15:09:18,274 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:09:18,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:09:43,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 91.45438 ± 99.281
2025-09-12 15:09:43,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [2.5913467, 140.11879, 42.297863, 52.416943, -10.767214, 196.31938, 332.69583, 42.508617, 65.550285, 50.81194]
2025-09-12 15:09:43,832 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [40.0, 139.0, 32.0, 81.0, 19.0, 105.0, 230.0, 104.0, 61.0, 108.0]
2025-09-12 15:09:43,841 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 52 minutes, 42 seconds)
2025-09-12 15:22:19,698 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:22:19,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:22:59,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 192.79210 ± 105.906
2025-09-12 15:22:59,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [43.547337, 265.93066, 302.74173, 236.16019, 132.26308, 398.5817, 72.12304, 154.01448, 218.30621, 104.25253]
2025-09-12 15:22:59,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [41.0, 164.0, 216.0, 171.0, 100.0, 227.0, 72.0, 191.0, 154.0, 114.0]
2025-09-12 15:22:59,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1226 [INFO]: New best (192.79) for latency ExtremeClogL1U23
2025-09-12 15:22:59,573 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 38 minutes, 57 seconds)
2025-09-12 15:35:43,407 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:35:43,408 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:36:08,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 147.69434 ± 197.056
2025-09-12 15:36:08,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [17.813261, 124.132195, 42.930496, 18.07971, 195.26678, 226.28166, 7.3913383, 691.046, 7.140098, 146.86163]
2025-09-12 15:36:08,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [25.0, 66.0, 32.0, 27.0, 112.0, 139.0, 18.0, 365.0, 14.0, 106.0]
2025-09-12 15:36:08,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 26 minutes, 26 seconds)
2025-09-12 15:48:51,163 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:48:51,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:49:58,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 164.90518 ± 304.465
2025-09-12 15:49:58,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [111.94808, 10.721241, 132.60321, 325.06427, 320.34674, 19.093386, 320.64752, -494.69336, 771.4364, 131.88425]
2025-09-12 15:49:58,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [74.0, 47.0, 82.0, 192.0, 235.0, 43.0, 180.0, 1000.0, 469.0, 85.0]
2025-09-12 15:49:58,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 13 minutes, 18 seconds)
2025-09-12 16:02:34,124 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:02:34,129 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:03:02,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1221 [DEBUG]: Total Reward: 133.43159 ± 129.094
2025-09-12 16:03:02,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1222 [DEBUG]: All rewards: [76.42388, 66.769714, 117.98804, 208.79773, 263.13513, 441.85703, -19.821213, 31.25928, 67.21709, 80.68919]
2025-09-12 16:03:02,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1223 [DEBUG]: All trajectory lengths: [56.0, 59.0, 102.0, 126.0, 127.0, 410.0, 34.0, 33.0, 40.0, 56.0]
2025-09-12 16:03:02,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc25-ant):1251 [DEBUG]: Training session finished
