2025-09-11 18:06:13,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc5-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:06:13,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc5-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:06:13,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x14f6576fcd90>}
2025-09-11 18:06:13,429 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1111 [DEBUG]: using device: cuda
2025-09-11 18:06:13,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1133 [INFO]: Creating new trainer
2025-09-11 18:06:13,459 baseline-mbpac-noiseperc5-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 18:06:13,459 baseline-mbpac-noiseperc5-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:06:13,469 baseline-mbpac-noiseperc5-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 18:06:14,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1194 [DEBUG]: Starting training session...
2025-09-11 18:06:14,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:18:23,308 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:18:23,325 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:20:05,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: -174.52609 ± 212.036
2025-09-11 18:20:05,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [-471.2677, -34.53247, -481.2966, -529.76636, -11.832095, -19.294676, -72.3974, -50.129055, 24.031996, -98.776505]
2025-09-11 18:20:05,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 91.0, 1000.0, 1000.0, 146.0, 47.0, 103.0, 135.0, 29.0, 56.0]
2025-09-11 18:20:05,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (-174.53) for latency ExtremeClogL1U23
2025-09-11 18:20:05,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 22 hours, 50 minutes, 37 seconds)
2025-09-11 18:33:12,866 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:33:12,889 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:34:23,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 23.68042 ± 53.530
2025-09-11 18:34:23,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [34.621857, -54.171326, -9.48936, 36.67313, 13.54623, 3.858196, 28.761671, 24.767511, 164.36938, -6.1330857]
2025-09-11 18:34:23,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [351.0, 375.0, 186.0, 101.0, 112.0, 72.0, 166.0, 50.0, 1000.0, 68.0]
2025-09-11 18:34:23,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (23.68) for latency ExtremeClogL1U23
2025-09-11 18:34:23,108 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 22 hours, 59 minutes, 3 seconds)
2025-09-11 18:46:15,303 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:46:15,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:48:11,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 73.46645 ± 60.258
2025-09-11 18:48:11,238 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [79.202614, 105.35113, -0.33740225, 29.26342, 49.92584, 77.84771, 141.7638, 9.538209, 204.73766, 37.371502]
2025-09-11 18:48:11,238 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [176.0, 1000.0, 479.0, 250.0, 132.0, 504.0, 359.0, 369.0, 743.0, 101.0]
2025-09-11 18:48:11,239 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (73.47) for latency ExtremeClogL1U23
2025-09-11 18:48:11,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 22 hours, 36 minutes, 16 seconds)
2025-09-11 19:01:19,616 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:01:19,620 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:03:05,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 123.60960 ± 94.488
2025-09-11 19:03:05,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [147.7918, 32.814323, 122.96772, 145.4126, 77.03044, 78.743645, 357.95575, 34.475597, 199.75053, 39.153503]
2025-09-11 19:03:05,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [731.0, 119.0, 454.0, 183.0, 278.0, 104.0, 1000.0, 140.0, 384.0, 306.0]
2025-09-11 19:03:05,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (123.61) for latency ExtremeClogL1U23
2025-09-11 19:03:05,744 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 22 hours, 44 minutes, 30 seconds)
2025-09-11 19:16:10,796 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:16:10,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:19:00,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 234.96835 ± 166.170
2025-09-11 19:19:00,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [-3.7486014, 274.27374, 470.6363, 172.8593, 106.83905, 306.94907, 269.20673, 59.05374, 147.47557, 546.1388]
2025-09-11 19:19:00,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 1000.0, 1000.0, 540.0, 251.0, 1000.0, 1000.0, 73.0, 248.0, 786.0]
2025-09-11 19:19:00,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (234.97) for latency ExtremeClogL1U23
2025-09-11 19:19:00,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 23 hours, 2 minutes, 36 seconds)
2025-09-11 19:30:57,579 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:30:57,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:33:05,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 271.03955 ± 217.654
2025-09-11 19:33:05,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [144.36009, 200.57721, 257.85645, 705.73376, 207.97818, 28.34337, 517.18024, 37.909496, 509.48102, 100.97564]
2025-09-11 19:33:05,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [405.0, 533.0, 263.0, 990.0, 402.0, 52.0, 1000.0, 67.0, 717.0, 110.0]
2025-09-11 19:33:05,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (271.04) for latency ExtremeClogL1U23
2025-09-11 19:33:05,865 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 22 hours, 52 minutes, 37 seconds)
2025-09-11 19:46:06,333 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:46:06,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:48:10,591 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 359.12625 ± 284.987
2025-09-11 19:48:10,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [336.31638, 27.731693, 991.35596, 82.7478, 394.5475, 521.6333, 575.71576, 83.05159, 108.6229, 469.53955]
2025-09-11 19:48:10,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [358.0, 34.0, 1000.0, 182.0, 480.0, 561.0, 1000.0, 78.0, 135.0, 549.0]
2025-09-11 19:48:10,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (359.13) for latency ExtremeClogL1U23
2025-09-11 19:48:10,599 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 22 hours, 52 minutes, 31 seconds)
2025-09-11 20:01:20,220 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:01:20,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:04:18,313 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 586.22815 ± 324.596
2025-09-11 20:04:18,320 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1129.1533, 702.6231, 122.28552, 374.02612, 518.5662, 113.280846, 382.6484, 950.09686, 829.94354, 739.6577]
2025-09-11 20:04:18,320 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 533.0, 173.0, 413.0, 1000.0, 87.0, 334.0, 1000.0, 1000.0, 731.0]
2025-09-11 20:04:18,320 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (586.23) for latency ExtremeClogL1U23
2025-09-11 20:04:18,335 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 23 hours, 20 minutes, 34 seconds)
2025-09-11 20:17:03,820 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:17:03,829 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:20:19,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 509.57388 ± 337.623
2025-09-11 20:20:19,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [181.38046, 869.02844, 574.11725, 451.8467, 78.88009, 91.181335, 489.92203, 1226.126, 481.23425, 652.02185]
2025-09-11 20:20:19,439 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [155.0, 581.0, 1000.0, 1000.0, 73.0, 64.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 20:20:19,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 23 hours, 25 minutes, 33 seconds)
2025-09-11 20:33:32,163 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:33:32,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:37:15,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 980.03448 ± 509.049
2025-09-11 20:37:15,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1465.3323, 397.2133, 1312.814, 488.9604, 255.58087, 1432.9083, 709.13116, 1546.7191, 1602.2738, 589.41235]
2025-09-11 20:37:15,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 284.0, 952.0, 1000.0, 196.0, 1000.0, 462.0, 1000.0, 1000.0, 1000.0]
2025-09-11 20:37:15,710 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (980.03) for latency ExtremeClogL1U23
2025-09-11 20:37:15,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 23 hours, 28 minutes, 32 seconds)
2025-09-11 20:50:09,498 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:50:09,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:53:35,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 684.11298 ± 297.824
2025-09-11 20:53:35,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [574.7222, 601.5772, 663.7231, 1018.8917, 423.79343, 212.0998, 565.40485, 751.9759, 676.69366, 1352.2478]
2025-09-11 20:53:35,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [374.0, 1000.0, 1000.0, 1000.0, 335.0, 145.0, 403.0, 1000.0, 1000.0, 1000.0]
2025-09-11 20:53:35,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 23 hours, 52 minutes, 42 seconds)
2025-09-11 21:05:28,801 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:05:28,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:07:17,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 537.41071 ± 301.172
2025-09-11 21:07:17,493 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [970.99, 140.73503, 550.19165, 351.54285, 422.67953, 245.96095, 195.9201, 928.11993, 646.10657, 921.86035]
2025-09-11 21:07:17,494 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [666.0, 79.0, 350.0, 248.0, 245.0, 168.0, 128.0, 617.0, 343.0, 1000.0]
2025-09-11 21:07:17,498 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 23 hours, 12 minutes, 25 seconds)
2025-09-11 21:20:21,769 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:20:21,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:23:31,702 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 651.82764 ± 295.701
2025-09-11 21:23:31,709 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1339.3715, 631.49207, 606.17816, 723.4255, 720.48425, 101.867134, 723.2124, 503.87622, 424.90057, 743.4688]
2025-09-11 21:23:31,709 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 323.0, 393.0, 446.0, 1000.0, 75.0, 1000.0, 1000.0, 1000.0, 387.0]
2025-09-11 21:23:31,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 22 hours, 58 minutes, 28 seconds)
2025-09-11 21:36:07,105 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:36:07,109 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:39:36,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 905.04846 ± 510.239
2025-09-11 21:39:36,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [541.0876, 1242.1644, 1875.6473, 209.358, 666.8702, 476.13525, 490.06454, 1176.666, 807.7963, 1564.6945]
2025-09-11 21:39:36,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 641.0, 845.0, 107.0, 1000.0, 1000.0, 238.0, 572.0, 1000.0, 1000.0]
2025-09-11 21:39:36,467 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 22 hours, 43 minutes, 40 seconds)
2025-09-11 21:53:01,171 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:53:01,175 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:55:46,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 624.21692 ± 440.951
2025-09-11 21:55:46,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [20.436655, 458.8901, 349.64563, 1569.5276, 620.48254, 329.6884, 1147.0442, 203.961, 813.2851, 729.2085]
2025-09-11 21:55:46,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [24.0, 1000.0, 194.0, 1000.0, 275.0, 143.0, 1000.0, 123.0, 1000.0, 1000.0]
2025-09-11 21:55:46,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 22 hours, 14 minutes, 47 seconds)
2025-09-11 22:08:48,594 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:08:48,598 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:11:16,794 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 901.35175 ± 712.512
2025-09-11 22:11:16,797 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [267.6768, 176.55052, 2274.685, 2210.0195, 411.4632, 859.2679, 779.57574, 837.97485, 367.1485, 829.1558]
2025-09-11 22:11:16,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [138.0, 77.0, 973.0, 1000.0, 214.0, 1000.0, 332.0, 337.0, 183.0, 1000.0]
2025-09-11 22:11:16,806 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 21 hours, 45 minutes, 14 seconds)
2025-09-11 22:24:01,029 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:24:01,038 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:25:49,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 442.34433 ± 234.976
2025-09-11 22:25:49,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [447.57996, 785.6657, 213.96774, 443.52524, 303.12524, 528.9256, 922.2464, 135.11525, 295.44525, 347.84708]
2025-09-11 22:25:49,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [166.0, 1000.0, 88.0, 196.0, 161.0, 1000.0, 1000.0, 64.0, 162.0, 122.0]
2025-09-11 22:25:49,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 21 hours, 43 minutes, 39 seconds)
2025-09-11 22:37:40,119 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:37:40,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:39:57,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 609.60840 ± 453.474
2025-09-11 22:39:57,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [55.1033, 997.50446, 248.99348, 561.26794, 763.6324, 327.509, 1714.0991, 677.4818, 494.4207, 256.07108]
2025-09-11 22:39:57,051 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [72.0, 1000.0, 135.0, 241.0, 1000.0, 139.0, 1000.0, 1000.0, 239.0, 104.0]
2025-09-11 22:39:57,058 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 20 hours, 53 minutes, 19 seconds)
2025-09-11 22:52:47,940 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:52:47,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:56:36,703 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1936.17346 ± 709.526
2025-09-11 22:56:36,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2554.4517, 2694.203, 2644.1394, 893.3594, 2152.1453, 1278.7986, 2591.057, 1261.2831, 954.2812, 2338.016]
2025-09-11 22:56:36,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 365.0, 1000.0, 1000.0, 1000.0, 484.0, 417.0, 1000.0]
2025-09-11 22:56:36,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (1936.17) for latency ExtremeClogL1U23
2025-09-11 22:56:36,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 20 hours, 47 minutes, 28 seconds)
2025-09-11 23:09:08,126 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:09:08,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:12:47,997 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1758.89185 ± 923.621
2025-09-11 23:12:48,003 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2762.581, 2381.5962, 186.29895, 2354.5654, 2657.4539, 2260.7551, 810.65533, 599.9916, 2496.2468, 1078.7739]
2025-09-11 23:12:48,004 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 852.0, 155.0, 915.0, 1000.0, 848.0, 1000.0, 212.0, 1000.0, 1000.0]
2025-09-11 23:12:48,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 20 hours, 32 minutes, 19 seconds)
2025-09-11 23:26:05,812 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:26:05,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:28:21,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1155.03638 ± 692.544
2025-09-11 23:28:21,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [954.6759, 27.346493, 1509.3204, 1138.8893, 1165.2726, 1069.6373, 1492.8503, 703.95074, 662.7017, 2825.7175]
2025-09-11 23:28:21,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 21.0, 565.0, 434.0, 466.0, 353.0, 550.0, 277.0, 255.0, 1000.0]
2025-09-11 23:28:21,944 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 20 hours, 17 minutes, 57 seconds)
2025-09-11 23:40:00,841 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:40:00,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:43:07,214 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1657.69177 ± 1084.708
2025-09-11 23:43:07,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2596.7422, 2985.0864, 2730.499, 2705.3137, 0.8893328, 2352.5493, 962.62726, 387.1128, 436.30328, 1419.7955]
2025-09-11 23:43:07,216 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 17.0, 1000.0, 1000.0, 142.0, 140.0, 483.0]
2025-09-11 23:43:07,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 20 hours, 5 minutes, 48 seconds)
2025-09-11 23:56:45,111 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:56:45,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:00:07,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2168.00781 ± 948.320
2025-09-12 00:00:07,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1383.6151, 1702.2225, 2261.9617, 3082.7283, 3238.834, 2288.6812, 3136.1526, 176.90651, 2994.9517, 1414.0245]
2025-09-12 00:00:07,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [436.0, 721.0, 809.0, 1000.0, 1000.0, 821.0, 1000.0, 63.0, 1000.0, 510.0]
2025-09-12 00:00:07,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2168.01) for latency ExtremeClogL1U23
2025-09-12 00:00:07,213 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 20 hours, 34 minutes, 36 seconds)
2025-09-12 00:11:53,076 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:11:53,079 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:14:08,620 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1222.97632 ± 896.663
2025-09-12 00:14:08,624 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [626.844, 2044.9347, 79.9824, 3134.4114, 372.30594, 1112.1835, 1988.8077, 882.60297, 490.97195, 1496.7186]
2025-09-12 00:14:08,624 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [217.0, 740.0, 52.0, 1000.0, 131.0, 361.0, 642.0, 285.0, 1000.0, 506.0]
2025-09-12 00:14:08,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 19 hours, 38 minutes, 29 seconds)
2025-09-12 00:26:30,623 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:26:30,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:29:44,296 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2185.67188 ± 1092.712
2025-09-12 00:29:44,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3110.6582, 2974.9067, 3072.1248, 3003.7007, 2471.532, 208.66042, 3356.1572, 1133.844, 1909.026, 616.1102]
2025-09-12 00:29:44,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 747.0, 126.0, 1000.0, 359.0, 635.0, 185.0]
2025-09-12 00:29:44,302 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2185.67) for latency ExtremeClogL1U23
2025-09-12 00:29:44,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 19 hours, 14 minutes, 4 seconds)
2025-09-12 00:42:41,998 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:42:42,005 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:46:07,010 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2058.88916 ± 1056.222
2025-09-12 00:46:07,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2677.8503, 2089.229, 3145.7195, 226.21432, 199.78497, 2784.644, 2572.3123, 1200.524, 2912.1707, 2780.4417]
2025-09-12 00:46:07,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 799.0, 1000.0, 100.0, 80.0, 1000.0, 1000.0, 437.0, 1000.0, 1000.0]
2025-09-12 00:46:07,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 19 hours, 10 minutes, 43 seconds)
2025-09-12 00:59:09,874 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:59:09,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:02:07,635 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1670.88025 ± 937.704
2025-09-12 01:02:07,642 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1097.5068, 3165.4968, 746.8921, 1042.024, 2677.7954, 2326.9236, 1212.4572, 1054.3678, 479.08377, 2906.2546]
2025-09-12 01:02:07,642 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 334.0, 341.0, 717.0, 668.0, 375.0, 1000.0, 161.0, 898.0]
2025-09-12 01:02:07,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 19 hours, 13 minutes, 30 seconds)
2025-09-12 01:15:07,087 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:15:07,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:19:14,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2495.67627 ± 1018.723
2025-09-12 01:19:14,510 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3384.2244, 809.0848, 2957.2783, 3625.085, 3269.3625, 2442.4578, 3002.2625, 3216.6543, 879.4005, 1370.9532]
2025-09-12 01:19:14,510 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 276.0, 1000.0, 1000.0, 1000.0, 742.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 01:19:14,510 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2495.68) for latency ExtremeClogL1U23
2025-09-12 01:19:14,518 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 18 hours, 59 minutes, 21 seconds)
2025-09-12 01:31:52,400 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:31:52,404 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:34:57,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1550.77246 ± 1010.720
2025-09-12 01:34:57,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1025.8741, 1350.8274, 582.9381, 1493.6155, 3015.0085, 2874.6538, 169.51097, 1087.6877, 3097.5352, 810.07324]
2025-09-12 01:34:57,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 433.0, 200.0, 479.0, 948.0, 1000.0, 1000.0, 330.0, 919.0, 315.0]
2025-09-12 01:34:57,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 19 hours, 7 minutes, 28 seconds)
2025-09-12 01:46:35,903 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:46:35,905 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:49:37,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2202.60400 ± 1436.428
2025-09-12 01:49:37,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3467.5615, 3393.6155, 185.33113, 2913.5183, 3316.4617, 3384.9358, 778.9817, 3669.083, 195.00543, 721.5474]
2025-09-12 01:49:37,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 76.0, 1000.0, 1000.0, 1000.0, 257.0, 1000.0, 86.0, 225.0]
2025-09-12 01:49:37,127 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 18 hours, 38 minutes, 19 seconds)
2025-09-12 02:02:04,637 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:02:04,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:04:26,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1801.72913 ± 1408.822
2025-09-12 02:04:26,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2901.843, 3415.0144, 3372.9868, 2206.9116, 79.86889, 103.88572, 100.00708, 631.7334, 3604.2864, 1600.7542]
2025-09-12 02:04:26,249 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [748.0, 934.0, 1000.0, 613.0, 42.0, 41.0, 44.0, 185.0, 1000.0, 555.0]
2025-09-12 02:04:26,255 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 18 hours, 49 seconds)
2025-09-12 02:17:31,461 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:17:31,464 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:20:08,349 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1516.42566 ± 1262.847
2025-09-12 02:20:08,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1026.7794, 157.31485, 821.2891, 3361.5198, 2976.411, 237.99493, 1194.0507, 1860.6564, 3476.1726, 52.06808]
2025-09-12 02:20:08,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [302.0, 70.0, 268.0, 970.0, 1000.0, 94.0, 1000.0, 1000.0, 1000.0, 48.0]
2025-09-12 02:20:08,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 17 hours, 40 minutes, 57 seconds)
2025-09-12 02:32:53,273 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:32:53,276 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:36:35,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2682.81201 ± 1293.197
2025-09-12 02:36:35,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [90.896355, 865.63885, 1900.0519, 3121.909, 2339.238, 3713.1475, 4036.0854, 3662.1853, 3088.2322, 4010.7366]
2025-09-12 02:36:35,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [81.0, 253.0, 1000.0, 1000.0, 632.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 02:36:35,333 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2682.81) for latency ExtremeClogL1U23
2025-09-12 02:36:35,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 17 hours, 16 minutes, 26 seconds)
2025-09-12 02:49:04,643 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:49:04,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:53:19,583 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2764.93677 ± 771.309
2025-09-12 02:53:19,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3344.9294, 3504.592, 3032.2148, 2108.286, 3535.6619, 1361.857, 3154.6956, 3544.2874, 2318.672, 1744.1716]
2025-09-12 02:53:19,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 687.0, 525.0]
2025-09-12 02:53:19,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2764.94) for latency ExtremeClogL1U23
2025-09-12 02:53:19,615 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 17 hours, 14 minutes, 33 seconds)
2025-09-12 03:05:38,603 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:05:38,619 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:08:36,072 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2200.27881 ± 1185.391
2025-09-12 03:08:36,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1704.3035, 3799.8745, 1908.9185, 2127.921, 3322.9119, 3913.8967, 718.00244, 79.386345, 2577.2405, 1850.3337]
2025-09-12 03:08:36,078 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [504.0, 1000.0, 544.0, 1000.0, 958.0, 1000.0, 199.0, 41.0, 806.0, 514.0]
2025-09-12 03:08:36,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 17 hours, 6 minutes, 46 seconds)
2025-09-12 03:21:12,169 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:21:12,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:25:02,783 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2777.72778 ± 1181.391
2025-09-12 03:25:02,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [727.0145, 4054.1062, 1141.6239, 1786.9292, 1979.7391, 3753.0867, 3153.7554, 3789.6208, 3583.9727, 3807.4287]
2025-09-12 03:25:02,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [247.0, 1000.0, 1000.0, 519.0, 590.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 03:25:02,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2777.73) for latency ExtremeClogL1U23
2025-09-12 03:25:02,790 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 17 hours, 11 minutes, 47 seconds)
2025-09-12 03:37:32,402 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:37:32,407 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:40:07,076 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2142.87549 ± 1486.142
2025-09-12 03:40:07,089 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [318.84125, 389.0301, 442.38724, 3988.152, 417.76096, 2576.1277, 3962.7466, 2964.478, 3003.964, 3365.2654]
2025-09-12 03:40:07,089 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [127.0, 113.0, 133.0, 1000.0, 114.0, 624.0, 1000.0, 829.0, 755.0, 864.0]
2025-09-12 03:40:07,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 16 hours, 47 minutes, 44 seconds)
2025-09-12 03:52:25,569 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:52:25,584 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:55:08,108 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1911.36792 ± 1341.399
2025-09-12 03:55:08,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1947.0831, 300.01764, 3502.7678, 267.6508, 1901.7023, 1974.2886, 766.46277, 756.65405, 3785.1172, 3911.9358]
2025-09-12 03:55:08,114 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [531.0, 88.0, 933.0, 162.0, 643.0, 482.0, 1000.0, 183.0, 1000.0, 1000.0]
2025-09-12 03:55:08,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 16 hours, 13 minutes, 58 seconds)
2025-09-12 04:07:41,274 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:07:41,279 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:10:09,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2096.12842 ± 1373.404
2025-09-12 04:10:09,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4091.8416, 4114.543, 2167.4387, 1339.5779, 631.8037, 895.55206, 1544.4781, 2302.7122, 169.66989, 3703.668]
2025-09-12 04:10:09,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 542.0, 387.0, 200.0, 221.0, 445.0, 576.0, 113.0, 1000.0]
2025-09-12 04:10:09,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 15 hours, 37 minutes, 18 seconds)
2025-09-12 04:22:09,294 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:22:09,297 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:25:21,630 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2140.74780 ± 1443.163
2025-09-12 04:25:21,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2189.501, 1578.1188, 2673.1897, 3671.2334, 8.961576, 1274.096, 412.11337, 4211.1846, 4227.2734, 1161.807]
2025-09-12 04:25:21,634 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [627.0, 1000.0, 721.0, 1000.0, 25.0, 412.0, 1000.0, 1000.0, 1000.0, 309.0]
2025-09-12 04:25:21,643 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 15 hours, 21 minutes, 6 seconds)
2025-09-12 04:38:15,971 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:38:15,979 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:41:34,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2604.84106 ± 1379.283
2025-09-12 04:41:34,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [522.981, 2590.162, 4000.4148, 1548.9563, 3955.529, 3359.3765, 4067.694, 1167.8451, 3992.3257, 843.1266]
2025-09-12 04:41:34,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [151.0, 679.0, 1000.0, 1000.0, 1000.0, 965.0, 1000.0, 372.0, 1000.0, 201.0]
2025-09-12 04:41:34,891 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 15 hours, 3 minutes, 6 seconds)
2025-09-12 04:54:24,003 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:54:24,010 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:57:35,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2475.08765 ± 1352.155
2025-09-12 04:57:35,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [652.5157, 2983.133, 3706.954, 3651.705, 3412.3333, 867.4705, 1639.4979, 3723.5405, 363.95135, 3749.7769]
2025-09-12 04:57:35,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [221.0, 1000.0, 1000.0, 1000.0, 884.0, 226.0, 496.0, 1000.0, 131.0, 1000.0]
2025-09-12 04:57:35,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 14 hours, 58 minutes, 38 seconds)
2025-09-12 05:09:45,686 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:09:45,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:13:16,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2389.65967 ± 1290.536
2025-09-12 05:13:16,176 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [64.61602, 3242.9797, 3307.1934, 3830.7937, 3873.418, 2055.2146, 790.7988, 1005.2035, 3291.6614, 2434.7158]
2025-09-12 05:13:16,176 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [36.0, 899.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 255.0, 858.0, 705.0]
2025-09-12 05:13:16,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 14 hours, 50 minutes, 43 seconds)
2025-09-12 05:25:13,661 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:25:13,664 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:29:00,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2844.49707 ± 1160.250
2025-09-12 05:29:00,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3839.4434, 3570.8328, 3628.6543, 2210.576, 3355.4001, 2002.9944, 503.87704, 4089.005, 1432.0967, 3812.091]
2025-09-12 05:29:00,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 561.0, 1000.0, 540.0, 133.0, 1000.0, 1000.0, 1000.0]
2025-09-12 05:29:00,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (2844.50) for latency ExtremeClogL1U23
2025-09-12 05:29:00,778 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 14 hours, 43 minutes, 12 seconds)
2025-09-12 05:41:49,361 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:41:49,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:45:48,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3342.22974 ± 1001.365
2025-09-12 05:45:48,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3518.95, 3978.698, 4124.5884, 3616.0686, 3690.0283, 3874.2458, 803.6744, 3793.7412, 3914.6355, 2107.6692]
2025-09-12 05:45:48,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 240.0, 1000.0, 1000.0, 570.0]
2025-09-12 05:45:48,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (3342.23) for latency ExtremeClogL1U23
2025-09-12 05:45:48,615 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 14 hours, 44 minutes, 56 seconds)
2025-09-12 05:58:31,523 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:58:31,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:01:48,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2769.43604 ± 1314.518
2025-09-12 06:01:48,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1341.8484, 3517.2507, 3605.6887, 3852.3174, 1514.425, 3658.6528, 4038.685, 521.95386, 4195.992, 1447.5453]
2025-09-12 06:01:48,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [349.0, 1000.0, 1000.0, 1000.0, 379.0, 1000.0, 1000.0, 165.0, 1000.0, 370.0]
2025-09-12 06:01:48,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 14 hours, 26 minutes, 22 seconds)
2025-09-12 06:13:50,156 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:13:50,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:16:05,448 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1931.77344 ± 1595.293
2025-09-12 06:16:05,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [41.019608, 3915.7922, 404.2129, 97.31091, 4117.343, 2983.235, 2580.2117, 721.62225, 3694.1511, 762.8363]
2025-09-12 06:16:05,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [25.0, 1000.0, 120.0, 48.0, 1000.0, 762.0, 681.0, 205.0, 924.0, 251.0]
2025-09-12 06:16:05,471 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 13 hours, 52 minutes, 8 seconds)
2025-09-12 06:28:56,424 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:28:56,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:32:32,138 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3038.71826 ± 1249.161
2025-09-12 06:32:32,141 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4250.79, 3812.665, 1954.1221, 3112.021, 761.267, 3724.4116, 1023.4037, 3561.6528, 3831.261, 4355.5894]
2025-09-12 06:32:32,141 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 556.0, 1000.0, 196.0, 1000.0, 261.0, 1000.0, 1000.0, 1000.0]
2025-09-12 06:32:32,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 13 hours, 44 minutes, 22 seconds)
2025-09-12 06:44:15,211 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:44:15,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:46:55,870 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2347.86572 ± 1396.265
2025-09-12 06:46:55,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2911.093, 2507.5608, 730.889, 285.0149, 3885.9814, 538.44824, 4051.6274, 1875.9077, 2472.9402, 4219.1943]
2025-09-12 06:46:55,877 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [713.0, 691.0, 186.0, 90.0, 1000.0, 167.0, 1000.0, 487.0, 604.0, 1000.0]
2025-09-12 06:46:55,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 13 hours, 14 minutes, 46 seconds)
2025-09-12 07:00:00,265 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:00:00,269 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:04:06,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3616.39258 ± 892.451
2025-09-12 07:04:06,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4000.9404, 3620.0117, 4349.639, 3976.3762, 4311.201, 4216.3926, 3647.4692, 4076.7676, 2597.0105, 1368.1162]
2025-09-12 07:04:06,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 856.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 644.0, 459.0]
2025-09-12 07:04:06,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (3616.39) for latency ExtremeClogL1U23
2025-09-12 07:04:06,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 13 hours, 3 minutes)
2025-09-12 07:16:51,169 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:16:51,173 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:21:05,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3260.58301 ± 1078.756
2025-09-12 07:21:05,453 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4202.1577, 4049.3691, 3681.1697, 3916.293, 529.30676, 3627.1978, 3694.2493, 3952.6404, 2601.9905, 2351.4553]
2025-09-12 07:21:05,453 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 682.0, 601.0]
2025-09-12 07:21:05,460 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 12 hours, 57 minutes, 1 second)
2025-09-12 07:33:33,446 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:33:33,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:36:33,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2323.48120 ± 1695.102
2025-09-12 07:36:33,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [176.57664, 4421.5312, 3864.8987, 1714.6973, 4445.057, 3415.286, 891.0443, 441.7622, 3572.669, 291.2934]
2025-09-12 07:36:33,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [64.0, 1000.0, 1000.0, 460.0, 1000.0, 894.0, 1000.0, 138.0, 1000.0, 77.0]
2025-09-12 07:36:33,345 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 12 hours, 52 minutes, 27 seconds)
2025-09-12 07:48:12,487 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:48:12,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:52:31,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3885.84106 ± 554.615
2025-09-12 07:52:31,990 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3915.6338, 4197.2856, 3878.3423, 4082.5984, 3954.099, 4024.579, 4195.9575, 4167.5415, 4184.6084, 2257.7659]
2025-09-12 07:52:31,990 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 923.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 613.0]
2025-09-12 07:52:31,990 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (3885.84) for latency ExtremeClogL1U23
2025-09-12 07:52:32,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 12 hours, 31 minutes, 58 seconds)
2025-09-12 08:05:18,400 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:05:18,405 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:08:19,930 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2381.20801 ± 1631.726
2025-09-12 08:08:19,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1250.6663, 697.5266, 621.5295, 1352.0045, 3712.022, 4350.6787, 3476.6965, 3846.2083, 4411.9165, 92.83114]
2025-09-12 08:08:19,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 166.0, 166.0, 309.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 50.0]
2025-09-12 08:08:19,942 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 12 hours, 28 minutes, 53 seconds)
2025-09-12 08:20:29,440 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:20:29,442 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:24:25,821 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2665.95508 ± 1452.162
2025-09-12 08:24:25,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3477.8953, 2869.919, 2027.7362, 859.2809, 4095.1301, 3849.7173, 1042.8995, 166.19113, 3948.225, 4322.557]
2025-09-12 08:24:25,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 483.0, 1000.0, 1000.0, 1000.0, 236.0, 1000.0, 1000.0, 1000.0]
2025-09-12 08:24:25,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 12 hours, 2 minutes, 52 seconds)
2025-09-12 08:37:32,992 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:37:32,998 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:41:30,976 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3228.93994 ± 1503.400
2025-09-12 08:41:30,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4245.8813, 3981.8523, 417.9246, 268.28622, 4039.2676, 2817.4858, 4460.0073, 4162.2896, 3740.196, 4156.2085]
2025-09-12 08:41:30,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 149.0, 1000.0, 1000.0, 682.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 08:41:30,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 11 hours, 47 minutes, 44 seconds)
2025-09-12 08:53:26,759 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:53:26,768 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:57:14,356 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3516.91553 ± 1104.910
2025-09-12 08:57:14,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2380.3044, 2422.7185, 1037.3226, 4074.8372, 3822.8906, 4202.747, 3977.282, 4334.9087, 4604.4536, 4311.691]
2025-09-12 08:57:14,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [585.0, 578.0, 275.0, 1000.0, 901.0, 1000.0, 1000.0, 1000.0, 1000.0, 974.0]
2025-09-12 08:57:14,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 11 hours, 33 minutes, 52 seconds)
2025-09-12 09:10:23,130 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:10:23,134 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:13:05,049 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2478.22046 ± 1691.664
2025-09-12 09:13:05,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2251.5854, 4255.503, 788.71, 4599.426, 1243.8413, 4215.567, 2910.2178, 124.17978, 244.10301, 4149.0703]
2025-09-12 09:13:05,050 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [506.0, 1000.0, 189.0, 1000.0, 371.0, 1000.0, 751.0, 63.0, 130.0, 1000.0]
2025-09-12 09:13:05,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 11 hours, 16 minutes, 37 seconds)
2025-09-12 09:25:21,215 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:25:21,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:29:02,636 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3334.39526 ± 1421.267
2025-09-12 09:29:02,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [100.58635, 4265.7656, 4073.0186, 3967.2305, 995.8914, 4030.7593, 3557.0173, 3881.9895, 4256.038, 4215.6577]
2025-09-12 09:29:02,638 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [38.0, 1000.0, 1000.0, 1000.0, 266.0, 1000.0, 887.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:29:02,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 11 hours, 1 minute, 50 seconds)
2025-09-12 09:41:38,337 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:41:38,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:44:27,357 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2476.98438 ± 1524.091
2025-09-12 09:44:27,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4368.5264, 4156.3584, 534.2959, 496.28915, 1693.0883, 3722.6982, 3685.4453, 1885.23, 549.8761, 3678.0361]
2025-09-12 09:44:27,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 147.0, 140.0, 444.0, 980.0, 898.0, 460.0, 160.0, 893.0]
2025-09-12 09:44:27,368 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 40 minutes, 12 seconds)
2025-09-12 09:56:34,776 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:56:34,780 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:00:16,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3086.50928 ± 1417.033
2025-09-12 10:00:16,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3838.9011, 4219.5767, 4097.5254, 4341.9175, 1718.064, 2489.2278, 4026.4893, 1774.4279, 40.347054, 4318.615]
2025-09-12 10:00:16,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 512.0, 642.0, 1000.0, 1000.0, 31.0, 1000.0]
2025-09-12 10:00:16,590 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 10 hours, 14 minutes, 19 seconds)
2025-09-12 10:13:06,746 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:13:06,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:16:53,330 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3404.12036 ± 1540.263
2025-09-12 10:16:53,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4117.8335, 4487.5425, 4260.76, 4620.582, 225.04707, 3827.6543, 4137.6157, 4333.054, 3479.159, 551.9525]
2025-09-12 10:16:53,334 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 83.0, 1000.0, 1000.0, 1000.0, 1000.0, 160.0]
2025-09-12 10:16:53,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 10 hours, 5 minutes, 20 seconds)
2025-09-12 10:28:08,550 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:28:08,554 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:31:07,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2250.04150 ± 1890.730
2025-09-12 10:31:07,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [28.159079, 4543.9404, 1197.5096, 4173.5435, 1833.0616, 119.364395, 4248.245, 485.2469, 4933.2974, 938.0472]
2025-09-12 10:31:07,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [37.0, 1000.0, 1000.0, 1000.0, 1000.0, 44.0, 1000.0, 139.0, 1000.0, 283.0]
2025-09-12 10:31:07,034 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 37 minutes, 26 seconds)
2025-09-12 10:44:21,064 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:44:21,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:46:33,137 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 1566.13086 ± 1593.345
2025-09-12 10:46:33,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1942.8038, 51.474632, 4593.85, -13.735017, 2989.5647, -7.875785, 719.2823, 120.33482, 3555.2048, 1710.404]
2025-09-12 10:46:33,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [433.0, 29.0, 1000.0, 27.0, 678.0, 1000.0, 164.0, 57.0, 1000.0, 458.0]
2025-09-12 10:46:33,149 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 18 minutes, 3 seconds)
2025-09-12 10:58:16,819 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:58:16,823 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:02:06,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3570.07886 ± 1000.747
2025-09-12 11:02:06,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2668.684, 3802.9585, 4509.542, 1966.59, 4406.811, 4415.201, 4390.2344, 1732.065, 3851.193, 3957.51]
2025-09-12 11:02:06,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [687.0, 1000.0, 1000.0, 493.0, 1000.0, 1000.0, 994.0, 400.0, 846.0, 1000.0]
2025-09-12 11:02:06,230 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 9 hours, 3 minutes, 32 seconds)
2025-09-12 11:15:08,465 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:15:08,468 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:17:47,707 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2356.86572 ± 1791.504
2025-09-12 11:17:47,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4039.7478, 3781.2317, 4206.3594, 749.51636, 889.7938, 4481.9033, 4114.775, 329.13544, 55.013157, 921.1792]
2025-09-12 11:17:47,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 260.0, 263.0, 1000.0, 1000.0, 84.0, 28.0, 248.0]
2025-09-12 11:17:47,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 8 hours, 47 minutes, 7 seconds)
2025-09-12 11:29:54,241 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:29:54,244 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:33:39,933 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3367.97925 ± 1226.422
2025-09-12 11:33:39,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [45.357788, 2853.0615, 4069.1604, 4403.7646, 3562.6624, 3554.751, 3988.2993, 2814.1006, 4050.056, 4338.58]
2025-09-12 11:33:39,940 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [26.0, 787.0, 1000.0, 1000.0, 1000.0, 779.0, 1000.0, 780.0, 1000.0, 1000.0]
2025-09-12 11:33:39,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 8 hours, 26 minutes, 43 seconds)
2025-09-12 11:46:24,994 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:46:24,998 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:50:09,075 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3504.06836 ± 1059.788
2025-09-12 11:50:09,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4532.5884, 4010.8228, 4246.5063, 3192.4539, 4484.3076, 4160.4463, 1687.4387, 1610.6594, 4259.8467, 2855.6147]
2025-09-12 11:50:09,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 763.0, 1000.0, 1000.0, 419.0, 452.0, 1000.0, 644.0]
2025-09-12 11:50:09,087 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 25 minutes, 49 seconds)
2025-09-12 12:02:48,085 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:02:48,088 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:06:07,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3040.12231 ± 1414.429
2025-09-12 12:06:07,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2839.0208, 4564.062, 229.39667, 4026.7542, 1131.2417, 2739.066, 2248.4417, 3808.756, 4207.7476, 4606.7334]
2025-09-12 12:06:07,375 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [669.0, 1000.0, 82.0, 1000.0, 295.0, 726.0, 560.0, 1000.0, 1000.0, 1000.0]
2025-09-12 12:06:07,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 8 hours, 13 minutes, 20 seconds)
2025-09-12 12:18:01,962 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:18:01,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:21:42,421 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3351.10425 ± 1199.866
2025-09-12 12:21:42,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3472.5525, 3293.5627, 4475.8613, 2735.3638, 138.19748, 4184.09, 3840.1216, 4206.6406, 4147.9546, 3016.6992]
2025-09-12 12:21:42,431 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [849.0, 1000.0, 1000.0, 619.0, 47.0, 1000.0, 1000.0, 1000.0, 881.0, 688.0]
2025-09-12 12:21:42,437 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 7 hours, 57 minutes, 37 seconds)
2025-09-12 12:34:18,546 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:34:18,555 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:37:07,115 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2658.66455 ± 1401.163
2025-09-12 12:37:07,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [391.7321, 3660.1362, 4637.6445, 1922.5808, 4458.6772, 4221.9497, 2413.901, 1803.7483, 1385.4628, 1690.8099]
2025-09-12 12:37:07,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [205.0, 829.0, 1000.0, 445.0, 1000.0, 1000.0, 598.0, 450.0, 308.0, 391.0]
2025-09-12 12:37:07,126 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 7 hours, 40 minutes, 4 seconds)
2025-09-12 12:49:08,939 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:49:08,950 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:52:43,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3348.32373 ± 1284.202
2025-09-12 12:52:43,307 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2513.6375, 327.19598, 4514.4434, 3983.776, 3802.5796, 3530.8958, 4482.486, 1955.8485, 4318.821, 4053.553]
2025-09-12 12:52:43,308 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [575.0, 102.0, 930.0, 1000.0, 864.0, 1000.0, 1000.0, 399.0, 1000.0, 1000.0]
2025-09-12 12:52:43,314 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 22 minutes, 42 seconds)
2025-09-12 13:05:45,645 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:05:45,655 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:09:46,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3352.80469 ± 856.006
2025-09-12 13:09:46,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2362.2056, 4002.9458, 2186.069, 4370.093, 3970.7898, 2356.5713, 4175.2583, 3935.4302, 2379.738, 3788.9436]
2025-09-12 13:09:46,064 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [629.0, 1000.0, 1000.0, 1000.0, 1000.0, 616.0, 1000.0, 1000.0, 565.0, 1000.0]
2025-09-12 13:09:46,071 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 7 hours, 9 minutes, 55 seconds)
2025-09-12 13:22:16,439 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:22:16,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:26:19,535 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3600.58838 ± 1199.130
2025-09-12 13:26:19,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4280.8496, 3883.5554, 4574.2134, 3337.09, 172.09854, 3620.0652, 3669.1443, 3933.85, 4438.805, 4096.209]
2025-09-12 13:26:19,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 904.0, 73.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:26:19,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 6 hours, 57 minutes, 3 seconds)
2025-09-12 13:39:07,202 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:39:07,206 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:42:45,761 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3372.70703 ± 1622.076
2025-09-12 13:42:45,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [224.12854, 4089.9568, 4217.6035, 3840.6655, 83.4666, 4461.732, 4232.9473, 3968.5525, 4544.554, 4063.4592]
2025-09-12 13:42:45,782 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [69.0, 1000.0, 1000.0, 964.0, 45.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:42:45,797 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 6 hours, 45 minutes, 16 seconds)
2025-09-12 13:54:50,473 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:54:50,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:59:01,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3898.86841 ± 1063.909
2025-09-12 13:59:01,079 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4697.903, 4290.1484, 3667.9578, 810.1643, 4557.5894, 4001.8408, 4252.518, 4313.102, 4117.9893, 4279.4727]
2025-09-12 13:59:01,079 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 181.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:59:01,079 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (3898.87) for latency ExtremeClogL1U23
2025-09-12 13:59:01,086 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 33 minutes, 7 seconds)
2025-09-12 14:11:57,373 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:11:57,376 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:15:52,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3272.30176 ± 1576.995
2025-09-12 14:15:52,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4168.041, 4393.241, 460.89392, 225.44759, 4425.3687, 3858.77, 2337.8157, 4309.703, 4300.955, 4242.781]
2025-09-12 14:15:52,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 148.0, 1000.0, 1000.0, 617.0, 988.0, 996.0, 1000.0]
2025-09-12 14:15:52,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 6 hours, 22 minutes, 31 seconds)
2025-09-12 14:27:26,573 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:27:26,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:30:22,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2626.67554 ± 1252.802
2025-09-12 14:30:22,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2830.2979, 4112.919, 1905.1731, 680.53937, 3332.919, 2062.572, 2853.4294, 3897.4695, 502.072, 4089.3633]
2025-09-12 14:30:22,808 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [653.0, 1000.0, 486.0, 176.0, 1000.0, 618.0, 591.0, 854.0, 126.0, 1000.0]
2025-09-12 14:30:22,818 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 54 minutes, 41 seconds)
2025-09-12 14:43:10,983 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:43:10,987 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:47:05,850 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3725.60278 ± 903.216
2025-09-12 14:47:05,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4122.7983, 2883.3467, 4583.3086, 4669.2725, 4447.8936, 3888.008, 3525.3533, 1488.7101, 3660.0342, 3987.3022]
2025-09-12 14:47:05,855 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [999.0, 668.0, 1000.0, 1000.0, 1000.0, 1000.0, 823.0, 338.0, 873.0, 1000.0]
2025-09-12 14:47:05,864 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 39 minutes, 14 seconds)
2025-09-12 14:58:55,828 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:58:55,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:01:52,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2677.19995 ± 1793.595
2025-09-12 15:01:52,741 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [622.5934, 1084.4186, 3989.7007, 3957.6406, 4339.947, 132.08197, 181.42178, 4019.6514, 4248.223, 4196.323]
2025-09-12 15:01:52,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [203.0, 246.0, 1000.0, 1000.0, 1000.0, 78.0, 58.0, 1000.0, 976.0, 1000.0]
2025-09-12 15:01:52,751 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 16 minutes, 27 seconds)
2025-09-12 15:14:13,829 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:14:13,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:18:15,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3800.23364 ± 786.658
2025-09-12 15:18:15,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4512.0166, 4179.615, 2836.9312, 2956.1035, 4288.979, 2154.9026, 4458.8325, 4118.1514, 4334.1763, 4162.628]
2025-09-12 15:18:15,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 641.0, 698.0, 1000.0, 511.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:18:15,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 5 hours, 1 minute, 7 seconds)
2025-09-12 15:31:48,799 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:31:48,803 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:35:40,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3083.43945 ± 1386.610
2025-09-12 15:35:40,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1010.51355, 375.56744, 3961.3132, 2943.9863, 4656.492, 3204.5242, 4203.795, 2194.1455, 4018.0007, 4266.0566]
2025-09-12 15:35:40,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [265.0, 1000.0, 1000.0, 692.0, 1000.0, 1000.0, 1000.0, 570.0, 1000.0, 1000.0]
2025-09-12 15:35:40,153 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 47 minutes, 14 seconds)
2025-09-12 15:47:28,377 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:47:28,381 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:51:00,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3310.35791 ± 1643.740
2025-09-12 15:51:00,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [66.18514, 4258.11, 3875.9932, 4872.974, 4171.446, 464.2841, 4680.1387, 4021.0781, 2449.8772, 4243.4907]
2025-09-12 15:51:00,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [34.0, 1000.0, 1000.0, 1000.0, 1000.0, 123.0, 1000.0, 1000.0, 534.0, 1000.0]
2025-09-12 15:51:00,354 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 34 minutes, 7 seconds)
2025-09-12 16:03:29,757 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:03:29,762 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:07:27,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3776.64697 ± 841.818
2025-09-12 16:07:27,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3564.6182, 4212.7188, 4518.674, 4406.3906, 4640.5347, 3113.6782, 2788.4058, 4199.4097, 4345.3154, 1976.7233]
2025-09-12 16:07:27,228 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 686.0, 664.0, 1000.0, 1000.0, 446.0]
2025-09-12 16:07:27,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 17 minutes, 8 seconds)
2025-09-12 16:20:06,810 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:20:06,814 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:23:19,072 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3034.76196 ± 1850.063
2025-09-12 16:23:19,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4283.275, 4437.814, 153.69951, 4632.6875, 135.91525, 3809.4312, 441.26175, 4582.303, 3713.7043, 4157.5273]
2025-09-12 16:23:19,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 66.0, 1000.0, 49.0, 1000.0, 121.0, 1000.0, 860.0, 1000.0]
2025-09-12 16:23:19,090 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 4 hours, 4 minutes, 19 seconds)
2025-09-12 16:36:17,514 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:36:17,532 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:39:23,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2877.96802 ± 1486.519
2025-09-12 16:39:23,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4245.6196, 3965.9167, 2346.8918, 376.73553, 3951.4128, 4258.202, 374.27524, 2907.2385, 1951.9602, 4401.425]
2025-09-12 16:39:23,567 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 518.0, 178.0, 1000.0, 1000.0, 101.0, 662.0, 435.0, 1000.0]
2025-09-12 16:39:23,574 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 47 minutes, 9 seconds)
2025-09-12 16:51:10,723 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:51:10,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:54:32,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3136.66650 ± 1600.668
2025-09-12 16:54:32,516 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [136.68254, 3212.2634, 4304.1216, 4309.61, 1851.5035, 4313.365, 550.329, 4665.033, 4388.5117, 3635.2446]
2025-09-12 16:54:32,516 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [53.0, 786.0, 1000.0, 1000.0, 459.0, 1000.0, 128.0, 1000.0, 1000.0, 1000.0]
2025-09-12 16:54:32,523 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 25 minutes, 4 seconds)
2025-09-12 17:07:18,746 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:07:18,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:10:30,054 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2899.72925 ± 1664.077
2025-09-12 17:10:30,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4038.9631, 777.8595, 2674.5466, 4208.829, 4530.552, 3005.1501, 4506.5176, 806.19324, 4412.8584, 35.819565]
2025-09-12 17:10:30,061 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 191.0, 619.0, 1000.0, 1000.0, 1000.0, 1000.0, 206.0, 1000.0, 24.0]
2025-09-12 17:10:30,085 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 10 minutes, 47 seconds)
2025-09-12 17:22:53,613 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:22:53,622 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:26:26,170 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3032.49902 ± 1839.897
2025-09-12 17:26:26,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [833.6277, 4628.241, 4550.976, 213.3529, 1021.6994, 1158.5248, 4732.2466, 4542.828, 4611.3687, 4032.1255]
2025-09-12 17:26:26,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [238.0, 1000.0, 1000.0, 1000.0, 246.0, 287.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:26:26,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 53 minutes, 45 seconds)
2025-09-12 17:38:46,758 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:38:46,766 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:43:12,056 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3419.99878 ± 1427.233
2025-09-12 17:43:12,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1475.4681, 4057.936, 4146.896, 4511.807, 4274.4165, -78.04271, 4088.5662, 4119.7437, 4167.0796, 3436.119]
2025-09-12 17:43:12,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 842.0]
2025-09-12 17:43:12,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 39 minutes, 45 seconds)
2025-09-12 17:54:55,025 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:54:55,029 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:58:47,978 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3571.61328 ± 1264.108
2025-09-12 17:58:47,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1396.644, 1076.7485, 3808.0063, 4190.892, 4804.5103, 4521.674, 3014.7146, 4679.6973, 3967.4092, 4255.838]
2025-09-12 17:58:47,999 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [317.0, 338.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:58:48,024 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 22 minutes, 56 seconds)
2025-09-12 18:11:54,373 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:11:54,377 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:15:55,943 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3848.05029 ± 1111.884
2025-09-12 18:15:55,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4202.445, 4763.644, 4312.0913, 4400.4976, 4209.4995, 4540.528, 1601.9851, 4496.5493, 4258.0693, 1695.1969]
2025-09-12 18:15:55,947 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 419.0, 1000.0, 1000.0, 509.0]
2025-09-12 18:15:55,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 10 minutes, 13 seconds)
2025-09-12 18:27:50,745 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:27:50,749 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:31:39,133 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3625.04761 ± 1313.009
2025-09-12 18:31:39,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1881.1044, 4156.087, 421.77908, 3950.5933, 4158.1143, 4271.363, 4541.8853, 3618.902, 4680.382, 4570.2637]
2025-09-12 18:31:39,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [511.0, 1000.0, 163.0, 1000.0, 910.0, 1000.0, 1000.0, 876.0, 997.0, 1000.0]
2025-09-12 18:31:39,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 53 minutes, 36 seconds)
2025-09-12 18:44:11,643 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:44:11,646 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:47:58,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3671.72314 ± 1129.071
2025-09-12 18:47:58,686 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4831.084, 4087.3196, 3081.3193, 4056.8281, 4540.066, 1961.5272, 4021.3904, 4388.7373, 1270.1201, 4478.841]
2025-09-12 18:47:58,687 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 673.0, 921.0, 1000.0, 502.0, 1000.0, 1000.0, 280.0, 1000.0]
2025-09-12 18:47:58,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 37 minutes, 50 seconds)
2025-09-12 19:00:55,222 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:00:55,226 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:05:08,336 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3395.79761 ± 1153.809
2025-09-12 19:05:08,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [2153.5586, 2773.4128, 4270.208, 4466.011, 3625.1917, 4639.9824, 4036.2837, 3891.2998, 3392.4773, 709.5523]
2025-09-12 19:05:08,340 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 706.0, 1000.0, 1000.0, 812.0, 995.0, 1000.0, 1000.0, 733.0, 1000.0]
2025-09-12 19:05:08,348 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 21 minutes, 56 seconds)
2025-09-12 19:17:26,341 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:17:26,346 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:20:36,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2975.48755 ± 1609.551
2025-09-12 19:20:36,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [51.522133, 2785.8977, 3394.6782, 4360.545, 740.46027, 4028.9128, 4571.0366, 1230.5127, 4446.445, 4144.8647]
2025-09-12 19:20:36,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [34.0, 622.0, 756.0, 1000.0, 218.0, 1000.0, 1000.0, 270.0, 1000.0, 1000.0]
2025-09-12 19:20:36,754 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 5 minutes, 26 seconds)
2025-09-12 19:33:01,956 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:33:01,962 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:35:50,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 2725.54736 ± 1728.347
2025-09-12 19:35:50,974 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1610.253, 45.8929, 4600.9136, 1447.9143, 2670.3428, 4501.7144, 4733.389, 4787.992, 565.2134, 2291.849]
2025-09-12 19:35:50,974 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [380.0, 26.0, 1000.0, 394.0, 651.0, 1000.0, 1000.0, 1000.0, 172.0, 525.0]
2025-09-12 19:35:50,985 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 47 minutes, 57 seconds)
2025-09-12 19:48:25,093 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:48:25,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:52:02,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3453.21997 ± 1594.372
2025-09-12 19:52:02,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [1822.6562, 4776.973, 155.87515, 4634.0566, 4445.6133, 4258.491, 4392.308, 4649.231, 4050.8018, 1346.1917]
2025-09-12 19:52:02,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [369.0, 1000.0, 67.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 300.0]
2025-09-12 19:52:02,675 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 32 minutes, 9 seconds)
2025-09-12 20:04:14,970 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:04:14,980 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:08:28,598 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3939.91162 ± 752.947
2025-09-12 20:08:28,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [4260.5063, 4040.1104, 1970.141, 4864.1943, 3325.7517, 3925.3152, 4367.4873, 4076.09, 4337.534, 4231.9824]
2025-09-12 20:08:28,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 427.0, 1000.0, 747.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 20:08:28,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1226 [INFO]: New best (3939.91) for latency ExtremeClogL1U23
2025-09-12 20:08:28,618 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 16 minutes, 5 seconds)
2025-09-12 20:20:39,258 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:20:39,261 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:24:30,095 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1221 [DEBUG]: Total Reward: 3453.84448 ± 1477.506
2025-09-12 20:24:30,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1222 [DEBUG]: All rewards: [3918.8052, 4120.036, 4090.6123, 4496.7446, 4285.5713, 855.4252, 219.29303, 4181.7544, 4478.297, 3891.9053]
2025-09-12 20:24:30,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 189.0, 77.0, 1000.0, 1000.0, 1000.0]
2025-09-12 20:24:30,138 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc5-ant):1251 [DEBUG]: Training session finished
