2025-09-11 17:56:39,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc0-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 17:56:39,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc0-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 17:56:39,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x1496cc000e10>}
2025-09-11 17:56:39,945 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1111 [DEBUG]: using device: cuda
2025-09-11 17:56:39,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1133 [INFO]: Creating new trainer
2025-09-11 17:56:39,977 baseline-mbpac-noiseperc0-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 17:56:39,978 baseline-mbpac-noiseperc0-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 17:56:39,988 baseline-mbpac-noiseperc0-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 17:56:41,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1194 [DEBUG]: Starting training session...
2025-09-11 17:56:41,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:08:26,602 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:08:26,609 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:12:45,726 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: -694.50793 ± 186.209
2025-09-11 18:12:45,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [-164.71779, -799.43097, -621.82135, -724.6751, -756.57275, -790.46765, -803.77246, -815.9276, -678.9736, -788.71985]
2025-09-11 18:12:45,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [228.0, 1000.0, 807.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 18:12:45,728 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (-694.51) for latency ExtremeClogL1U23
2025-09-11 18:12:45,734 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 26 hours, 31 minutes, 39 seconds)
2025-09-11 18:25:18,517 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:25:18,525 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:28:00,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 124.51540 ± 103.706
2025-09-11 18:28:00,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [213.16743, 289.8148, 212.0617, 46.353344, 84.07454, 18.300364, 38.927883, -43.85491, 210.16693, 176.14194]
2025-09-11 18:28:00,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [953.0, 1000.0, 1000.0, 107.0, 232.0, 62.0, 180.0, 184.0, 1000.0, 1000.0]
2025-09-11 18:28:00,282 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (124.52) for latency ExtremeClogL1U23
2025-09-11 18:28:00,285 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 25 hours, 34 minutes, 40 seconds)
2025-09-11 18:41:10,984 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:41:11,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:45:22,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 482.44904 ± 110.577
2025-09-11 18:45:22,480 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [502.5092, 507.01697, 491.3713, 587.29095, 287.75668, 540.4766, 502.3644, 272.37396, 495.21768, 638.11255]
2025-09-11 18:45:22,480 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 457.0, 1000.0, 1000.0, 506.0, 916.0, 1000.0]
2025-09-11 18:45:22,480 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (482.45) for latency ExtremeClogL1U23
2025-09-11 18:45:22,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 26 hours, 14 minutes, 18 seconds)
2025-09-11 18:58:07,506 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:58:07,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:02:36,201 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 589.95081 ± 123.597
2025-09-11 19:02:36,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [408.38947, 674.24817, 474.39667, 643.3886, 463.336, 771.1528, 518.3076, 506.84744, 688.90717, 750.5342]
2025-09-11 19:02:36,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [696.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 698.0, 1000.0, 1000.0]
2025-09-11 19:02:36,204 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (589.95) for latency ExtremeClogL1U23
2025-09-11 19:02:36,212 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 26 hours, 22 minutes, 2 seconds)
2025-09-11 19:14:32,966 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:14:32,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:19:03,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 659.30457 ± 96.455
2025-09-11 19:19:03,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [546.9645, 717.99677, 716.6422, 810.64856, 648.394, 727.7952, 456.86374, 673.2791, 694.65955, 599.80206]
2025-09-11 19:19:03,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 987.0, 1000.0, 1000.0, 1000.0, 1000.0, 502.0, 1000.0, 1000.0, 1000.0]
2025-09-11 19:19:03,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (659.30) for latency ExtremeClogL1U23
2025-09-11 19:19:03,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 26 hours, 5 minutes, 12 seconds)
2025-09-11 19:32:38,525 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:32:38,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:36:55,713 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 661.42401 ± 256.131
2025-09-11 19:36:55,715 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [970.9786, 591.7597, 855.48145, 805.7236, 801.36127, 180.44029, 277.94962, 669.1709, 519.454, 941.9211]
2025-09-11 19:36:55,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 190.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 19:36:55,716 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (661.42) for latency ExtremeClogL1U23
2025-09-11 19:36:55,722 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 26 hours, 22 minutes, 19 seconds)
2025-09-11 19:49:06,184 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:49:06,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:53:19,287 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 645.33606 ± 293.792
2025-09-11 19:53:19,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [471.96198, 211.73125, 901.3722, 992.4676, 176.56378, 571.60156, 951.0281, 467.3226, 957.26807, 752.04346]
2025-09-11 19:53:19,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 257.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 19:53:19,295 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 26 hours, 26 minutes, 53 seconds)
2025-09-11 20:04:52,726 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:04:52,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:08:21,575 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 785.24976 ± 398.584
2025-09-11 20:08:21,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [197.60037, 694.0456, 1059.2977, 669.55457, 196.63802, 367.5805, 1039.1428, 1170.3351, 1132.2148, 1326.088]
2025-09-11 20:08:21,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [219.0, 1000.0, 881.0, 1000.0, 172.0, 308.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 20:08:21,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (785.25) for latency ExtremeClogL1U23
2025-09-11 20:08:21,583 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 25 hours, 26 minutes, 55 seconds)
2025-09-11 20:21:51,702 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:21:51,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:25:41,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 670.87122 ± 410.698
2025-09-11 20:25:41,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [99.0863, 367.19342, 558.8786, 699.2552, 1445.9835, 441.68637, 516.0114, 1340.8806, 372.07632, 867.6603]
2025-09-11 20:25:41,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [84.0, 270.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 20:25:41,021 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 25 hours, 12 minutes, 3 seconds)
2025-09-11 20:38:07,095 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:38:07,099 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:42:00,255 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1054.92310 ± 519.359
2025-09-11 20:42:00,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [485.5326, 1707.859, 770.93854, 328.59558, 812.8568, 1395.9438, 1653.618, 1537.8617, 408.47287, 1447.5531]
2025-09-11 20:42:00,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 980.0, 1000.0, 228.0, 1000.0, 1000.0, 1000.0, 1000.0, 278.0, 1000.0]
2025-09-11 20:42:00,256 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (1054.92) for latency ExtremeClogL1U23
2025-09-11 20:42:00,262 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 24 hours, 52 minutes, 55 seconds)
2025-09-11 20:53:15,191 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:53:15,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:55:48,624 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 827.99084 ± 552.143
2025-09-11 20:55:48,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1022.48627, 871.32086, 1619.4001, 1722.0005, 136.92215, 183.58519, 946.5743, 639.44666, 65.687386, 1072.4846]
2025-09-11 20:55:48,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [554.0, 1000.0, 1000.0, 1000.0, 79.0, 152.0, 434.0, 345.0, 53.0, 1000.0]
2025-09-11 20:55:48,639 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 23 hours, 24 minutes, 5 seconds)
2025-09-11 21:09:11,794 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:09:11,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:12:28,648 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 797.86432 ± 519.115
2025-09-11 21:12:28,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [97.655945, 1142.3373, 696.60095, 1046.7657, 759.96796, 424.4151, 2057.357, 669.7199, 805.5202, 278.30338]
2025-09-11 21:12:28,650 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [62.0, 1000.0, 1000.0, 687.0, 1000.0, 244.0, 1000.0, 1000.0, 1000.0, 169.0]
2025-09-11 21:12:28,655 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 23 hours, 13 minutes, 8 seconds)
2025-09-11 21:24:48,048 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:24:48,052 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:28:11,872 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1005.74200 ± 682.594
2025-09-11 21:28:11,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [915.0766, 264.90887, 161.25548, 700.83453, 425.55972, 1295.4059, 552.06555, 2067.3035, 2119.6887, 1555.3218]
2025-09-11 21:28:11,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 103.0, 60.0, 1000.0, 1000.0, 1000.0, 278.0, 1000.0, 981.0, 1000.0]
2025-09-11 21:28:11,879 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 23 hours, 9 minutes, 11 seconds)
2025-09-11 21:40:41,034 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:40:41,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:45:02,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1803.97620 ± 679.309
2025-09-11 21:45:02,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [851.667, 2229.0217, 2234.817, 2339.071, 1996.3942, 2277.321, 418.9697, 2273.8206, 2273.7485, 1144.9326]
2025-09-11 21:45:02,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 526.0]
2025-09-11 21:45:02,692 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (1803.98) for latency ExtremeClogL1U23
2025-09-11 21:45:02,698 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 22 hours, 45 minutes)
2025-09-11 21:57:40,033 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:57:40,037 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:01:18,155 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1380.58228 ± 683.176
2025-09-11 22:01:18,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [407.93906, 725.96906, 1010.0877, 546.488, 2259.6755, 1640.7461, 995.6601, 2131.1682, 1927.0813, 2161.0068]
2025-09-11 22:01:18,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [178.0, 409.0, 1000.0, 283.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-11 22:01:18,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 22 hours, 28 minutes, 4 seconds)
2025-09-11 22:12:56,894 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:12:56,896 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:16:10,096 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1446.72498 ± 915.045
2025-09-11 22:16:10,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1354.9899, 2456.5325, 673.4625, 698.64355, 2478.6155, 278.28104, 377.29807, 1045.6724, 2480.87, 2622.8833]
2025-09-11 22:16:10,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 272.0, 1000.0, 140.0, 146.0, 405.0, 1000.0, 1000.0]
2025-09-11 22:16:10,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 22 hours, 30 minutes)
2025-09-11 22:29:14,344 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:29:14,347 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:32:41,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1896.24829 ± 764.282
2025-09-11 22:32:41,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [906.62036, 2690.9822, 712.95746, 2390.2654, 2510.168, 2327.6892, 1233.1926, 1064.1722, 2537.3152, 2589.12]
2025-09-11 22:32:41,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [343.0, 1000.0, 281.0, 1000.0, 1000.0, 1000.0, 512.0, 444.0, 1000.0, 1000.0]
2025-09-11 22:32:41,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (1896.25) for latency ExtremeClogL1U23
2025-09-11 22:32:41,749 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 22 hours, 11 minutes, 37 seconds)
2025-09-11 22:44:45,871 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:44:45,874 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:49:06,853 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2473.58398 ± 455.784
2025-09-11 22:49:06,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2540.362, 2691.143, 2520.997, 2918.734, 2794.9355, 2973.5908, 2005.5577, 2516.5874, 2417.4272, 1356.5045]
2025-09-11 22:49:06,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 477.0]
2025-09-11 22:49:06,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (2473.58) for latency ExtremeClogL1U23
2025-09-11 22:49:06,858 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 22 hours, 7 minutes, 1 second)
2025-09-11 23:02:12,462 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:02:12,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:05:18,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1493.18384 ± 825.692
2025-09-11 23:05:18,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1039.6241, 1477.7157, 1736.9899, 434.81128, 2925.9429, 1695.0137, 873.6144, 187.67332, 2356.043, 2204.4097]
2025-09-11 23:05:18,796 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [305.0, 1000.0, 607.0, 1000.0, 1000.0, 524.0, 296.0, 82.0, 1000.0, 1000.0]
2025-09-11 23:05:18,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 21 hours, 40 minutes, 20 seconds)
2025-09-11 23:16:53,424 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:16:53,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:20:45,565 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2077.00757 ± 1006.858
2025-09-11 23:20:45,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [71.75311, 2335.5713, 2857.6663, 2529.4077, 2910.5642, 3004.295, 1832.3862, 378.71503, 2922.3135, 1927.404]
2025-09-11 23:20:45,580 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [52.0, 1000.0, 1000.0, 824.0, 1000.0, 1000.0, 598.0, 1000.0, 1000.0, 1000.0]
2025-09-11 23:20:45,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 21 hours, 11 minutes, 18 seconds)
2025-09-11 23:33:38,663 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:33:38,668 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:36:53,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2180.32593 ± 753.343
2025-09-11 23:36:53,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1282.2542, 2473.999, 1980.5388, 3145.3464, 2167.439, 3094.1006, 702.82367, 2069.1204, 1877.2874, 3010.3518]
2025-09-11 23:36:53,275 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [448.0, 832.0, 629.0, 1000.0, 701.0, 1000.0, 210.0, 695.0, 616.0, 1000.0]
2025-09-11 23:36:53,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 21 hours, 15 minutes, 22 seconds)
2025-09-11 23:49:07,097 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:49:07,100 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:52:55,795 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2014.93030 ± 1162.827
2025-09-11 23:52:55,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1759.3519, 2968.819, 3373.7676, 1215.8895, 471.38205, 1211.137, 3141.0312, 2895.924, 44.80455, 3067.1958]
2025-09-11 23:52:55,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 405.0, 1000.0, 1000.0, 1000.0, 964.0, 26.0, 1000.0]
2025-09-11 23:52:55,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 20 hours, 51 minutes, 39 seconds)
2025-09-12 00:05:50,666 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:05:50,671 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:09:59,652 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2404.35596 ± 937.744
2025-09-12 00:09:59,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2736.8628, 886.42816, 1448.5991, 3083.149, 3419.4932, 2672.7007, 2105.2693, 3334.6614, 3370.8882, 985.50867]
2025-09-12 00:09:59,653 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 237.0, 1000.0, 858.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 00:09:59,656 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 20 hours, 45 minutes, 33 seconds)
2025-09-12 00:21:46,900 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:21:46,904 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:23:50,180 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 1240.03455 ± 1082.623
2025-09-12 00:23:50,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [779.2414, 93.591835, 3235.3228, 3090.647, 1850.5515, 474.2502, 34.42125, 699.3436, 956.7913, 1186.185]
2025-09-12 00:23:50,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [209.0, 65.0, 1000.0, 956.0, 514.0, 182.0, 26.0, 225.0, 292.0, 1000.0]
2025-09-12 00:23:50,222 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 19 hours, 53 minutes, 33 seconds)
2025-09-12 00:36:07,005 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:36:07,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:40:05,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2652.12598 ± 895.278
2025-09-12 00:40:05,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1363.7709, 3513.4885, 3473.587, 2369.3325, 3120.95, 629.9667, 2872.2776, 3025.7983, 3042.6926, 3109.3945]
2025-09-12 00:40:05,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [424.0, 1000.0, 1000.0, 1000.0, 1000.0, 176.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 00:40:05,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (2652.13) for latency ExtremeClogL1U23
2025-09-12 00:40:05,459 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 19 hours, 49 minutes, 57 seconds)
2025-09-12 00:52:58,307 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:52:58,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:56:32,140 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2646.31787 ± 960.940
2025-09-12 00:56:32,147 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3318.406, 2585.5913, 3486.541, 3507.7856, 1003.1832, 3239.183, 1030.0371, 1749.1522, 3071.8577, 3471.4434]
2025-09-12 00:56:32,147 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 762.0, 1000.0, 943.0, 314.0, 916.0, 434.0, 411.0, 1000.0, 1000.0]
2025-09-12 00:56:32,161 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 19 hours, 38 minutes, 47 seconds)
2025-09-12 01:08:23,690 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:08:23,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:11:48,303 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2489.09814 ± 1550.098
2025-09-12 01:11:48,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3830.0767, 581.37463, 3197.5776, 3663.7017, 3763.1616, 4034.0989, 520.2118, 3966.2917, 639.27136, 695.21405]
2025-09-12 01:11:48,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 902.0, 1000.0, 1000.0, 1000.0, 158.0, 1000.0, 178.0, 254.0]
2025-09-12 01:11:48,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 19 hours, 11 minutes, 34 seconds)
2025-09-12 01:25:18,418 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:25:18,423 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:29:25,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2848.27783 ± 1157.593
2025-09-12 01:29:25,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3605.6113, 3641.8755, 221.65814, 3462.7627, 3310.4885, 3713.3108, 3010.239, 3423.1262, 3123.1853, 970.52045]
2025-09-12 01:29:25,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 84.0, 1000.0, 1000.0, 1000.0, 850.0, 1000.0, 1000.0, 1000.0]
2025-09-12 01:29:25,986 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (2848.28) for latency ExtremeClogL1U23
2025-09-12 01:29:25,991 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 19 hours, 3 minutes, 55 seconds)
2025-09-12 01:41:40,965 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:41:40,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:45:44,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3515.82031 ± 948.855
2025-09-12 01:45:44,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4043.2478, 3940.1028, 1382.7262, 3412.4163, 4540.8555, 2090.5361, 4101.6284, 3562.634, 4080.7693, 4003.2842]
2025-09-12 01:45:44,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 356.0, 1000.0, 1000.0, 511.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 01:45:44,462 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (3515.82) for latency ExtremeClogL1U23
2025-09-12 01:45:44,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 19 hours, 23 minutes, 2 seconds)
2025-09-12 01:58:21,116 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:58:21,120 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:02:00,742 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2583.13403 ± 1302.939
2025-09-12 02:02:00,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [434.36783, 4008.4395, 3857.7546, 3338.685, 708.1523, 2204.3833, 3022.4492, 3590.6091, 1080.1753, 3586.3242]
2025-09-12 02:02:00,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 203.0, 733.0, 784.0, 1000.0, 299.0, 1000.0]
2025-09-12 02:02:00,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 19 hours, 6 minutes, 54 seconds)
2025-09-12 02:14:06,183 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:14:06,187 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:17:56,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3090.99658 ± 1144.029
2025-09-12 02:17:56,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1935.4344, 3933.1577, 3336.6335, 3526.7427, 2595.6155, 3417.3792, 247.13626, 4058.568, 3855.7866, 4003.512]
2025-09-12 02:17:56,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [576.0, 1000.0, 1000.0, 1000.0, 785.0, 847.0, 80.0, 987.0, 1000.0, 1000.0]
2025-09-12 02:17:56,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 18 hours, 43 minutes, 26 seconds)
2025-09-12 02:30:27,753 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:30:27,769 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:34:34,846 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3064.55493 ± 1340.799
2025-09-12 02:34:34,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2848.3513, 1030.2421, 3919.23, 4000.0117, 3049.5195, 4283.045, 3973.2998, 3676.7498, 3795.1123, 69.98606]
2025-09-12 02:34:34,856 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 42.0]
2025-09-12 02:34:34,867 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 18 hours, 45 minutes, 45 seconds)
2025-09-12 02:47:23,274 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:47:23,280 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:51:46,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3917.71338 ± 454.178
2025-09-12 02:51:46,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4077.6233, 4065.1418, 3992.458, 3482.6555, 3419.0833, 4662.6426, 4178.5576, 3063.0344, 4380.969, 3854.9678]
2025-09-12 02:51:46,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 838.0, 1000.0, 1000.0, 798.0, 1000.0, 1000.0]
2025-09-12 02:51:46,032 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (3917.71) for latency ExtremeClogL1U23
2025-09-12 02:51:46,039 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 18 hours, 23 minutes, 16 seconds)
2025-09-12 03:03:51,443 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:03:51,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:07:44,432 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3001.22070 ± 1378.498
2025-09-12 03:07:44,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4443.428, 3794.061, 2567.5469, 3646.2603, 4031.1987, 2534.181, 3802.9114, 744.38025, 291.8844, 4156.353]
2025-09-12 03:07:44,444 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 652.0, 1000.0, 1000.0, 673.0, 1000.0, 243.0, 1000.0, 1000.0]
2025-09-12 03:07:44,449 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 18 hours, 2 minutes, 23 seconds)
2025-09-12 03:20:09,334 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:20:09,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:24:31,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3483.94653 ± 1041.985
2025-09-12 03:24:31,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2857.3252, 4185.173, 603.1042, 3998.3247, 4469.572, 3640.4138, 3702.0469, 3995.5327, 3605.52, 3782.4526]
2025-09-12 03:24:31,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [781.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 869.0, 1000.0]
2025-09-12 03:24:31,988 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 17 hours, 52 minutes, 46 seconds)
2025-09-12 03:36:43,607 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:36:43,626 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:40:49,631 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3436.81982 ± 1327.438
2025-09-12 03:40:49,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [75.58363, 3944.4019, 4146.319, 4590.84, 3464.7288, 3985.9475, 1944.7367, 3500.053, 4382.2617, 4333.3276]
2025-09-12 03:40:49,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [30.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 03:40:49,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 17 hours, 40 minutes, 53 seconds)
2025-09-12 03:53:43,161 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:53:43,165 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:57:21,236 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2572.33301 ± 1484.452
2025-09-12 03:57:21,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3863.115, 1842.5739, 934.7238, 364.3865, 4190.57, 3912.4666, 4132.287, 3456.478, 2610.606, 416.12436]
2025-09-12 03:57:21,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 499.0, 281.0, 108.0, 1000.0, 1000.0, 990.0, 1000.0, 1000.0, 1000.0]
2025-09-12 03:57:21,264 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 17 hours, 22 minutes, 56 seconds)
2025-09-12 04:08:50,725 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:08:50,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:13:16,337 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3433.34814 ± 1296.760
2025-09-12 04:13:16,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4254.815, 2439.6858, 4806.0044, 3994.035, 2690.222, 4445.0596, 352.42566, 4362.0586, 2781.4082, 4207.769]
2025-09-12 04:13:16,344 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 687.0, 1000.0]
2025-09-12 04:13:16,352 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 16 hours, 50 minutes, 39 seconds)
2025-09-12 04:26:56,103 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:26:56,108 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:30:22,694 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2945.02588 ± 1291.856
2025-09-12 04:30:22,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3871.496, 3193.7043, 4318.0073, 1286.9034, 3689.7075, 1222.1705, 4057.9944, 2593.6685, 4343.6113, 872.99567]
2025-09-12 04:30:22,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [859.0, 882.0, 931.0, 309.0, 1000.0, 291.0, 1000.0, 1000.0, 1000.0, 237.0]
2025-09-12 04:30:22,704 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 16 hours, 48 minutes, 10 seconds)
2025-09-12 04:42:02,684 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:42:02,706 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:46:40,666 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3968.15112 ± 1225.472
2025-09-12 04:46:40,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4598.6006, 4508.1484, 3970.289, 4819.613, 4633.157, 4089.5266, 427.68307, 3956.5864, 3928.617, 4749.2886]
2025-09-12 04:46:40,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 04:46:40,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (3968.15) for latency ExtremeClogL1U23
2025-09-12 04:46:40,696 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 16 hours, 25 minutes, 44 seconds)
2025-09-12 04:58:54,972 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:58:54,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:03:33,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4220.35840 ± 215.289
2025-09-12 05:03:33,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4059.8347, 4576.2183, 4138.355, 4238.919, 4551.038, 4245.1763, 3830.62, 4026.96, 4258.948, 4277.511]
2025-09-12 05:03:33,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 05:03:33,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (4220.36) for latency ExtremeClogL1U23
2025-09-12 05:03:33,258 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 16 hours, 16 minutes, 10 seconds)
2025-09-12 05:16:32,580 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:16:32,596 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:20:10,600 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3578.07275 ± 1404.124
2025-09-12 05:20:10,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1483.3398, 4499.204, 3724.375, 4713.0264, 933.99994, 4205.2603, 4882.461, 2168.9014, 4398.1006, 4772.0605]
2025-09-12 05:20:10,602 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [418.0, 920.0, 853.0, 1000.0, 253.0, 1000.0, 1000.0, 505.0, 1000.0, 1000.0]
2025-09-12 05:20:10,611 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 16 hours, 44 seconds)
2025-09-12 05:32:41,306 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:32:41,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:36:49,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3907.60083 ± 1294.214
2025-09-12 05:36:49,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4404.926, 4572.62, 4301.46, 4302.349, 4272.5547, 4041.9092, 4415.294, 4687.9434, 4006.2434, 70.7102]
2025-09-12 05:36:49,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 986.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 43.0]
2025-09-12 05:36:49,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 15 hours, 52 minutes, 31 seconds)
2025-09-12 05:48:51,695 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:48:51,705 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:52:05,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2971.18555 ± 1230.415
2025-09-12 05:52:05,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4259.61, 2794.736, 2321.1306, 4358.8623, 4418.019, 1104.1173, 4276.82, 2872.3894, 1246.055, 2060.118]
2025-09-12 05:52:05,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 657.0, 600.0, 1000.0, 1000.0, 264.0, 1000.0, 686.0, 315.0, 536.0]
2025-09-12 05:52:05,154 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 15 hours, 15 minutes, 7 seconds)
2025-09-12 06:04:11,277 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:04:11,281 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:07:52,714 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3198.62109 ± 1201.909
2025-09-12 06:07:52,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2013.1788, 4155.836, 1798.0282, 3759.2563, 4441.1494, 4468.202, 2997.4595, 4696.457, 2399.822, 1256.8215]
2025-09-12 06:07:52,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 370.0, 1000.0, 1000.0, 1000.0, 706.0, 1000.0, 655.0, 272.0]
2025-09-12 06:07:52,725 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 14 hours, 53 minutes, 12 seconds)
2025-09-12 06:21:03,578 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:21:03,588 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:25:38,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3933.42456 ± 1089.320
2025-09-12 06:25:38,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3934.8518, 846.92554, 4326.9746, 4837.6323, 4217.4385, 4753.774, 4699.6963, 3949.9868, 3780.6038, 3986.3577]
2025-09-12 06:25:38,764 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 06:25:38,770 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 14 hours, 46 minutes, 35 seconds)
2025-09-12 06:37:22,016 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:37:22,020 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:41:28,793 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3761.67773 ± 1108.029
2025-09-12 06:41:28,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4559.3135, 4214.975, 2294.387, 4225.78, 3905.55, 3619.0483, 1101.2942, 4646.292, 4509.269, 4540.8657]
2025-09-12 06:41:28,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 598.0, 1000.0, 1000.0, 1000.0, 250.0, 1000.0, 1000.0, 1000.0]
2025-09-12 06:41:28,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 14 hours, 21 minutes, 48 seconds)
2025-09-12 06:54:05,360 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:54:05,363 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:57:43,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3473.94873 ± 1456.380
2025-09-12 06:57:43,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1885.1552, 1747.4386, 4686.6885, 4485.908, 4914.059, 3811.1907, 4238.885, 4193.276, 4333.1387, 443.74695]
2025-09-12 06:57:43,329 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [461.0, 393.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 111.0]
2025-09-12 06:57:43,335 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 14 hours, 1 minute, 18 seconds)
2025-09-12 07:10:08,503 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:10:08,507 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:13:08,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2508.80176 ± 1674.033
2025-09-12 07:13:08,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4489.846, 826.902, 2746.8374, 768.88257, 4517.0786, 3054.2173, 4568.6934, 3324.6135, 481.7988, 309.14868]
2025-09-12 07:13:08,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 188.0, 585.0, 1000.0, 1000.0, 745.0, 1000.0, 796.0, 153.0, 131.0]
2025-09-12 07:13:08,813 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 13 hours, 46 minutes, 49 seconds)
2025-09-12 07:26:22,717 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:26:22,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:30:19,919 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3878.13013 ± 1193.693
2025-09-12 07:30:19,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4917.0986, 1419.6283, 4463.1123, 4612.788, 4452.229, 1683.6976, 4500.693, 4262.9507, 3841.7803, 4627.323]
2025-09-12 07:30:19,920 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 320.0, 1000.0, 1000.0, 1000.0, 427.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 07:30:19,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 13 hours, 44 minutes, 32 seconds)
2025-09-12 07:41:50,530 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:41:50,534 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:45:36,537 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3174.57495 ± 1640.466
2025-09-12 07:45:36,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4731.1772, 1425.2617, 4341.5073, 4239.0103, 4150.816, 2863.244, 4691.506, 85.837234, 4252.1123, 965.27826]
2025-09-12 07:45:36,538 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 386.0, 1000.0, 1000.0, 1000.0, 660.0, 1000.0, 1000.0, 1000.0, 198.0]
2025-09-12 07:45:36,546 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 13 hours, 3 minutes, 38 seconds)
2025-09-12 07:58:54,312 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:58:54,316 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:03:18,426 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4295.06396 ± 479.428
2025-09-12 08:03:18,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4489.879, 4138.7485, 3999.0903, 4602.3906, 4447.963, 4154.6226, 4824.274, 4811.27, 4393.4766, 3088.9302]
2025-09-12 08:03:18,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [937.0, 1000.0, 1000.0, 1000.0, 933.0, 1000.0, 1000.0, 1000.0, 1000.0, 660.0]
2025-09-12 08:03:18,428 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (4295.06) for latency ExtremeClogL1U23
2025-09-12 08:03:18,436 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 13 hours, 5 minutes, 32 seconds)
2025-09-12 08:15:51,643 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:15:51,647 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:19:45,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3838.05029 ± 1045.535
2025-09-12 08:19:45,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4603.9897, 3743.959, 4156.4624, 1238.8026, 4230.1523, 2637.3035, 3957.8965, 4758.3853, 4335.946, 4717.6055]
2025-09-12 08:19:45,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 811.0, 1000.0, 277.0, 1000.0, 619.0, 813.0, 1000.0, 1000.0, 1000.0]
2025-09-12 08:19:45,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 12 hours, 51 minutes, 11 seconds)
2025-09-12 08:31:07,629 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:31:07,633 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:34:54,469 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3574.70044 ± 1404.999
2025-09-12 08:34:54,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1581.7638, 5003.733, 628.776, 4961.7915, 3600.2024, 3099.596, 4748.2085, 4651.771, 3342.8582, 4128.3047]
2025-09-12 08:34:54,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [368.0, 1000.0, 152.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 747.0, 1000.0]
2025-09-12 08:34:54,489 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 12 hours, 32 minutes, 12 seconds)
2025-09-12 08:48:05,987 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:48:05,991 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:52:10,122 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3382.25513 ± 1568.830
2025-09-12 08:52:10,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3720.0166, 5070.457, 3471.8691, 4296.612, 1311.5542, 650.9084, 4976.2695, 4805.3594, 1396.8685, 4122.634]
2025-09-12 08:52:10,123 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [845.0, 1000.0, 817.0, 1000.0, 284.0, 1000.0, 1000.0, 1000.0, 1000.0, 911.0]
2025-09-12 08:52:10,129 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 12 hours, 16 minutes, 31 seconds)
2025-09-12 09:04:34,375 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:04:34,379 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:09:09,504 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4424.04004 ± 790.874
2025-09-12 09:09:09,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4273.4673, 4877.769, 5040.0264, 4740.7, 4972.3364, 4802.0293, 4478.3003, 2223.5361, 4775.127, 4057.1094]
2025-09-12 09:09:09,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:09:09,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (4424.04) for latency ExtremeClogL1U23
2025-09-12 09:09:09,512 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 12 hours, 15 minutes, 14 seconds)
2025-09-12 09:22:05,341 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:22:05,345 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:24:56,096 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2371.67212 ± 1580.709
2025-09-12 09:24:56,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1331.0474, 537.24097, 598.46375, 3144.9941, 1094.4855, 4406.621, 3633.4604, 3349.162, 799.2932, 4821.9526]
2025-09-12 09:24:56,098 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [258.0, 118.0, 149.0, 1000.0, 1000.0, 1000.0, 794.0, 676.0, 177.0, 1000.0]
2025-09-12 09:24:56,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 11 hours, 41 minutes, 59 seconds)
2025-09-12 09:37:03,303 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:37:03,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:41:06,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3333.78955 ± 1750.265
2025-09-12 09:41:06,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4132.131, 4327.3584, 107.47819, 4350.143, 3216.7354, 4776.958, 5185.0435, 1391.9807, 932.63184, 4917.4346]
2025-09-12 09:41:06,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 46.0, 907.0, 766.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 09:41:06,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 11 hours, 23 minutes, 17 seconds)
2025-09-12 09:53:13,311 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:53:13,315 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:55:55,107 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2191.44189 ± 2078.107
2025-09-12 09:55:55,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [139.16673, 4133.8174, 5150.3037, 90.28811, 85.49852, 3316.6387, 4752.7305, 3719.2163, 282.53613, 244.2233]
2025-09-12 09:55:55,110 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [45.0, 1000.0, 1000.0, 38.0, 37.0, 810.0, 1000.0, 783.0, 1000.0, 92.0]
2025-09-12 09:55:55,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 11 hours, 4 minutes, 17 seconds)
2025-09-12 10:08:02,351 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:08:02,355 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:11:27,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2880.12134 ± 1770.003
2025-09-12 10:11:27,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [108.77398, 3183.438, 4587.1104, 4498.836, 481.4301, 425.322, 2692.7432, 4631.612, 3793.0593, 4398.888]
2025-09-12 10:11:27,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [45.0, 608.0, 1000.0, 1000.0, 138.0, 1000.0, 656.0, 1000.0, 1000.0, 1000.0]
2025-09-12 10:11:27,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 10 hours, 34 minutes, 21 seconds)
2025-09-12 10:24:09,468 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:24:09,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:28:25,677 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3701.54883 ± 1472.376
2025-09-12 10:28:25,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4486.254, 1701.5435, 1167.1506, 4717.195, 1536.9124, 4588.7056, 4603.5337, 4790.6875, 4493.9463, 4929.562]
2025-09-12 10:28:25,679 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 300.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 10:28:25,685 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 10 hours, 18 minutes, 18 seconds)
2025-09-12 10:41:05,336 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:41:05,341 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:45:39,674 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4716.77441 ± 256.874
2025-09-12 10:45:39,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4876.328, 4587.6455, 4210.6475, 5027.405, 5159.346, 4794.2583, 4741.912, 4492.3066, 4633.998, 4643.896]
2025-09-12 10:45:39,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 10:45:39,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (4716.77) for latency ExtremeClogL1U23
2025-09-12 10:45:39,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 10 hours, 13 minutes, 31 seconds)
2025-09-12 10:57:45,648 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:57:45,651 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:01:49,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3540.54492 ± 1414.042
2025-09-12 11:01:49,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3029.9211, 4577.7607, 2239.4062, 2626.7866, 4882.221, 4930.86, 3255.2532, 4950.556, 459.20877, 4453.476]
2025-09-12 11:01:49,817 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [737.0, 1000.0, 463.0, 578.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 11:01:49,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 9 hours, 57 minutes, 20 seconds)
2025-09-12 11:14:59,178 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:14:59,182 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:18:02,171 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3119.32812 ± 1936.637
2025-09-12 11:18:02,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4987.386, 5088.222, 404.50018, 4356.0522, 300.3229, 940.0797, 3890.8425, 1631.8892, 4565.784, 5028.203]
2025-09-12 11:18:02,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 92.0, 1000.0, 90.0, 212.0, 917.0, 326.0, 1000.0, 1000.0]
2025-09-12 11:18:02,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 9 hours, 51 minutes, 14 seconds)
2025-09-12 11:30:18,637 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:30:18,640 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:34:14,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4101.64648 ± 1151.436
2025-09-12 11:34:14,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4919.805, 2023.8036, 4550.2676, 4850.364, 3954.132, 4557.1597, 1706.3708, 4830.205, 4846.0547, 4778.2993]
2025-09-12 11:34:14,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 382.0, 1000.0, 1000.0, 876.0, 1000.0, 336.0, 1000.0, 1000.0, 1000.0]
2025-09-12 11:34:14,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 9 hours, 39 minutes, 24 seconds)
2025-09-12 11:46:05,604 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:46:05,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:50:27,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3770.68896 ± 1483.365
2025-09-12 11:50:27,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4985.482, 4431.2173, 5128.914, 2565.0437, 2180.3806, 4462.931, 433.05737, 3686.816, 4837.9014, 4995.1504]
2025-09-12 11:50:27,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 536.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 11:50:27,533 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 9 hours, 17 minutes, 48 seconds)
2025-09-12 12:03:29,352 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:03:29,360 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:07:33,969 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3644.27930 ± 1070.740
2025-09-12 12:07:33,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4540.8438, 4060.506, 4382.95, 4081.704, 3946.0708, 801.5232, 2672.388, 4220.8843, 3550.6748, 4185.2446]
2025-09-12 12:07:33,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 204.0, 661.0, 1000.0, 1000.0, 1000.0]
2025-09-12 12:07:33,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 9 hours, 34 seconds)
2025-09-12 12:19:39,263 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:19:39,267 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:23:09,566 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2901.93213 ± 1649.919
2025-09-12 12:23:09,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4763.683, 4609.878, 3543.7175, 894.7145, 844.94763, 3171.3716, 4158.907, 404.743, 4717.164, 1910.1947]
2025-09-12 12:23:09,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 274.0, 209.0, 667.0, 1000.0, 1000.0, 1000.0, 437.0]
2025-09-12 12:23:09,605 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 8 hours, 40 minutes, 30 seconds)
2025-09-12 12:35:23,812 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:35:23,816 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:39:19,207 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3517.00928 ± 940.953
2025-09-12 12:39:19,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4682.718, 4300.4604, 4637.8315, 1844.5674, 3048.6116, 2700.129, 3609.5535, 3956.4014, 4070.1606, 2319.6592]
2025-09-12 12:39:19,219 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 434.0, 619.0, 613.0, 1000.0, 1000.0, 832.0, 1000.0]
2025-09-12 12:39:19,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 8 hours, 23 minutes, 57 seconds)
2025-09-12 12:52:09,335 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:52:09,339 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:55:52,224 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3719.07666 ± 1342.819
2025-09-12 12:55:52,229 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1158.2985, 4167.8564, 4836.1074, 4615.103, 3616.783, 4578.6523, 4509.4146, 4398.001, 4255.861, 1054.688]
2025-09-12 12:55:52,229 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [258.0, 1000.0, 1000.0, 1000.0, 707.0, 1000.0, 1000.0, 1000.0, 885.0, 232.0]
2025-09-12 12:55:52,238 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 8 hours, 9 minutes, 48 seconds)
2025-09-12 13:08:39,855 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:08:39,859 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:13:17,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4049.97339 ± 952.406
2025-09-12 13:13:17,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4756.2275, 4310.0884, 4419.958, 3450.456, 4135.1875, 4610.2197, 1423.4971, 4370.191, 4871.8857, 4152.023]
2025-09-12 13:13:17,883 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:13:17,892 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 8 hours, 28 seconds)
2025-09-12 13:25:11,529 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:25:11,534 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:29:28,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4347.94238 ± 1127.411
2025-09-12 13:29:28,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1298.6742, 4918.387, 4658.047, 5210.1245, 4583.5947, 5107.1953, 3335.0347, 4835.8765, 4794.424, 4738.065]
2025-09-12 13:29:28,466 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [254.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:29:28,472 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 7 hours, 38 minutes, 41 seconds)
2025-09-12 13:41:38,483 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:41:38,486 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:45:55,737 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3901.06104 ± 1473.561
2025-09-12 13:45:55,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5007.886, 1407.9092, 4812.9673, 4825.556, 1457.9288, 4887.22, 3804.9988, 5319.027, 5080.6826, 2406.4346]
2025-09-12 13:45:55,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 349.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 13:45:55,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 7 hours, 26 minutes, 57 seconds)
2025-09-12 13:58:58,382 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:58:58,385 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:03:35,001 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4202.76807 ± 1321.467
2025-09-12 14:03:35,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3912.858, 4889.9688, 4745.3433, 4407.4917, 4665.7275, 5076.754, 369.38693, 5086.449, 4379.0854, 4494.6167]
2025-09-12 14:03:35,013 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 14:03:35,027 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 7 hours, 18 minutes, 10 seconds)
2025-09-12 14:16:31,787 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:16:31,792 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:20:23,372 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3920.99536 ± 1493.119
2025-09-12 14:20:23,373 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5290.688, 4467.309, 4623.487, 4773.796, 3634.1782, 4984.551, 1975.2151, 330.5793, 4266.3135, 4863.838]
2025-09-12 14:20:23,373 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 468.0, 86.0, 1000.0, 922.0]
2025-09-12 14:20:23,380 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 7 hours, 2 minutes, 35 seconds)
2025-09-12 14:32:00,895 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:32:00,898 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:35:42,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3947.88037 ± 1645.028
2025-09-12 14:35:42,027 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3135.1692, 4763.3843, 5223.9707, 4457.1704, 5025.5356, 4845.79, 5029.9824, 100.26723, 5092.9937, 1804.5394]
2025-09-12 14:35:42,027 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [757.0, 912.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 54.0, 1000.0, 373.0]
2025-09-12 14:35:42,036 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 6 hours, 35 minutes, 31 seconds)
2025-09-12 14:48:45,600 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:48:45,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:52:14,284 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3595.83911 ± 1650.440
2025-09-12 14:52:14,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [1331.5557, 88.27165, 4475.7344, 5361.4136, 3950.9915, 4512.7266, 2520.2747, 3820.12, 4743.86, 5153.4434]
2025-09-12 14:52:14,294 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [340.0, 41.0, 1000.0, 1000.0, 895.0, 995.0, 534.0, 763.0, 956.0, 1000.0]
2025-09-12 14:52:14,304 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 6 hours, 20 minutes, 42 seconds)
2025-09-12 15:04:59,977 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:04:59,982 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:09:01,491 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4162.46191 ± 1370.561
2025-09-12 15:09:01,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4461.2373, 4299.9165, 4713.4634, 4858.4307, 529.824, 5100.098, 5231.3403, 2842.7798, 4536.986, 5050.5474]
2025-09-12 15:09:01,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 130.0, 1000.0, 1000.0, 626.0, 1000.0, 1000.0]
2025-09-12 15:09:01,499 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 6 hours, 5 minutes, 37 seconds)
2025-09-12 15:21:29,808 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:21:29,837 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:25:49,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4338.82715 ± 1227.594
2025-09-12 15:25:49,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2184.7893, 5015.0273, 4696.171, 5398.9424, 4489.0933, 4623.0967, 5292.2607, 4965.261, 5006.0874, 1717.5463]
2025-09-12 15:25:49,625 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [477.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:25:49,632 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 5 hours, 45 minutes, 25 seconds)
2025-09-12 15:37:03,543 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:37:03,548 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:41:14,603 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3991.88403 ± 1901.472
2025-09-12 15:41:14,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4376.0654, 5009.5396, 4781.2974, 4711.4546, 346.9111, 104.73839, 5180.3813, 4907.868, 5157.1436, 5343.436]
2025-09-12 15:41:14,610 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 51.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:41:14,621 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 5 hours, 23 minutes, 24 seconds)
2025-09-12 15:54:17,367 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:54:17,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:58:01,560 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4063.94092 ± 1391.205
2025-09-12 15:58:01,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4800.3794, 4769.421, 2267.225, 4878.798, 2520.1663, 5007.7227, 1212.1023, 5139.6055, 5128.2324, 4915.755]
2025-09-12 15:58:01,562 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 430.0, 1000.0, 520.0, 1000.0, 249.0, 1000.0, 1000.0, 1000.0]
2025-09-12 15:58:01,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 5 hours, 12 minutes, 50 seconds)
2025-09-12 16:10:40,198 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:10:40,202 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:15:02,967 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4545.07324 ± 340.925
2025-09-12 16:15:02,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5151.3223, 4791.3677, 4878.8286, 4709.184, 4304.3096, 4751.5005, 4378.2896, 4103.621, 4295.9844, 4086.3237]
2025-09-12 16:15:02,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 769.0, 840.0, 1000.0]
2025-09-12 16:15:02,977 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 58 minutes, 7 seconds)
2025-09-12 16:26:43,855 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:26:43,862 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:31:17,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4569.00488 ± 696.305
2025-09-12 16:31:17,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5501.707, 5200.9346, 4575.415, 4011.2825, 3259.4172, 4644.9873, 5279.3413, 5069.9116, 4461.3213, 3685.736]
2025-09-12 16:31:17,208 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 16:31:17,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 39 minutes, 41 seconds)
2025-09-12 16:44:37,882 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:44:37,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:48:59,091 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4296.92627 ± 1485.766
2025-09-12 16:48:59,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5116.5605, 2365.7395, 572.2561, 5349.206, 4932.242, 5139.254, 4896.468, 4491.909, 5194.1006, 4911.526]
2025-09-12 16:48:59,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 496.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 16:48:59,104 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 4 hours, 26 minutes, 6 seconds)
2025-09-12 17:00:40,375 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:00:40,378 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:04:20,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3968.88135 ± 1666.196
2025-09-12 17:04:20,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4673.4214, 5205.663, 4898.4775, 667.7171, 3838.3303, 4814.671, 777.29834, 4568.4585, 5254.6914, 4990.084]
2025-09-12 17:04:20,116 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 151.0, 783.0, 1000.0, 171.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:04:20,124 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 4 hours, 9 minutes, 16 seconds)
2025-09-12 17:17:13,976 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:17:13,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:21:08,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4030.86792 ± 1043.725
2025-09-12 17:21:08,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3646.8687, 1621.2301, 4535.7856, 3074.9766, 3317.4922, 4991.0215, 4631.232, 4638.5903, 4992.7246, 4858.7563]
2025-09-12 17:21:08,018 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [772.0, 423.0, 1000.0, 609.0, 753.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:21:08,029 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 52 minutes, 42 seconds)
2025-09-12 17:33:41,481 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:33:41,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:37:58,390 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4487.51660 ± 992.804
2025-09-12 17:37:58,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4347.5645, 4915.8687, 1628.6666, 4986.6763, 5121.9663, 4808.0815, 4954.389, 5056.515, 4828.6323, 4226.8027]
2025-09-12 17:37:58,391 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 345.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:37:58,398 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 35 minutes, 36 seconds)
2025-09-12 17:49:09,432 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 17:49:09,436 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 17:53:31,135 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4719.07910 ± 658.291
2025-09-12 17:53:31,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5148.742, 4627.5693, 4488.2866, 5159.062, 5482.955, 4553.3296, 2945.5925, 4974.01, 4868.1064, 4943.1396]
2025-09-12 17:53:31,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 589.0, 1000.0, 1000.0, 1000.0]
2025-09-12 17:53:31,136 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1226 [INFO]: New best (4719.08) for latency ExtremeClogL1U23
2025-09-12 17:53:31,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 3 hours, 17 minutes, 21 seconds)
2025-09-12 18:05:52,992 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:05:52,996 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:09:39,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3810.21021 ± 1290.951
2025-09-12 18:09:39,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [3414.4802, 5014.7393, 4208.0054, 5181.6694, 4977.5146, 1665.7999, 4554.6943, 2511.0754, 4747.9595, 1826.1636]
2025-09-12 18:09:39,112 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 330.0, 1000.0, 497.0, 1000.0, 399.0]
2025-09-12 18:09:39,121 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 57 minutes, 28 seconds)
2025-09-12 18:22:37,124 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:22:37,130 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:26:32,476 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4238.90039 ± 1267.052
2025-09-12 18:26:32,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4670.6655, 5541.784, 4855.861, 1791.6624, 4726.664, 2592.413, 5288.729, 5079.8906, 5112.28, 2729.0564]
2025-09-12 18:26:32,477 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 359.0, 1000.0, 561.0, 1000.0, 1000.0, 1000.0, 617.0]
2025-09-12 18:26:32,485 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 44 minutes, 24 seconds)
2025-09-12 18:39:14,447 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:39:14,452 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 18:43:30,753 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4359.58154 ± 857.867
2025-09-12 18:43:30,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4542.1875, 2058.9258, 4750.6665, 4194.687, 5120.223, 4020.792, 4842.9424, 4104.4756, 5192.3584, 4768.5586]
2025-09-12 18:43:30,772 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 385.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 18:43:30,785 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 28 minutes, 16 seconds)
2025-09-12 18:56:04,262 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 18:56:04,266 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:00:13,448 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4361.84375 ± 1375.949
2025-09-12 19:00:13,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [390.048, 3922.9282, 4981.7407, 4851.6577, 5185.7954, 4969.0103, 5274.335, 4622.9263, 4436.508, 4983.4863]
2025-09-12 19:00:13,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [110.0, 800.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 19:00:13,463 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 2 hours, 11 minutes, 36 seconds)
2025-09-12 19:12:35,104 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:12:35,108 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:15:57,857 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3656.40430 ± 1864.207
2025-09-12 19:15:57,860 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5402.777, 4060.0615, 4764.6543, 5017.332, 4993.6196, 938.06213, 1578.5233, 4678.9556, 4966.762, 163.29607]
2025-09-12 19:15:57,860 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 834.0, 1000.0, 1000.0, 1000.0, 208.0, 307.0, 1000.0, 1000.0, 51.0]
2025-09-12 19:15:57,869 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 55 minutes, 25 seconds)
2025-09-12 19:27:43,920 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:27:43,929 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:31:25,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4043.21753 ± 1750.839
2025-09-12 19:31:25,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4946.871, 5178.3228, 5344.6943, 5226.1367, 4722.286, 4918.7876, 5078.885, 634.8463, 3719.5571, 661.78796]
2025-09-12 19:31:25,478 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 141.0, 766.0, 138.0]
2025-09-12 19:31:25,487 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 38 minutes, 7 seconds)
2025-09-12 19:43:46,123 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 19:43:46,128 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 19:48:10,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4464.56348 ± 996.588
2025-09-12 19:48:10,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4965.6675, 2652.357, 2476.2957, 4153.379, 4970.7026, 4814.772, 5152.1, 4936.901, 5284.799, 5238.6567]
2025-09-12 19:48:10,733 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 515.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0]
2025-09-12 19:48:10,741 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 21 minutes, 38 seconds)
2025-09-12 20:00:35,898 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:00:35,904 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:04:50,882 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4571.26221 ± 948.878
2025-09-12 20:04:50,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4434.5015, 5059.1587, 5408.6494, 4840.983, 5079.5586, 5244.363, 4740.7896, 5018.6235, 2012.8958, 3873.0964]
2025-09-12 20:04:50,884 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 487.0, 825.0]
2025-09-12 20:04:50,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 1 hour, 5 minutes, 4 seconds)
2025-09-12 20:17:53,845 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:17:53,850 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:21:16,016 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 2833.08447 ± 1820.418
2025-09-12 20:21:16,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [5021.3784, 2189.2598, 5137.1816, 231.31424, 2103.3972, 4848.2007, 4697.5684, 434.701, 1693.0962, 1974.746]
2025-09-12 20:21:16,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 462.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 102.0, 400.0, 437.0]
2025-09-12 20:21:16,028 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 48 minutes, 37 seconds)
2025-09-12 20:33:46,088 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:33:46,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:37:34,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4012.32080 ± 1397.958
2025-09-12 20:37:34,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [2689.1162, 5224.5312, 2511.5754, 4985.7856, 4864.2373, 5069.27, 4651.381, 4812.8867, 4467.985, 846.43634]
2025-09-12 20:37:34,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [593.0, 1000.0, 508.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 209.0]
2025-09-12 20:37:34,810 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 32 minutes, 38 seconds)
2025-09-12 20:49:20,988 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 20:49:20,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 20:53:12,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 4140.67871 ± 1241.936
2025-09-12 20:53:12,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4926.7124, 4798.081, 1824.3481, 5109.534, 5005.845, 4737.7246, 5013.0283, 2877.652, 2155.6401, 4958.2197]
2025-09-12 20:53:12,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 411.0, 1000.0, 1000.0, 1000.0, 1000.0, 551.0, 473.0, 1000.0]
2025-09-12 20:53:12,193 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 16 minutes, 21 seconds)
2025-09-12 21:06:01,676 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 21:06:01,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 21:09:19,495 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1221 [DEBUG]: Total Reward: 3225.18823 ± 1894.896
2025-09-12 21:09:19,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1222 [DEBUG]: All rewards: [4722.568, 5254.9165, 2666.7717, 1240.8563, 652.6817, 1173.3325, 4680.714, 5291.5103, 1275.3068, 5293.2246]
2025-09-12 21:09:19,497 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 531.0, 231.0, 1000.0, 223.0, 1000.0, 1000.0, 309.0, 1000.0]
2025-09-12 21:09:19,509 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc0-ant):1251 [DEBUG]: Training session finished
