2025-09-11 18:15:17,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noiseperc20-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:15:17,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noiseperc20-ant/ExtremeClogL1U23-mbpac-highdim-memdelay
2025-09-11 18:15:17,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x14930e86c190>}
2025-09-11 18:15:17,529 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1111 [DEBUG]: using device: cuda
2025-09-11 18:15:17,534 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1133 [INFO]: Creating new trainer
2025-09-11 18:15:17,551 baseline-mbpac-noiseperc20-ant:110 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=512, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-09-11 18:15:17,551 baseline-mbpac-noiseperc20-ant:111 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=35, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-09-11 18:15:17,561 baseline-mbpac-noiseperc20-ant:140 [DEBUG]: Model structure:
NNPredictiveRecurrent(
  (emitter): NNGaussianProbabilisticEmitter(
    (emitter): NNLayerConcat(
      dim: -1
      (next): Sequential(
        (0): Sequential(
          (0): Linear(in_features=512, out_features=256, bias=True)
          (1): NNLayerClipSiLU(lower=-20.0)
          (2): Linear(in_features=256, out_features=256, bias=True)
          (3): NNLayerClipSiLU(lower=-20.0)
          (4): Linear(in_features=256, out_features=256, bias=True)
        )
        (1): NNLayerClipSiLU(lower=-20.0)
        (2): NNLayerHeadSplit(
          (heads): ModuleDict(
            (mu): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
            (log_std): Sequential(
              (0): Linear(in_features=256, out_features=256, bias=True)
              (1): NNLayerClipSiLU(lower=-20.0)
              (2): Linear(in_features=256, out_features=27, bias=True)
            )
          )
        )
      )
      (init_all): Identity()
    )
  )
  (net_embed_state): Sequential(
    (0): Linear(in_features=27, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): NNLayerClipSiLU(lower=-20.0)
    (4): Linear(in_features=256, out_features=512, bias=True)
  )
  (net_embed_action): Sequential(
    (0): Linear(in_features=8, out_features=256, bias=True)
    (1): NNLayerClipSiLU(lower=-20.0)
    (2): Linear(in_features=256, out_features=256, bias=True)
  )
  (net_rec): GRU(256, 512, batch_first=True)
)
2025-09-11 18:15:18,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1194 [DEBUG]: Starting training session...
2025-09-11 18:15:18,505 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 1/100
2025-09-11 18:26:44,713 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:26:44,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:27:34,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -144.26993 ± 209.016
2025-09-11 18:27:34,191 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-758.3845, -11.80152, -80.60722, -91.697266, -70.18517, -156.31815, -105.57187, -9.521349, -50.623806, -107.98841]
2025-09-11 18:27:34,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 19.0, 56.0, 66.0, 91.0, 158.0, 152.0, 22.0, 78.0, 99.0]
2025-09-11 18:27:34,192 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (-144.27) for latency ExtremeClogL1U23
2025-09-11 18:27:34,196 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 2/100 (estimated time remaining: 20 hours, 13 minutes, 53 seconds)
2025-09-11 18:40:08,987 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:40:08,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:40:23,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -33.45322 ± 32.562
2025-09-11 18:40:23,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-23.325703, -36.814976, -21.952585, -4.6826077, -14.663701, -121.3969, -49.336308, -26.527206, 0.23711306, -36.069378]
2025-09-11 18:40:23,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [35.0, 46.0, 18.0, 52.0, 38.0, 141.0, 72.0, 21.0, 34.0, 70.0]
2025-09-11 18:40:23,833 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (-33.45) for latency ExtremeClogL1U23
2025-09-11 18:40:23,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 3/100 (estimated time remaining: 20 hours, 29 minutes, 21 seconds)
2025-09-11 18:54:13,050 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 18:54:13,054 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 18:54:51,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -41.01680 ± 57.519
2025-09-11 18:54:51,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [2.4792202, -17.78666, -53.190666, -157.90099, 12.563256, 11.582043, -2.7274096, -53.665123, -135.6162, -15.905516]
2025-09-11 18:54:51,561 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [74.0, 38.0, 107.0, 440.0, 151.0, 42.0, 36.0, 119.0, 266.0, 90.0]
2025-09-11 18:54:51,568 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 4/100 (estimated time remaining: 21 hours, 18 minutes, 49 seconds)
2025-09-11 19:07:21,918 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:07:21,922 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:09:00,358 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -127.92863 ± 167.984
2025-09-11 19:09:00,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-2.9125068, -31.377155, -18.454788, -67.47971, -448.53284, -2.0944064, -366.14395, -322.84662, 2.3039827, -21.748178]
2025-09-11 19:09:00,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 59.0, 60.0, 169.0, 1000.0, 16.0, 1000.0, 1000.0, 20.0, 49.0]
2025-09-11 19:09:00,366 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 5/100 (estimated time remaining: 21 hours, 28 minutes, 44 seconds)
2025-09-11 19:21:36,018 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:21:36,026 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:22:02,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -10.34531 ± 29.567
2025-09-11 19:22:02,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [43.01432, -48.735607, 1.9507749, -34.032116, -22.715832, -12.3726, -27.772444, -30.33918, -15.248335, 42.797905]
2025-09-11 19:22:02,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [91.0, 115.0, 95.0, 41.0, 61.0, 122.0, 72.0, 126.0, 97.0, 130.0]
2025-09-11 19:22:02,752 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (-10.35) for latency ExtremeClogL1U23
2025-09-11 19:22:02,758 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 6/100 (estimated time remaining: 21 hours, 8 minutes)
2025-09-11 19:34:20,109 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:34:20,113 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:35:07,092 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -54.09985 ± 116.738
2025-09-11 19:35:07,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [17.028341, -394.23544, 9.557396, -86.73046, -18.59196, -27.398567, -17.85545, 5.0341244, 0.10817877, -27.914661]
2025-09-11 19:35:07,093 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [72.0, 1000.0, 106.0, 136.0, 44.0, 30.0, 65.0, 41.0, 14.0, 152.0]
2025-09-11 19:35:07,097 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 7/100 (estimated time remaining: 21 hours, 9 minutes, 54 seconds)
2025-09-11 19:48:13,446 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 19:48:13,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 19:48:32,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -23.38565 ± 21.877
2025-09-11 19:48:32,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-4.4723363, -30.43734, -33.665104, -23.398108, -3.2346284, -78.41365, -27.628773, -17.675697, -19.328732, 4.3979096]
2025-09-11 19:48:32,852 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [38.0, 148.0, 41.0, 137.0, 14.0, 82.0, 88.0, 73.0, 29.0, 30.0]
2025-09-11 19:48:32,861 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 8/100 (estimated time remaining: 21 hours, 7 minutes, 35 seconds)
2025-09-11 20:00:50,740 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:00:50,743 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:01:08,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -4.12973 ± 24.163
2025-09-11 20:01:08,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [38.55191, -10.737665, -14.7298, -7.69531, -12.78068, 0.5494116, 13.817878, -59.9011, -1.9701262, 13.5981865]
2025-09-11 20:01:08,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [37.0, 61.0, 79.0, 16.0, 72.0, 27.0, 99.0, 159.0, 26.0, 38.0]
2025-09-11 20:01:08,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (-4.13) for latency ExtremeClogL1U23
2025-09-11 20:01:08,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 9/100 (estimated time remaining: 20 hours, 19 minutes, 30 seconds)
2025-09-11 20:13:09,666 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:13:09,672 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:13:49,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -45.71571 ± 102.321
2025-09-11 20:13:49,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-24.006098, 7.884982, 3.0959673, -2.160125, -7.0898557, 9.32755, -340.5218, -93.81797, -5.2489552, -4.620758]
2025-09-11 20:13:49,221 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [24.0, 27.0, 21.0, 56.0, 31.0, 16.0, 1000.0, 121.0, 25.0, 47.0]
2025-09-11 20:13:49,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 10/100 (estimated time remaining: 19 hours, 39 minutes, 37 seconds)
2025-09-11 20:26:36,394 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:26:36,396 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:27:20,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -29.77927 ± 64.912
2025-09-11 20:27:20,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [1.478168, -31.831709, -30.22905, 0.9946385, -7.1533194, -18.395899, -217.67346, 29.477139, -4.0551205, -20.404053]
2025-09-11 20:27:20,289 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [48.0, 38.0, 83.0, 26.0, 11.0, 134.0, 1000.0, 89.0, 46.0, 52.0]
2025-09-11 20:27:20,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 11/100 (estimated time remaining: 19 hours, 35 minutes, 15 seconds)
2025-09-11 20:40:14,093 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:40:14,096 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:40:25,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -7.49416 ± 22.322
2025-09-11 20:40:25,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-13.233745, -27.663261, -11.558507, -8.5211935, 9.844735, 9.1290245, -38.36899, -24.701157, -13.926552, 44.057995]
2025-09-11 20:40:25,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [42.0, 48.0, 42.0, 42.0, 28.0, 64.0, 30.0, 30.0, 31.0, 62.0]
2025-09-11 20:40:25,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 12/100 (estimated time remaining: 19 hours, 22 minutes, 33 seconds)
2025-09-11 20:53:11,679 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 20:53:11,681 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 20:53:47,948 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: -26.34160 ± 77.725
2025-09-11 20:53:47,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-11.001017, -12.46503, 5.5745573, 13.580057, -3.7729049, -9.557844, 10.20157, -2.8697014, 4.995607, -258.10135]
2025-09-11 20:53:47,949 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [29.0, 50.0, 15.0, 36.0, 14.0, 65.0, 40.0, 10.0, 20.0, 1000.0]
2025-09-11 20:53:47,960 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 13/100 (estimated time remaining: 19 hours, 8 minutes, 25 seconds)
2025-09-11 21:07:11,359 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:07:11,364 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:07:23,310 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 10.03320 ± 15.473
2025-09-11 21:07:23,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-3.7116678, 1.3319559, -2.4830985, 11.869802, -6.1268053, 40.88353, 36.359695, 2.1335204, 10.799526, 9.275555]
2025-09-11 21:07:23,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [8.0, 109.0, 9.0, 27.0, 51.0, 44.0, 84.0, 15.0, 50.0, 28.0]
2025-09-11 21:07:23,311 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (10.03) for latency ExtremeClogL1U23
2025-09-11 21:07:23,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 14/100 (estimated time remaining: 19 hours, 12 minutes, 46 seconds)
2025-09-11 21:19:34,546 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:19:34,554 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:19:46,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 8.93203 ± 17.136
2025-09-11 21:19:46,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [1.0237819, 17.186823, -11.875611, -22.76172, 26.749226, 10.988219, 34.768387, -3.745616, 22.699997, 14.286849]
2025-09-11 21:19:46,260 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [36.0, 27.0, 34.0, 42.0, 30.0, 62.0, 56.0, 21.0, 40.0, 68.0]
2025-09-11 21:19:46,272 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 15/100 (estimated time remaining: 18 hours, 54 minutes, 21 seconds)
2025-09-11 21:33:00,981 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:33:00,991 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:33:11,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 10.79572 ± 24.530
2025-09-11 21:33:11,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [67.443855, -15.152463, 9.506679, 10.886692, -1.2650914, -22.25073, 11.864662, 8.789916, 38.482098, -0.34841922]
2025-09-11 21:33:11,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [65.0, 33.0, 30.0, 25.0, 24.0, 71.0, 42.0, 20.0, 44.0, 10.0]
2025-09-11 21:33:11,362 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (10.80) for latency ExtremeClogL1U23
2025-09-11 21:33:11,371 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 16/100 (estimated time remaining: 18 hours, 39 minutes, 28 seconds)
2025-09-11 21:45:56,426 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:45:56,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:46:12,015 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 7.49549 ± 26.792
2025-09-11 21:46:12,016 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [7.718779, 5.733777, -19.022015, 3.1105702, -31.894522, 4.072257, 77.033424, 6.034845, 14.684735, 7.483093]
2025-09-11 21:46:12,016 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 37.0, 43.0, 27.0, 113.0, 39.0, 128.0, 38.0, 31.0, 63.0]
2025-09-11 21:46:12,024 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 17/100 (estimated time remaining: 18 hours, 24 minutes, 55 seconds)
2025-09-11 21:57:58,382 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 21:57:58,390 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 21:58:09,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 6.87419 ± 14.392
2025-09-11 21:58:09,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [5.5990005, 2.7527423, 18.3508, 27.73265, 16.259369, -1.1974144, -19.740065, -14.034023, 17.342394, 15.67649]
2025-09-11 21:58:09,901 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [32.0, 11.0, 37.0, 23.0, 52.0, 11.0, 106.0, 57.0, 56.0, 32.0]
2025-09-11 21:58:09,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 18/100 (estimated time remaining: 17 hours, 48 minutes, 28 seconds)
2025-09-11 22:10:40,647 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:10:40,654 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:10:51,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 8.72252 ± 16.602
2025-09-11 22:10:51,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [12.893074, 9.654829, 21.64207, -3.5455468, 2.67955, 27.147072, 8.53014, -25.959255, -1.6367931, 35.8201]
2025-09-11 22:10:51,815 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [23.0, 50.0, 44.0, 11.0, 8.0, 19.0, 26.0, 176.0, 14.0, 31.0]
2025-09-11 22:10:51,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 19/100 (estimated time remaining: 17 hours, 20 minutes, 59 seconds)
2025-09-11 22:23:23,832 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:23:23,840 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:23:33,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 16.36480 ± 20.978
2025-09-11 22:23:33,958 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [4.488198, 25.033731, 20.474737, 38.53102, -32.663353, 21.587538, 17.82399, 40.533688, 31.906446, -4.067997]
2025-09-11 22:23:33,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [11.0, 77.0, 24.0, 38.0, 34.0, 47.0, 16.0, 82.0, 26.0, 13.0]
2025-09-11 22:23:33,959 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (16.36) for latency ExtremeClogL1U23
2025-09-11 22:23:33,968 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 20/100 (estimated time remaining: 17 hours, 13 minutes, 28 seconds)
2025-09-11 22:36:04,934 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:36:04,941 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:36:12,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 7.17520 ± 7.702
2025-09-11 22:36:12,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [4.100151, 19.713104, 4.601725, -8.830055, 10.682593, 0.0131228715, 7.9931803, 5.307019, 14.583457, 13.587734]
2025-09-11 22:36:12,992 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 17.0, 112.0, 13.0, 50.0, 27.0, 10.0, 14.0, 16.0, 20.0]
2025-09-11 22:36:13,000 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 21/100 (estimated time remaining: 16 hours, 48 minutes, 26 seconds)
2025-09-11 22:48:47,388 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 22:48:47,390 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 22:48:58,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 22.17550 ± 22.969
2025-09-11 22:48:58,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [30.756708, 71.77566, 2.3072445, 23.551365, -0.1729302, 17.220503, 13.100152, 22.676395, 49.6939, -9.15401]
2025-09-11 22:48:58,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [27.0, 97.0, 9.0, 24.0, 25.0, 29.0, 39.0, 66.0, 65.0, 31.0]
2025-09-11 22:48:58,824 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (22.18) for latency ExtremeClogL1U23
2025-09-11 22:48:58,835 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 22/100 (estimated time remaining: 16 hours, 31 minutes, 55 seconds)
2025-09-11 23:01:29,065 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:01:29,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:01:46,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 39.99883 ± 44.572
2025-09-11 23:01:46,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [7.723706, 18.698877, 29.111176, 80.757835, 43.362926, 157.98207, 9.49955, 6.51042, 18.57268, 27.769024]
2025-09-11 23:01:46,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [18.0, 62.0, 69.0, 113.0, 33.0, 182.0, 14.0, 19.0, 33.0, 88.0]
2025-09-11 23:01:46,771 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (40.00) for latency ExtremeClogL1U23
2025-09-11 23:01:46,781 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 23/100 (estimated time remaining: 16 hours, 32 minutes, 23 seconds)
2025-09-11 23:14:15,539 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:14:15,541 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:14:26,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 27.31591 ± 36.076
2025-09-11 23:14:26,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [123.60472, 0.99262273, 23.809097, 16.535116, -3.6217916, 4.4319453, 8.243046, 44.91725, 7.532633, 46.71441]
2025-09-11 23:14:26,205 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [127.0, 13.0, 35.0, 46.0, 20.0, 10.0, 17.0, 66.0, 11.0, 33.0]
2025-09-11 23:14:26,213 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 24/100 (estimated time remaining: 16 hours, 19 minutes, 1 second)
2025-09-11 23:26:58,577 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:26:58,579 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:27:41,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 26.51171 ± 22.175
2025-09-11 23:27:41,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-4.6063933, 47.067474, 5.434275, 43.01198, 6.613593, 35.452774, 38.32803, -6.153162, 46.413765, 53.554775]
2025-09-11 23:27:41,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 145.0, 9.0, 79.0, 12.0, 35.0, 73.0, 21.0, 33.0, 122.0]
2025-09-11 23:27:41,111 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 25/100 (estimated time remaining: 16 hours, 14 minutes, 36 seconds)
2025-09-11 23:40:20,298 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:40:20,300 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:40:55,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 31.84597 ± 32.419
2025-09-11 23:40:55,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [13.051547, 11.521049, 10.37237, 108.43577, 29.329916, -4.10243, -0.83877814, 50.759377, 54.162117, 45.768726]
2025-09-11 23:40:55,563 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [13.0, 29.0, 18.0, 1000.0, 53.0, 17.0, 8.0, 41.0, 55.0, 36.0]
2025-09-11 23:40:55,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 26/100 (estimated time remaining: 16 hours, 10 minutes, 38 seconds)
2025-09-11 23:54:50,859 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-11 23:54:50,863 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-11 23:55:57,434 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 39.18446 ± 30.991
2025-09-11 23:55:57,441 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [61.91946, 3.1591403, 28.631733, 27.124874, 17.817812, 77.6431, 106.40783, 17.052807, 15.022211, 37.06566]
2025-09-11 23:55:57,441 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [100.0, 1000.0, 28.0, 26.0, 24.0, 63.0, 1000.0, 33.0, 39.0, 28.0]
2025-09-11 23:55:57,450 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 27/100 (estimated time remaining: 16 hours, 31 minutes, 15 seconds)
2025-09-12 00:07:24,626 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:07:24,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:08:05,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 23.44006 ± 26.050
2025-09-12 00:08:05,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [6.1660333, 5.6261477, 7.0268826, 5.5665765, 25.406511, 37.882015, 46.254642, -2.514118, 87.556305, 15.429592]
2025-09-12 00:08:05,409 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [10.0, 21.0, 14.0, 166.0, 26.0, 59.0, 1000.0, 8.0, 98.0, 17.0]
2025-09-12 00:08:05,418 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 28/100 (estimated time remaining: 16 hours, 8 minutes, 8 seconds)
2025-09-12 00:20:57,802 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:20:57,805 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:21:08,188 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 36.32797 ± 35.899
2025-09-12 00:21:08,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [65.152, 28.731663, 0.5068226, 1.1290652, 113.08038, 0.89838576, 29.224796, 7.204693, 43.955887, 73.395996]
2025-09-12 00:21:08,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [82.0, 32.0, 24.0, 12.0, 79.0, 11.0, 28.0, 10.0, 40.0, 57.0]
2025-09-12 00:21:08,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 29/100 (estimated time remaining: 16 hours, 28 seconds)
2025-09-12 00:33:45,450 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:33:45,457 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:34:00,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 33.98922 ± 36.435
2025-09-12 00:34:00,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [3.613102, 75.3736, 31.026356, 88.24622, 30.768686, -6.919267, 27.82598, 92.33394, -7.7633348, 5.386952]
2025-09-12 00:34:00,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [88.0, 105.0, 33.0, 129.0, 28.0, 33.0, 30.0, 57.0, 22.0, 12.0]
2025-09-12 00:34:00,590 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 30/100 (estimated time remaining: 15 hours, 41 minutes, 48 seconds)
2025-09-12 00:46:47,003 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 00:46:47,010 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 00:47:27,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 38.56365 ± 58.608
2025-09-12 00:47:27,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-0.5516884, 46.552967, -3.4155726, 15.5371895, 63.63309, 25.720837, 201.76865, -2.359709, 37.021667, 1.7290599]
2025-09-12 00:47:27,060 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [22.0, 79.0, 23.0, 37.0, 43.0, 20.0, 165.0, 22.0, 1000.0, 10.0]
2025-09-12 00:47:27,068 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 31/100 (estimated time remaining: 15 hours, 31 minutes, 20 seconds)
2025-09-12 01:01:09,881 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:01:09,885 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:01:24,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 35.97027 ± 32.671
2025-09-12 01:01:24,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [0.8887818, 10.926076, 44.56568, 87.11695, 66.90105, 12.570543, -4.4470057, 8.757019, 80.67689, 51.74672]
2025-09-12 01:01:24,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [40.0, 25.0, 29.0, 38.0, 41.0, 28.0, 17.0, 42.0, 150.0, 94.0]
2025-09-12 01:01:24,174 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 32/100 (estimated time remaining: 15 hours, 3 minutes, 8 seconds)
2025-09-12 01:13:57,962 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:13:57,966 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:14:18,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 46.03245 ± 58.306
2025-09-12 01:14:18,894 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [19.531061, 48.81777, 17.15513, 10.733548, 216.49564, 30.829954, 49.581844, 10.642008, 31.486204, 25.051348]
2025-09-12 01:14:18,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [52.0, 61.0, 35.0, 21.0, 263.0, 100.0, 69.0, 17.0, 37.0, 81.0]
2025-09-12 01:14:18,895 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (46.03) for latency ExtremeClogL1U23
2025-09-12 01:14:18,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 33/100 (estimated time remaining: 15 hours, 39 seconds)
2025-09-12 01:26:18,180 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:26:18,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:26:53,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 12.64053 ± 20.086
2025-09-12 01:26:53,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-3.3428679, -4.706278, 49.523083, 2.924888, 25.948349, -9.394227, 47.419476, 5.7973747, 9.217272, 3.018198]
2025-09-12 01:26:53,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [31.0, 85.0, 46.0, 32.0, 21.0, 19.0, 1000.0, 18.0, 16.0, 8.0]
2025-09-12 01:26:53,915 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 34/100 (estimated time remaining: 14 hours, 41 minutes, 12 seconds)
2025-09-12 01:40:28,732 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:40:28,750 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:41:44,798 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 68.55553 ± 86.423
2025-09-12 01:41:44,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [299.5988, 68.137276, -26.498741, 98.24477, 27.960686, 71.77622, 3.6566975, 28.854576, 99.7742, 14.050799]
2025-09-12 01:41:44,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [242.0, 55.0, 125.0, 102.0, 50.0, 65.0, 10.0, 1000.0, 1000.0, 58.0]
2025-09-12 01:41:44,799 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (68.56) for latency ExtremeClogL1U23
2025-09-12 01:41:44,804 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 35/100 (estimated time remaining: 14 hours, 54 minutes, 7 seconds)
2025-09-12 01:54:15,319 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 01:54:15,323 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 01:55:04,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 68.19682 ± 84.726
2025-09-12 01:55:04,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [51.83201, 75.47347, -7.2017794, 15.502231, 18.085997, 56.236046, 264.5504, 189.51662, 14.623363, 3.3497791]
2025-09-12 01:55:04,755 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [118.0, 74.0, 47.0, 11.0, 25.0, 1000.0, 210.0, 120.0, 31.0, 85.0]
2025-09-12 01:55:04,763 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 36/100 (estimated time remaining: 14 hours, 39 minutes, 10 seconds)
2025-09-12 02:07:27,170 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:07:27,177 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:07:50,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 57.61868 ± 86.498
2025-09-12 02:07:50,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [31.16037, 92.75394, 173.34903, 2.023865, 30.84904, -64.46815, 228.75327, 100.64192, -27.43483, 8.558362]
2025-09-12 02:07:50,628 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 69.0, 116.0, 39.0, 64.0, 168.0, 146.0, 72.0, 66.0, 51.0]
2025-09-12 02:07:50,641 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 37/100 (estimated time remaining: 14 hours, 10 minutes, 26 seconds)
2025-09-12 02:20:32,912 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:20:32,916 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:22:14,142 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 87.73317 ± 67.323
2025-09-12 02:22:14,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [5.2081695, 69.8062, 18.448404, 216.36528, 82.50055, 105.40413, -1.4642519, 176.09244, 114.88098, 90.08973]
2025-09-12 02:22:14,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 1000.0, 44.0, 232.0, 60.0, 60.0, 14.0, 168.0, 1000.0, 1000.0]
2025-09-12 02:22:14,143 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (87.73) for latency ExtremeClogL1U23
2025-09-12 02:22:14,151 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 38/100 (estimated time remaining: 14 hours, 15 minutes, 48 seconds)
2025-09-12 02:34:49,982 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:34:49,983 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:35:33,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 67.51693 ± 57.485
2025-09-12 02:35:33,683 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [126.667114, 138.58823, 16.582191, 10.959039, 14.546531, 64.406296, 64.51323, 178.26018, 50.90298, 9.743532]
2025-09-12 02:35:33,684 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [91.0, 93.0, 22.0, 52.0, 16.0, 1000.0, 61.0, 144.0, 56.0, 34.0]
2025-09-12 02:35:33,695 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 39/100 (estimated time remaining: 14 hours, 11 minutes, 25 seconds)
2025-09-12 02:48:18,224 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 02:48:18,226 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 02:48:56,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 151.39833 ± 140.230
2025-09-12 02:48:56,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [11.4548435, 121.40752, 42.29569, 79.35157, 341.38153, 82.78981, 363.5992, 32.645252, 61.65155, 377.40637]
2025-09-12 02:48:56,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [12.0, 132.0, 51.0, 89.0, 290.0, 64.0, 310.0, 44.0, 110.0, 289.0]
2025-09-12 02:48:56,886 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (151.40) for latency ExtremeClogL1U23
2025-09-12 02:48:56,896 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 40/100 (estimated time remaining: 13 hours, 39 minutes, 51 seconds)
2025-09-12 03:01:48,936 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:01:48,939 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:02:36,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 58.77061 ± 78.552
2025-09-12 03:02:36,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [47.22063, 68.85246, 89.672386, 7.1183925, 25.12377, 2.610729, 10.378044, 278.57892, 5.20375, 52.947018]
2025-09-12 03:02:36,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [29.0, 102.0, 135.0, 17.0, 33.0, 57.0, 25.0, 269.0, 1000.0, 59.0]
2025-09-12 03:02:36,698 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 41/100 (estimated time remaining: 13 hours, 30 minutes, 23 seconds)
2025-09-12 03:15:42,609 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:15:42,613 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:16:25,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 91.44994 ± 77.923
2025-09-12 03:16:25,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [78.21413, 1.2181919, 234.03656, 84.60104, 49.398605, 121.26543, 14.955353, 4.1176934, 110.660416, 216.03198]
2025-09-12 03:16:25,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [62.0, 15.0, 1000.0, 98.0, 36.0, 90.0, 21.0, 9.0, 67.0, 129.0]
2025-09-12 03:16:25,701 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 42/100 (estimated time remaining: 13 hours, 29 minutes, 17 seconds)
2025-09-12 03:29:32,154 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:29:32,158 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:30:12,033 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 35.34196 ± 32.673
2025-09-12 03:30:12,033 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [31.308115, 33.870213, 20.213823, 68.12392, -19.885632, 14.521497, 59.260418, 34.192608, 8.736127, 103.07849]
2025-09-12 03:30:12,033 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [37.0, 76.0, 22.0, 48.0, 1000.0, 47.0, 33.0, 36.0, 10.0, 101.0]
2025-09-12 03:30:12,041 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 43/100 (estimated time remaining: 13 hours, 8 minutes, 23 seconds)
2025-09-12 03:42:31,498 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:42:31,509 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:42:54,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 84.44726 ± 56.336
2025-09-12 03:42:54,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [168.30537, 17.448029, 52.207928, 44.489365, 29.765217, 173.79814, 103.5993, 149.42343, 51.20785, 54.228043]
2025-09-12 03:42:54,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [159.0, 43.0, 53.0, 32.0, 32.0, 200.0, 114.0, 133.0, 36.0, 46.0]
2025-09-12 03:42:54,984 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 44/100 (estimated time remaining: 12 hours, 47 minutes, 50 seconds)
2025-09-12 03:55:16,141 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 03:55:16,146 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 03:56:32,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 93.40966 ± 92.175
2025-09-12 03:56:32,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [200.68481, 13.504582, 93.94226, 310.53793, 108.81091, 89.358025, 7.1985497, 77.62398, 22.206049, 10.229568]
2025-09-12 03:56:32,994 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [142.0, 17.0, 78.0, 212.0, 122.0, 98.0, 16.0, 1000.0, 1000.0, 17.0]
2025-09-12 03:56:33,011 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 45/100 (estimated time remaining: 12 hours, 37 minutes, 8 seconds)
2025-09-12 04:08:45,219 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:08:45,220 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:09:31,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 65.69360 ± 73.963
2025-09-12 04:09:31,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [139.41864, 75.35384, -123.613884, 20.978003, 137.56468, 36.112633, 130.87202, 96.43876, 68.14548, 75.66585]
2025-09-12 04:09:31,680 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [104.0, 54.0, 1000.0, 37.0, 130.0, 31.0, 110.0, 95.0, 32.0, 66.0]
2025-09-12 04:09:31,690 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 46/100 (estimated time remaining: 12 hours, 16 minutes, 4 seconds)
2025-09-12 04:22:25,825 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:22:25,828 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:23:13,576 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 96.60290 ± 102.304
2025-09-12 04:23:13,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [89.26183, 8.075921, 174.83253, -6.6038194, 47.146687, 5.6904593, 180.6044, 45.119556, 340.239, 81.66242]
2025-09-12 04:23:13,577 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 9.0, 104.0, 39.0, 63.0, 23.0, 171.0, 39.0, 210.0, 52.0]
2025-09-12 04:23:13,586 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 47/100 (estimated time remaining: 12 hours, 1 minute, 25 seconds)
2025-09-12 04:35:57,467 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:35:57,474 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:37:00,425 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 133.74026 ± 164.971
2025-09-12 04:37:00,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [274.41452, 29.447723, 494.93155, 320.38293, 137.41826, 58.21793, 59.8866, -37.211674, 0.25108987, -0.33642566]
2025-09-12 04:37:00,427 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [298.0, 68.0, 407.0, 179.0, 87.0, 61.0, 95.0, 1000.0, 19.0, 53.0]
2025-09-12 04:37:00,437 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 48/100 (estimated time remaining: 11 hours, 48 minutes, 8 seconds)
2025-09-12 04:49:43,542 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 04:49:43,544 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 04:50:22,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 39.83495 ± 75.427
2025-09-12 04:50:22,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [89.14504, -9.92508, 3.085513, 21.769657, -100.31013, 55.62237, 212.55988, 45.264122, 23.922823, 57.215305]
2025-09-12 04:50:22,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [60.0, 12.0, 9.0, 19.0, 1000.0, 67.0, 135.0, 45.0, 24.0, 43.0]
2025-09-12 04:50:22,935 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 49/100 (estimated time remaining: 11 hours, 41 minutes, 38 seconds)
2025-09-12 05:02:49,800 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:02:49,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:03:08,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 79.13647 ± 82.011
2025-09-12 05:03:08,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [144.86174, 45.734276, 15.008103, 26.232439, 18.963903, 52.183933, 148.6997, 42.493095, 280.37744, 16.810108]
2025-09-12 05:03:08,259 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [94.0, 49.0, 29.0, 37.0, 23.0, 32.0, 124.0, 34.0, 229.0, 19.0]
2025-09-12 05:03:08,268 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 50/100 (estimated time remaining: 11 hours, 19 minutes, 11 seconds)
2025-09-12 05:15:52,876 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:15:52,878 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:16:49,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 98.55392 ± 97.184
2025-09-12 05:16:49,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-52.678246, 173.41426, 184.51517, 38.3235, 10.23906, 10.270491, 138.07695, 76.86106, 115.499695, 291.01724]
2025-09-12 05:16:49,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 134.0, 151.0, 199.0, 21.0, 9.0, 75.0, 75.0, 86.0, 291.0]
2025-09-12 05:16:49,836 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 51/100 (estimated time remaining: 11 hours, 13 minutes, 1 second)
2025-09-12 05:29:26,396 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:29:26,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:30:21,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 107.58622 ± 174.324
2025-09-12 05:30:21,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [9.937847, 559.17896, 9.165244, 76.24098, 12.36241, 275.81116, -78.54355, 74.75564, 26.056147, 110.89732]
2025-09-12 05:30:21,223 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [74.0, 348.0, 25.0, 98.0, 40.0, 199.0, 1000.0, 127.0, 19.0, 55.0]
2025-09-12 05:30:21,233 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 52/100 (estimated time remaining: 10 hours, 57 minutes, 50 seconds)
2025-09-12 05:43:49,169 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:43:49,178 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:45:34,321 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 81.88737 ± 75.825
2025-09-12 05:45:34,322 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [103.32266, 16.0655, 114.47767, 150.73358, 177.89807, 59.190205, 154.92693, 87.51939, -93.17191, 47.911583]
2025-09-12 05:45:34,322 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [189.0, 47.0, 114.0, 128.0, 1000.0, 86.0, 99.0, 53.0, 1000.0, 1000.0]
2025-09-12 05:45:34,331 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 53/100 (estimated time remaining: 10 hours, 58 minutes, 13 seconds)
2025-09-12 05:57:21,738 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 05:57:21,747 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 05:58:12,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 123.86123 ± 107.090
2025-09-12 05:58:12,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [72.36025, 100.48631, 13.391436, 72.57446, 72.50967, 138.74622, 289.10272, 19.996992, 100.71582, 358.72845]
2025-09-12 05:58:12,708 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [81.0, 93.0, 13.0, 136.0, 110.0, 89.0, 1000.0, 14.0, 51.0, 252.0]
2025-09-12 05:58:12,722 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 54/100 (estimated time remaining: 10 hours, 37 minutes, 36 seconds)
2025-09-12 06:10:36,168 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:10:36,170 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:11:04,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 143.44862 ± 111.053
2025-09-12 06:11:04,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [80.69892, 98.74028, 440.01584, 78.053764, 45.470814, 177.73077, 96.57274, 228.5634, 102.04829, 86.591415]
2025-09-12 06:11:04,199 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [44.0, 48.0, 246.0, 67.0, 48.0, 134.0, 80.0, 226.0, 67.0, 47.0]
2025-09-12 06:11:04,210 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 55/100 (estimated time remaining: 10 hours, 24 minutes, 58 seconds)
2025-09-12 06:23:39,494 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:23:39,503 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:24:08,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 128.11351 ± 182.873
2025-09-12 06:24:08,581 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [9.773846, 85.02982, 53.218548, 654.82086, 199.15797, 39.998108, 69.01111, 89.60353, 74.252846, 6.2686086]
2025-09-12 06:24:08,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [15.0, 38.0, 60.0, 454.0, 152.0, 40.0, 162.0, 51.0, 69.0, 10.0]
2025-09-12 06:24:08,589 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 56/100 (estimated time remaining: 10 hours, 5 minutes, 48 seconds)
2025-09-12 06:36:56,079 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:36:56,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:38:16,246 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 126.41399 ± 124.233
2025-09-12 06:38:16,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [332.2982, -9.240618, 53.468754, 206.49065, 12.741684, 349.18088, 16.257154, 147.1307, 46.301044, 109.51147]
2025-09-12 06:38:16,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [211.0, 1000.0, 35.0, 159.0, 48.0, 228.0, 25.0, 86.0, 38.0, 1000.0]
2025-09-12 06:38:16,255 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 57/100 (estimated time remaining: 9 hours, 57 minutes, 40 seconds)
2025-09-12 06:50:51,697 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 06:50:51,699 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 06:52:27,801 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 121.57951 ± 112.451
2025-09-12 06:52:27,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [25.981335, 154.83353, 434.48956, 111.33018, 47.285114, 119.616936, 99.31721, 98.361885, 14.328729, 110.25054]
2025-09-12 06:52:27,802 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [22.0, 139.0, 1000.0, 71.0, 54.0, 1000.0, 77.0, 59.0, 33.0, 1000.0]
2025-09-12 06:52:27,814 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 58/100 (estimated time remaining: 9 hours, 35 minutes, 15 seconds)
2025-09-12 07:05:04,490 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:05:04,492 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:06:40,445 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 222.44475 ± 148.304
2025-09-12 07:06:40,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [24.73477, 480.4567, 120.556244, 175.62398, 228.85423, 266.90244, 437.35538, 330.39078, 103.74745, 55.82537]
2025-09-12 07:06:40,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 348.0, 92.0, 150.0, 1000.0, 300.0, 209.0, 215.0, 70.0, 36.0]
2025-09-12 07:06:40,451 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (222.44) for latency ExtremeClogL1U23
2025-09-12 07:06:40,461 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 59/100 (estimated time remaining: 9 hours, 35 minutes, 5 seconds)
2025-09-12 07:19:14,550 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:19:14,551 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:20:57,902 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 94.05008 ± 134.231
2025-09-12 07:20:57,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [99.734, 382.6547, 114.84103, 6.691594, -20.313942, 126.23319, 213.24522, 120.68793, -144.67053, 41.39763]
2025-09-12 07:20:57,903 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [114.0, 193.0, 148.0, 15.0, 1000.0, 77.0, 1000.0, 67.0, 1000.0, 65.0]
2025-09-12 07:20:57,912 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 60/100 (estimated time remaining: 9 hours, 33 minutes, 8 seconds)
2025-09-12 07:33:59,007 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:33:59,008 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:35:44,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 189.25375 ± 156.959
2025-09-12 07:35:44,745 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [495.77795, 39.30032, 25.236616, 98.2231, 293.83292, 383.357, 293.90317, 133.9566, 86.46795, 42.481934]
2025-09-12 07:35:44,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 103.0, 1000.0, 76.0, 148.0, 206.0, 1000.0, 77.0, 92.0, 61.0]
2025-09-12 07:35:44,762 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 61/100 (estimated time remaining: 9 hours, 32 minutes, 49 seconds)
2025-09-12 07:48:22,015 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 07:48:22,017 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 07:49:42,843 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 89.68270 ± 127.584
2025-09-12 07:49:42,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-27.964064, 115.49134, 26.739092, 231.86554, -157.9703, 139.52074, 61.187176, 17.699968, 299.69336, 190.56413]
2025-09-12 07:49:42,845 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 123.0, 21.0, 185.0, 1000.0, 74.0, 61.0, 23.0, 254.0, 169.0]
2025-09-12 07:49:42,854 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 62/100 (estimated time remaining: 9 hours, 17 minutes, 15 seconds)
2025-09-12 08:02:34,782 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:02:34,787 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:03:49,393 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 61.91070 ± 101.958
2025-09-12 08:03:49,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [46.237106, 199.74667, -29.206696, -124.558975, 29.455626, 95.48162, 68.658844, 27.18378, 256.95514, 49.153877]
2025-09-12 08:03:49,400 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [40.0, 162.0, 1000.0, 1000.0, 62.0, 60.0, 43.0, 86.0, 153.0, 59.0]
2025-09-12 08:03:49,407 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 63/100 (estimated time remaining: 9 hours, 2 minutes, 20 seconds)
2025-09-12 08:16:15,493 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:16:15,496 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:16:44,071 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 143.27654 ± 118.205
2025-09-12 08:16:44,071 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [158.19945, 277.39862, 151.51566, 122.09292, 416.59113, 7.8997865, 24.073406, 62.89796, 153.61829, 58.47812]
2025-09-12 08:16:44,071 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [100.0, 140.0, 119.0, 179.0, 190.0, 10.0, 24.0, 71.0, 147.0, 42.0]
2025-09-12 08:16:44,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 64/100 (estimated time remaining: 8 hours, 38 minutes, 26 seconds)
2025-09-12 08:29:14,647 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:29:14,649 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:30:43,709 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 161.63040 ± 183.760
2025-09-12 08:30:43,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [220.65504, 512.6321, 3.236994, 467.9108, 31.696182, 44.945755, 95.97074, 36.890507, -30.001562, 232.36748]
2025-09-12 08:30:43,711 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [219.0, 356.0, 11.0, 314.0, 39.0, 34.0, 1000.0, 34.0, 1000.0, 141.0]
2025-09-12 08:30:43,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 65/100 (estimated time remaining: 8 hours, 22 minutes, 17 seconds)
2025-09-12 08:44:02,253 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:44:02,257 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:44:58,069 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 108.32103 ± 135.502
2025-09-12 08:44:58,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [62.606384, -125.09534, 84.32137, 1.9130546, 31.15697, 259.9561, 8.888878, 209.33803, 202.62842, 347.49643]
2025-09-12 08:44:58,070 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [52.0, 1000.0, 68.0, 23.0, 56.0, 219.0, 14.0, 148.0, 207.0, 227.0]
2025-09-12 08:44:58,080 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 66/100 (estimated time remaining: 8 hours, 4 minutes, 33 seconds)
2025-09-12 08:56:52,742 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 08:56:52,746 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 08:58:19,776 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 127.11353 ± 145.187
2025-09-12 08:58:19,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-101.862595, 167.11688, 143.21725, -82.27874, 31.749132, 59.333454, 398.53278, 228.30841, 221.67546, 205.34322]
2025-09-12 08:58:19,777 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 101.0, 113.0, 1000.0, 103.0, 35.0, 323.0, 198.0, 129.0, 111.0]
2025-09-12 08:58:19,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 67/100 (estimated time remaining: 7 hours, 46 minutes, 35 seconds)
2025-09-12 09:11:50,883 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:11:50,888 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:13:21,168 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 201.08675 ± 251.142
2025-09-12 09:13:21,170 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [41.79418, 285.20312, 771.23346, -13.464343, 143.21289, 49.472332, 53.70603, 90.17541, 21.581306, 567.9531]
2025-09-12 09:13:21,170 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [28.0, 147.0, 466.0, 1000.0, 99.0, 46.0, 40.0, 1000.0, 17.0, 370.0]
2025-09-12 09:13:21,180 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 68/100 (estimated time remaining: 7 hours, 38 minutes, 53 seconds)
2025-09-12 09:25:35,155 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:25:35,159 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:26:31,162 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 176.43217 ± 121.656
2025-09-12 09:26:31,163 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [20.556349, 277.0598, 404.18033, 45.090317, 162.89813, 136.82483, 66.02709, 278.90762, 86.8427, 285.93466]
2025-09-12 09:26:31,163 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [27.0, 120.0, 219.0, 44.0, 87.0, 1000.0, 39.0, 159.0, 95.0, 220.0]
2025-09-12 09:26:31,172 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 69/100 (estimated time remaining: 7 hours, 26 minutes, 37 seconds)
2025-09-12 09:39:03,916 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:39:03,921 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:40:46,522 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 50.80351 ± 137.932
2025-09-12 09:40:46,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-119.814606, 268.61215, 119.8657, 17.908028, 232.95322, 23.154501, 106.13005, -127.5817, 120.75159, -133.94385]
2025-09-12 09:40:46,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 142.0, 102.0, 30.0, 154.0, 44.0, 58.0, 1000.0, 137.0, 1000.0]
2025-09-12 09:40:46,536 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 70/100 (estimated time remaining: 7 hours, 14 minutes, 17 seconds)
2025-09-12 09:52:44,181 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 09:52:44,183 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 09:54:01,604 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 267.34433 ± 272.685
2025-09-12 09:54:01,606 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [227.4255, 146.13438, 247.46877, 32.978985, 336.7671, -31.022053, 602.24677, 175.64351, 902.1284, 33.6721]
2025-09-12 09:54:01,606 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [154.0, 95.0, 196.0, 28.0, 152.0, 1000.0, 432.0, 148.0, 566.0, 28.0]
2025-09-12 09:54:01,606 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (267.34) for latency ExtremeClogL1U23
2025-09-12 09:54:01,614 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 71/100 (estimated time remaining: 6 hours, 54 minutes, 21 seconds)
2025-09-12 10:06:42,185 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:06:42,189 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:07:38,297 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 166.19603 ± 121.680
2025-09-12 10:07:38,297 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [108.25512, 78.30656, -5.82088, 86.46906, 341.93887, 170.25764, 397.96106, 210.15302, 217.23177, 57.208138]
2025-09-12 10:07:38,297 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [128.0, 49.0, 29.0, 1000.0, 180.0, 72.0, 261.0, 123.0, 159.0, 48.0]
2025-09-12 10:07:38,305 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 72/100 (estimated time remaining: 6 hours, 41 minutes, 59 seconds)
2025-09-12 10:19:42,673 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:19:42,676 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:21:15,720 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 375.35773 ± 295.559
2025-09-12 10:21:15,721 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [196.63058, 2.4164443, 93.82867, 463.2082, 779.23816, 551.40765, 97.77969, 313.45575, 301.393, 954.21924]
2025-09-12 10:21:15,722 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [169.0, 15.0, 58.0, 354.0, 484.0, 337.0, 1000.0, 221.0, 140.0, 613.0]
2025-09-12 10:21:15,722 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (375.36) for latency ExtremeClogL1U23
2025-09-12 10:21:15,738 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 73/100 (estimated time remaining: 6 hours, 20 minutes, 17 seconds)
2025-09-12 10:33:51,908 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:33:51,911 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:35:06,907 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 306.16589 ± 209.755
2025-09-12 10:35:06,908 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [624.6233, 60.3797, 143.32494, 24.630482, 482.80377, 514.29767, 186.59444, 532.77875, 357.87735, 134.34843]
2025-09-12 10:35:06,908 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [390.0, 94.0, 120.0, 17.0, 230.0, 278.0, 90.0, 330.0, 1000.0, 90.0]
2025-09-12 10:35:06,926 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 74/100 (estimated time remaining: 6 hours, 10 minutes, 25 seconds)
2025-09-12 10:48:27,414 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 10:48:27,417 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 10:49:40,433 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 330.54462 ± 198.168
2025-09-12 10:49:40,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [494.1571, 235.26006, 125.66168, 223.15971, 88.47907, 348.16092, 806.40784, 222.51833, 355.5832, 406.05807]
2025-09-12 10:49:40,435 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [332.0, 176.0, 151.0, 160.0, 124.0, 224.0, 576.0, 243.0, 312.0, 322.0]
2025-09-12 10:49:40,446 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 75/100 (estimated time remaining: 5 hours, 58 minutes, 16 seconds)
2025-09-12 11:01:24,017 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:01:24,019 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:02:41,825 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 141.46976 ± 192.453
2025-09-12 11:02:41,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [503.80432, 447.50085, 216.191, -117.40038, -2.044542, 8.447338, 216.47159, 8.548969, 71.40389, 61.774513]
2025-09-12 11:02:41,826 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 267.0, 126.0, 1000.0, 31.0, 16.0, 162.0, 18.0, 94.0, 46.0]
2025-09-12 11:02:41,838 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 76/100 (estimated time remaining: 5 hours, 43 minutes, 21 seconds)
2025-09-12 11:15:36,106 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:15:36,108 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:16:41,717 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 203.24895 ± 227.842
2025-09-12 11:16:41,718 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [28.209904, 73.89473, 294.47122, 696.6332, 1.3799564, 76.699005, 120.02714, 50.023533, 130.72993, 560.4209]
2025-09-12 11:16:41,718 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [25.0, 74.0, 139.0, 471.0, 13.0, 44.0, 105.0, 1000.0, 94.0, 380.0]
2025-09-12 11:16:41,727 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 77/100 (estimated time remaining: 5 hours, 31 minutes, 28 seconds)
2025-09-12 11:29:05,543 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:29:05,547 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:30:26,101 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 443.70752 ± 363.852
2025-09-12 11:30:26,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [348.7548, 625.0021, 296.89972, 424.82156, 204.27873, 249.90965, 1451.5288, 135.55785, 214.06712, 486.25482]
2025-09-12 11:30:26,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [383.0, 382.0, 246.0, 240.0, 110.0, 145.0, 747.0, 141.0, 180.0, 306.0]
2025-09-12 11:30:26,102 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (443.71) for latency ExtremeClogL1U23
2025-09-12 11:30:26,117 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 78/100 (estimated time remaining: 5 hours, 18 minutes, 11 seconds)
2025-09-12 11:42:59,250 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:42:59,252 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 11:44:42,305 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 459.26089 ± 571.751
2025-09-12 11:44:42,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [55.67052, 669.66003, 48.308918, -55.11209, 210.11728, 1978.61, 396.83218, 771.67664, 452.9815, 63.86398]
2025-09-12 11:44:42,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [47.0, 318.0, 51.0, 1000.0, 106.0, 1000.0, 290.0, 448.0, 337.0, 74.0]
2025-09-12 11:44:42,306 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (459.26) for latency ExtremeClogL1U23
2025-09-12 11:44:42,318 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 79/100 (estimated time remaining: 5 hours, 6 minutes, 11 seconds)
2025-09-12 11:57:57,054 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 11:57:57,057 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:00:46,819 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 382.13354 ± 445.433
2025-09-12 12:00:46,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [592.9966, 1234.8102, 3.2691453, -131.25052, 69.66086, 381.07724, 564.2853, 1037.6173, -5.643956, 74.51347]
2025-09-12 12:00:46,820 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 668.0, 1000.0, 1000.0, 91.0, 185.0, 371.0, 639.0, 1000.0, 48.0]
2025-09-12 12:00:46,831 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 80/100 (estimated time remaining: 4 hours, 58 minutes, 38 seconds)
2025-09-12 12:13:25,355 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:13:25,359 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:13:52,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 160.18159 ± 141.036
2025-09-12 12:13:52,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [60.034466, 16.741737, 184.11084, 131.39005, 161.93997, 282.24097, 507.10922, 63.90806, 12.215457, 182.12509]
2025-09-12 12:13:52,723 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [68.0, 22.0, 106.0, 72.0, 73.0, 212.0, 253.0, 43.0, 13.0, 124.0]
2025-09-12 12:13:52,735 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 81/100 (estimated time remaining: 4 hours, 44 minutes, 43 seconds)
2025-09-12 12:26:05,763 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:26:05,765 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:27:37,217 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 175.12428 ± 250.162
2025-09-12 12:27:37,218 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-141.49054, 798.14417, 173.43983, -132.71555, 359.2549, 152.88495, 159.66452, 119.81439, 107.934425, 154.31169]
2025-09-12 12:27:37,218 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 439.0, 94.0, 1000.0, 189.0, 119.0, 120.0, 85.0, 59.0, 98.0]
2025-09-12 12:27:37,227 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 82/100 (estimated time remaining: 4 hours, 29 minutes, 30 seconds)
2025-09-12 12:40:00,898 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:40:00,900 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:42:12,184 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 251.25468 ± 385.418
2025-09-12 12:42:12,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [171.51064, 315.39587, 20.526928, -204.5908, -52.278584, 256.86984, 186.35632, 252.23486, 258.12988, 1308.3918]
2025-09-12 12:42:12,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [91.0, 221.0, 24.0, 1000.0, 1000.0, 189.0, 162.0, 148.0, 1000.0, 827.0]
2025-09-12 12:42:12,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 83/100 (estimated time remaining: 4 hours, 18 minutes, 21 seconds)
2025-09-12 12:54:43,635 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 12:54:43,637 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 12:55:41,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 173.12317 ± 202.304
2025-09-12 12:55:41,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [119.8211, 654.95496, 51.16504, 8.661955, 37.195835, 318.52383, 211.85777, -89.171455, 109.52059, 308.70203]
2025-09-12 12:55:41,822 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [74.0, 286.0, 167.0, 10.0, 63.0, 143.0, 118.0, 1000.0, 67.0, 154.0]
2025-09-12 12:55:41,844 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 84/100 (estimated time remaining: 4 hours, 1 minute, 22 seconds)
2025-09-12 13:08:49,439 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:08:49,443 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:10:30,756 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 453.78085 ± 437.022
2025-09-12 13:10:30,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [749.84674, 151.92303, 251.02309, 663.48444, 171.35123, 20.21944, 220.84889, 380.14862, 333.22443, 1595.7388]
2025-09-12 13:10:30,757 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [429.0, 139.0, 110.0, 398.0, 122.0, 28.0, 107.0, 273.0, 1000.0, 1000.0]
2025-09-12 13:10:30,769 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 85/100 (estimated time remaining: 3 hours, 43 minutes, 8 seconds)
2025-09-12 13:23:35,352 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:23:35,356 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:25:48,784 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 258.16718 ± 368.089
2025-09-12 13:25:48,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [661.50757, 91.42537, 130.18637, -30.202623, 124.94766, 14.743429, 561.48413, 1079.0199, -200.19579, 148.75589]
2025-09-12 13:25:48,791 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [383.0, 82.0, 61.0, 1000.0, 100.0, 1000.0, 327.0, 640.0, 1000.0, 105.0]
2025-09-12 13:25:48,800 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 86/100 (estimated time remaining: 3 hours, 35 minutes, 48 seconds)
2025-09-12 13:38:02,299 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:38:02,304 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:38:45,970 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 268.38132 ± 260.985
2025-09-12 13:38:45,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [262.51993, 152.78642, 319.9272, 105.10644, 159.06099, 10.547709, 93.05031, 970.59796, 175.79163, 434.42453]
2025-09-12 13:38:45,971 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [160.0, 99.0, 151.0, 83.0, 95.0, 12.0, 69.0, 496.0, 126.0, 273.0]
2025-09-12 13:38:45,981 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 87/100 (estimated time remaining: 3 hours, 19 minutes, 12 seconds)
2025-09-12 13:50:52,808 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 13:50:52,811 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 13:51:41,972 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 268.41479 ± 171.113
2025-09-12 13:51:41,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [9.942248, 282.76678, 295.66483, 327.819, 61.82031, 598.4884, 314.4919, 425.26608, 63.76686, 304.12164]
2025-09-12 13:51:41,973 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [14.0, 147.0, 219.0, 178.0, 83.0, 402.0, 147.0, 259.0, 63.0, 255.0]
2025-09-12 13:51:41,993 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 88/100 (estimated time remaining: 3 hours, 41 seconds)
2025-09-12 14:04:57,238 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:04:57,242 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:06:40,607 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 347.13330 ± 234.268
2025-09-12 14:06:40,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [304.8625, 613.5769, 489.7519, 91.92643, 224.30612, 296.7521, 444.2302, 21.368147, 164.21417, 820.3446]
2025-09-12 14:06:40,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [179.0, 370.0, 258.0, 85.0, 117.0, 1000.0, 1000.0, 88.0, 79.0, 502.0]
2025-09-12 14:06:40,623 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 89/100 (estimated time remaining: 2 hours, 50 minutes, 21 seconds)
2025-09-12 14:18:31,010 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:18:31,012 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:19:36,918 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 201.58842 ± 256.550
2025-09-12 14:19:36,924 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [20.45231, 501.14456, 213.27394, 91.9877, 51.148613, 118.86367, 139.02386, 294.7347, -189.474, 774.72894]
2025-09-12 14:19:36,924 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [29.0, 278.0, 158.0, 60.0, 32.0, 65.0, 105.0, 208.0, 1000.0, 409.0]
2025-09-12 14:19:36,939 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 90/100 (estimated time remaining: 2 hours, 32 minutes, 1 second)
2025-09-12 14:32:12,710 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:32:12,712 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:33:34,608 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 536.09802 ± 400.285
2025-09-12 14:33:34,609 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [1316.0681, 245.53232, 95.57604, 988.13043, 773.0286, 579.3385, 119.51725, 382.22482, 772.6179, 88.94621]
2025-09-12 14:33:34,609 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [604.0, 146.0, 70.0, 588.0, 484.0, 349.0, 62.0, 216.0, 348.0, 68.0]
2025-09-12 14:33:34,609 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1226 [INFO]: New best (536.10) for latency ExtremeClogL1U23
2025-09-12 14:33:34,678 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 91/100 (estimated time remaining: 2 hours, 15 minutes, 31 seconds)
2025-09-12 14:46:30,395 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 14:46:30,399 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 14:49:07,527 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 305.17337 ± 342.963
2025-09-12 14:49:07,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [1070.3998, 102.403824, 58.017456, -87.39646, 347.64587, 514.3107, 221.70222, 600.2504, -133.04921, 357.44928]
2025-09-12 14:49:07,528 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [548.0, 62.0, 83.0, 1000.0, 175.0, 398.0, 1000.0, 1000.0, 1000.0, 246.0]
2025-09-12 14:49:07,540 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 92/100 (estimated time remaining: 2 hours, 6 minutes, 38 seconds)
2025-09-12 15:02:09,174 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:02:09,178 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:03:17,415 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 215.15886 ± 176.707
2025-09-12 15:03:17,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [134.13359, 489.90106, -81.59997, 321.8995, 326.33112, 5.531028, 11.270991, 269.9289, 328.87405, 345.3183]
2025-09-12 15:03:17,416 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [178.0, 264.0, 1000.0, 156.0, 181.0, 10.0, 12.0, 140.0, 173.0, 300.0]
2025-09-12 15:03:17,430 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 93/100 (estimated time remaining: 1 hour, 54 minutes, 32 seconds)
2025-09-12 15:15:30,079 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:15:30,081 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:16:32,301 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 386.32977 ± 176.220
2025-09-12 15:16:32,301 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [351.05927, 400.0028, 361.12036, 537.6653, 747.9986, 421.3752, 296.9678, 475.50623, 213.47005, 58.132458]
2025-09-12 15:16:32,301 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [151.0, 205.0, 258.0, 282.0, 343.0, 221.0, 201.0, 333.0, 163.0, 64.0]
2025-09-12 15:16:32,317 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 94/100 (estimated time remaining: 1 hour, 37 minutes, 48 seconds)
2025-09-12 15:29:02,317 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:29:02,319 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:29:50,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 276.61176 ± 186.754
2025-09-12 15:29:50,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [353.14145, 114.35583, 16.49831, 567.58295, 353.07025, 589.54785, 94.13389, 116.28147, 307.67105, 253.83455]
2025-09-12 15:29:50,181 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [189.0, 80.0, 15.0, 326.0, 226.0, 367.0, 75.0, 58.0, 202.0, 148.0]
2025-09-12 15:29:50,194 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 95/100 (estimated time remaining: 1 hour, 24 minutes, 15 seconds)
2025-09-12 15:42:21,124 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:42:21,125 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:43:34,209 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 291.34918 ± 500.127
2025-09-12 15:43:34,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [62.845917, 1752.1968, 88.59685, 8.017132, 239.80708, 309.66434, 36.173645, 15.176723, 72.61281, 328.4003]
2025-09-12 15:43:34,211 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [39.0, 1000.0, 68.0, 11.0, 169.0, 1000.0, 35.0, 60.0, 43.0, 185.0]
2025-09-12 15:43:34,225 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 96/100 (estimated time remaining: 1 hour, 9 minutes, 59 seconds)
2025-09-12 15:56:40,231 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 15:56:40,235 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 15:57:48,185 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 437.33685 ± 425.879
2025-09-12 15:57:48,186 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [295.2236, 72.076294, 619.38666, 314.5614, -1.5077897, 180.47212, 342.28796, 1377.5453, 130.74844, 1042.5747]
2025-09-12 15:57:48,186 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [143.0, 36.0, 314.0, 166.0, 24.0, 128.0, 321.0, 649.0, 142.0, 513.0]
2025-09-12 15:57:48,197 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 97/100 (estimated time remaining: 54 minutes, 56 seconds)
2025-09-12 16:10:31,105 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:10:31,119 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:12:06,073 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 414.55664 ± 401.084
2025-09-12 16:12:06,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-162.21527, -0.7239195, 67.62224, 998.91626, 55.548656, 818.59235, 437.22955, 855.9962, 786.3762, 288.2243]
2025-09-12 16:12:06,074 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 95.0, 51.0, 466.0, 60.0, 438.0, 260.0, 426.0, 439.0, 140.0]
2025-09-12 16:12:06,083 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 98/100 (estimated time remaining: 41 minutes, 17 seconds)
2025-09-12 16:24:13,243 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:24:13,247 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:26:00,164 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 343.27008 ± 317.107
2025-09-12 16:26:00,165 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [-91.67521, 474.22943, 608.084, 540.8253, 59.441666, 170.77324, 662.84784, 71.25457, 899.9158, 37.004173]
2025-09-12 16:26:00,165 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [1000.0, 238.0, 279.0, 287.0, 1000.0, 117.0, 314.0, 53.0, 405.0, 68.0]
2025-09-12 16:26:00,190 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 99/100 (estimated time remaining: 27 minutes, 47 seconds)
2025-09-12 16:38:58,443 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:38:58,447 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:40:36,569 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 504.03598 ± 556.871
2025-09-12 16:40:36,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [348.72986, 1220.7848, 248.49167, 92.7664, 1881.1378, 396.27258, 80.892624, 106.76667, 220.7946, 443.72244]
2025-09-12 16:40:36,572 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [191.0, 609.0, 137.0, 37.0, 1000.0, 222.0, 82.0, 79.0, 126.0, 1000.0]
2025-09-12 16:40:36,582 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1199 [INFO]: Iteration 100/100 (estimated time remaining: 14 minutes, 9 seconds)
2025-09-12 16:53:21,870 latency_env.training.mbpac:635 [DEBUG]: train() done
2025-09-12 16:53:21,873 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-09-12 16:54:03,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1221 [DEBUG]: Total Reward: 248.20341 ± 208.105
2025-09-12 16:54:03,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1222 [DEBUG]: All rewards: [299.98618, 2.5005577, 93.37785, 562.0676, 90.19476, 193.95126, 539.2231, 1.6115357, 513.2466, 185.87485]
2025-09-12 16:54:03,564 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1223 [DEBUG]: All trajectory lengths: [208.0, 28.0, 92.0, 292.0, 42.0, 97.0, 267.0, 23.0, 340.0, 112.0]
2025-09-12 16:54:03,592 latency_env.delayed_mdp:training_loop(baseline-mbpac-noiseperc20-ant):1251 [DEBUG]: Training session finished
