2025-05-05 18:48:26,664 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1006 [DEBUG]: logdir: _logs/benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32
2025-05-05 18:48:26,664 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1007 [DEBUG]: trainer_prefix: benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32
2025-05-05 18:48:26,664 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1008 [DEBUG]: args.trainer_eval_latencies: {'SparseU15': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x7162c8fc3d00>}
2025-05-05 18:48:26,664 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1009 [DEBUG]: using device: cpu
2025-05-05 18:48:26,671 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1031 [INFO]: Creating new trainer
2025-05-05 18:48:26,677 baseline-sac-noisy-ant:105 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=283, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=8, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(8,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1., -1., -1.]]))
)
2025-05-05 18:48:26,677 baseline-sac-noisy-ant:106 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=291, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-05-05 18:48:27,148 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1092 [DEBUG]: Starting training session...
2025-05-05 18:48:27,148 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 1/100
2025-05-05 18:51:29,368 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 18:51:42,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -916.23566 ± 1078.222
2025-05-05 18:51:42,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-47.67601, 15.916318, -2281.4092, 17.504278, -2156.583, -25.365898, 24.19558, -2211.9468, -2285.4556, -211.53586]
2025-05-05 18:51:42,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [69.0, 37.0, 1000.0, 40.0, 1000.0, 78.0, 39.0, 1000.0, 1000.0, 135.0]
2025-05-05 18:51:42,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1124 [INFO]: New best (-916.24) for latency SparseU15
2025-05-05 18:51:42,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1127 [INFO]: saving network
2025-05-05 18:51:42,763 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-05 18:51:42,770 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 2/100 (estimated time remaining: 5 hours, 22 minutes, 46 seconds)
2025-05-05 18:54:53,913 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 18:54:55,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -20.54900 ± 34.493
2025-05-05 18:54:55,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-1.9082139, -6.4046974, -33.04202, -50.022606, -61.91155, 8.185105, 26.143225, -88.80881, -0.9556293, 3.235217]
2025-05-05 18:54:55,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [69.0, 64.0, 78.0, 73.0, 97.0, 56.0, 40.0, 125.0, 62.0, 81.0]
2025-05-05 18:54:55,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1124 [INFO]: New best (-20.55) for latency SparseU15
2025-05-05 18:54:55,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1127 [INFO]: saving network
2025-05-05 18:54:55,907 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-05 18:54:55,914 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 3/100 (estimated time remaining: 5 hours, 17 minutes, 29 seconds)
2025-05-05 18:58:07,372 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 18:58:09,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: 7.70711 ± 17.369
2025-05-05 18:58:09,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-20.174002, -22.651733, 3.9908516, 4.3282175, 31.449251, 6.080206, 5.7435966, 23.00565, 18.07414, 27.224894]
2025-05-05 18:58:09,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [67.0, 88.0, 68.0, 58.0, 63.0, 69.0, 51.0, 50.0, 47.0, 58.0]
2025-05-05 18:58:09,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1124 [INFO]: New best (7.71) for latency SparseU15
2025-05-05 18:58:09,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1127 [INFO]: saving network
2025-05-05 18:58:09,031 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-05 18:58:09,038 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 4/100 (estimated time remaining: 5 hours, 13 minutes, 34 seconds)
2025-05-05 19:01:20,876 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:01:23,334 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -24.96434 ± 36.835
2025-05-05 19:01:23,334 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-103.6439, 20.31987, -0.62568676, -15.797763, -38.4199, -70.36759, 5.95463, -3.56668, -43.809216, 0.3128713]
2025-05-05 19:01:23,334 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [156.0, 57.0, 49.0, 127.0, 112.0, 107.0, 64.0, 59.0, 112.0, 74.0]
2025-05-05 19:01:23,335 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 5/100 (estimated time remaining: 5 hours, 10 minutes, 28 seconds)
2025-05-05 19:04:37,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:04:43,356 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -112.23003 ± 292.477
2025-05-05 19:04:43,357 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-66.93925, -83.31729, -2.688222, 14.0953045, 24.873003, -75.920815, 33.164333, -980.89404, 7.4414325, 7.885216]
2025-05-05 19:04:43,357 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [124.0, 170.0, 81.0, 89.0, 77.0, 116.0, 36.0, 1000.0, 124.0, 62.0]
2025-05-05 19:04:43,358 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 6/100 (estimated time remaining: 5 hours, 9 minutes, 7 seconds)
2025-05-05 19:07:53,368 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:07:58,332 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -102.96680 ± 277.107
2025-05-05 19:07:58,332 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-19.334625, -66.446915, -100.24951, -925.4017, 25.9415, 20.402042, 22.623802, 6.724235, 25.744326, -19.671173]
2025-05-05 19:07:58,332 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [97.0, 129.0, 159.0, 1000.0, 39.0, 58.0, 54.0, 66.0, 37.0, 79.0]
2025-05-05 19:07:58,334 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 7/100 (estimated time remaining: 5 hours, 5 minutes, 40 seconds)
2025-05-05 19:11:27,533 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:11:29,458 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -12.00849 ± 22.713
2025-05-05 19:11:29,458 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-38.310192, -37.57673, 6.0017233, -3.9670146, -10.296826, 8.649569, 14.422307, -49.23176, -24.998089, 15.222128]
2025-05-05 19:11:29,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [76.0, 78.0, 43.0, 64.0, 81.0, 56.0, 43.0, 103.0, 119.0, 55.0]
2025-05-05 19:11:29,460 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 8/100 (estimated time remaining: 5 hours, 7 minutes, 59 seconds)
2025-05-05 19:14:30,732 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:14:32,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -19.89234 ± 38.912
2025-05-05 19:14:32,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [19.922588, -13.624197, 3.408239, 18.606527, -34.275578, -7.860826, -107.917755, -6.838806, 1.9402839, -72.28387]
2025-05-05 19:14:32,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [38.0, 105.0, 42.0, 42.0, 84.0, 70.0, 115.0, 48.0, 55.0, 124.0]
2025-05-05 19:14:32,667 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 9/100 (estimated time remaining: 5 hours, 1 minute, 38 seconds)
2025-05-05 19:17:49,356 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:17:50,727 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: 7.23769 ± 26.231
2025-05-05 19:17:50,727 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [29.993168, 11.542771, 19.361826, -3.0801024, 25.440424, 30.326668, 24.354698, 11.598787, -57.496906, -19.664482]
2025-05-05 19:17:50,727 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [38.0, 48.0, 41.0, 59.0, 40.0, 50.0, 47.0, 42.0, 84.0, 66.0]
2025-05-05 19:17:50,729 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 10/100 (estimated time remaining: 4 hours, 59 minutes, 30 seconds)
2025-05-05 19:21:11,589 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:21:17,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -144.66693 ± 293.887
2025-05-05 19:21:17,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-50.027634, -142.38466, -6.550292, -1003.5154, -4.578968, 7.8409476, -63.03142, 19.987179, -7.94252, -196.46652]
2025-05-05 19:21:17,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [100.0, 177.0, 65.0, 1000.0, 47.0, 47.0, 151.0, 98.0, 73.0, 181.0]
2025-05-05 19:21:17,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 11/100 (estimated time remaining: 4 hours, 58 minutes, 7 seconds)
2025-05-05 19:24:37,064 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:24:41,501 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -99.07750 ± 317.902
2025-05-05 19:24:41,502 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [24.660019, 13.492167, -41.68226, -1051.0789, 24.652367, 7.6821704, 23.93618, -0.3838017, 14.300352, -6.353384]
2025-05-05 19:24:41,502 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [58.0, 41.0, 70.0, 1000.0, 44.0, 60.0, 49.0, 71.0, 74.0, 71.0]
2025-05-05 19:24:41,504 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 12/100 (estimated time remaining: 4 hours, 57 minutes, 36 seconds)
2025-05-05 19:28:14,716 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:28:16,506 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -5.38692 ± 27.868
2025-05-05 19:28:16,506 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-37.75456, -6.9640417, 38.87796, -15.487709, 32.320713, 1.3277177, -7.6336083, -1.3887836, 3.7483108, -60.91518]
2025-05-05 19:28:16,506 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [85.0, 48.0, 48.0, 122.0, 44.0, 57.0, 71.0, 50.0, 49.0, 98.0]
2025-05-05 19:28:16,508 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 13/100 (estimated time remaining: 4 hours, 55 minutes, 24 seconds)
2025-05-05 19:31:25,915 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:31:31,264 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -136.74448 ± 292.145
2025-05-05 19:31:31,264 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [0.48778445, -17.743267, -23.604263, -368.66016, -4.103919, 4.7624617, -15.93508, 17.498566, -9.885135, -950.26184]
2025-05-05 19:31:31,264 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [134.0, 86.0, 63.0, 299.0, 65.0, 53.0, 64.0, 42.0, 64.0, 1000.0]
2025-05-05 19:31:31,266 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 14/100 (estimated time remaining: 4 hours, 55 minutes, 23 seconds)
2025-05-05 19:34:50,985 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:34:55,914 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -114.55861 ± 265.247
2025-05-05 19:34:55,914 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-891.6771, -53.46057, -9.991306, 11.977452, 19.87009, 28.656328, -67.31063, -175.56137, -2.6051018, -5.483927]
2025-05-05 19:34:55,914 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 109.0, 85.0, 51.0, 47.0, 51.0, 93.0, 160.0, 44.0, 69.0]
2025-05-05 19:34:55,917 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 15/100 (estimated time remaining: 4 hours, 53 minutes, 53 seconds)
2025-05-05 19:38:34,099 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:38:39,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -133.76828 ± 269.287
2025-05-05 19:38:39,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [14.764448, -83.321724, -25.55656, -77.262726, -934.91754, -44.47249, -16.17735, -81.130516, -3.5635784, -86.04463]
2025-05-05 19:38:39,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [73.0, 87.0, 115.0, 118.0, 1000.0, 91.0, 66.0, 186.0, 69.0, 103.0]
2025-05-05 19:38:39,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 16/100 (estimated time remaining: 4 hours, 55 minutes, 21 seconds)
2025-05-05 19:41:51,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:41:53,544 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -11.10203 ± 48.392
2025-05-05 19:41:53,544 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-2.7751586, -46.619022, 35.27082, 8.980142, 2.7158911, -16.552355, 17.638153, 15.583363, -141.57433, 16.31223]
2025-05-05 19:41:53,544 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [61.0, 81.0, 42.0, 47.0, 65.0, 86.0, 42.0, 41.0, 128.0, 41.0]
2025-05-05 19:41:53,547 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 17/100 (estimated time remaining: 4 hours, 48 minutes, 58 seconds)
2025-05-05 19:45:20,508 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:45:22,587 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -21.07401 ± 37.282
2025-05-05 19:45:22,587 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-96.896645, 26.067184, 15.763486, -27.47267, 6.663989, -2.5761983, -46.303043, -48.96655, 13.0440645, -50.063732]
2025-05-05 19:45:22,587 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [121.0, 51.0, 41.0, 102.0, 54.0, 67.0, 132.0, 86.0, 42.0, 81.0]
2025-05-05 19:45:22,590 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 18/100 (estimated time remaining: 4 hours, 43 minutes, 52 seconds)
2025-05-05 19:48:28,132 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:48:30,036 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -17.96449 ± 61.152
2025-05-05 19:48:30,036 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-77.02429, -160.13287, 20.347301, 25.499851, 21.976656, 11.6243, 13.261637, 17.425785, -78.75513, 26.131903]
2025-05-05 19:48:30,036 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [86.0, 181.0, 38.0, 39.0, 37.0, 59.0, 52.0, 52.0, 129.0, 37.0]
2025-05-05 19:48:30,039 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 19/100 (estimated time remaining: 4 hours, 38 minutes, 27 seconds)
2025-05-05 19:51:42,919 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:51:47,798 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -103.93245 ± 275.380
2025-05-05 19:51:47,798 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-39.57728, 12.661501, -26.456789, -34.511227, 9.105999, -926.1363, -50.906776, -31.960056, 15.708873, 32.7475]
2025-05-05 19:51:47,799 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [89.0, 64.0, 80.0, 98.0, 61.0, 1000.0, 118.0, 82.0, 50.0, 55.0]
2025-05-05 19:51:47,802 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 20/100 (estimated time remaining: 4 hours, 33 minutes, 12 seconds)
2025-05-05 19:54:52,907 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:55:03,689 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -316.28058 ± 427.599
2025-05-05 19:55:03,689 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-995.0579, -20.033508, -51.20018, -964.3802, -91.87302, -20.10708, 19.524853, -944.24756, -56.98607, -38.44499]
2025-05-05 19:55:03,689 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 124.0, 117.0, 1000.0, 98.0, 71.0, 60.0, 1000.0, 95.0, 72.0]
2025-05-05 19:55:03,693 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 21/100 (estimated time remaining: 4 hours, 22 minutes, 26 seconds)
2025-05-05 19:58:21,526 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 19:58:26,986 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -134.49762 ± 252.961
2025-05-05 19:58:26,986 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [41.360775, 19.66876, -16.46166, -224.31851, 18.432564, -835.96423, -25.950891, -92.9644, -243.69406, 14.915513]
2025-05-05 19:58:26,986 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [47.0, 54.0, 77.0, 194.0, 44.0, 1000.0, 93.0, 115.0, 217.0, 62.0]
2025-05-05 19:58:26,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 22/100 (estimated time remaining: 4 hours, 21 minutes, 36 seconds)
2025-05-05 20:01:20,846 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:01:22,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -26.83548 ± 55.042
2025-05-05 20:01:22,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-4.961291, 8.961985, -22.61516, -22.773338, -133.81894, 20.047909, -26.596909, 16.416742, -128.22174, 25.205906]
2025-05-05 20:01:22,904 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [68.0, 52.0, 104.0, 69.0, 126.0, 45.0, 68.0, 44.0, 141.0, 50.0]
2025-05-05 20:01:22,907 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 23/100 (estimated time remaining: 4 hours, 9 minutes, 40 seconds)
2025-05-05 20:04:33,049 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:04:35,116 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -6.27860 ± 39.039
2025-05-05 20:04:35,116 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [21.915474, -8.542224, -76.88227, 24.618546, 24.677805, -15.6898365, 28.768541, -81.60776, 13.353421, 6.6023526]
2025-05-05 20:04:35,116 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [39.0, 96.0, 109.0, 42.0, 38.0, 70.0, 91.0, 166.0, 42.0, 77.0]
2025-05-05 20:04:35,120 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 24/100 (estimated time remaining: 4 hours, 7 minutes, 42 seconds)
2025-05-05 20:07:45,773 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:07:48,376 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -33.93338 ± 70.089
2025-05-05 20:07:48,376 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [3.1273842, -4.474977, -51.788605, 22.871517, 22.14101, -117.210205, -46.592075, 16.557741, -201.09276, 17.127188]
2025-05-05 20:07:48,376 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [57.0, 82.0, 114.0, 42.0, 48.0, 153.0, 143.0, 46.0, 242.0, 39.0]
2025-05-05 20:07:48,380 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 25/100 (estimated time remaining: 4 hours, 3 minutes, 20 seconds)
2025-05-05 20:11:08,940 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:11:10,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: 5.33606 ± 20.650
2025-05-05 20:11:10,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [31.656094, 14.039093, 22.442608, 7.8124957, 16.484705, 3.3082356, -19.837875, 2.0783014, 17.698215, -42.3213]
2025-05-05 20:11:10,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [36.0, 61.0, 40.0, 80.0, 91.0, 51.0, 85.0, 92.0, 39.0, 100.0]
2025-05-05 20:11:10,752 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 26/100 (estimated time remaining: 4 hours, 1 minute, 45 seconds)
2025-05-05 20:14:19,736 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:14:25,069 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -126.16071 ± 270.829
2025-05-05 20:14:25,069 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [26.308748, -47.356087, -108.675896, -915.18097, -98.55613, 17.537424, 18.495014, 24.322557, -7.204768, -171.29698]
2025-05-05 20:14:25,069 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [52.0, 108.0, 125.0, 1000.0, 156.0, 65.0, 46.0, 42.0, 81.0, 183.0]
2025-05-05 20:14:25,073 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 27/100 (estimated time remaining: 3 hours, 56 minutes, 19 seconds)
2025-05-05 20:17:41,346 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:17:46,200 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -99.87377 ± 266.268
2025-05-05 20:17:46,200 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-891.0854, -89.949875, 14.721186, 25.970758, -8.420606, 0.1669319, -52.00639, -43.64783, 23.818369, 21.69513]
2025-05-05 20:17:46,200 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 160.0, 68.0, 40.0, 57.0, 92.0, 81.0, 87.0, 44.0, 41.0]
2025-05-05 20:17:46,205 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 28/100 (estimated time remaining: 3 hours, 59 minutes, 16 seconds)
2025-05-05 20:21:00,634 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:21:02,556 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -14.11976 ± 46.147
2025-05-05 20:21:02,556 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [26.04985, 15.608244, 4.0059586, 24.71919, 25.70534, -36.75248, -83.66342, 14.55328, -21.662071, -109.76148]
2025-05-05 20:21:02,556 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [55.0, 43.0, 59.0, 46.0, 65.0, 77.0, 130.0, 42.0, 76.0, 122.0]
2025-05-05 20:21:02,560 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 29/100 (estimated time remaining: 3 hours, 56 minutes, 59 seconds)
2025-05-05 20:23:59,239 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:24:01,179 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -11.13759 ± 29.016
2025-05-05 20:24:01,179 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-1.51303, 20.315756, 14.454723, -8.038138, -24.435194, -28.963913, 27.559383, -73.510155, -37.036728, -0.2086445]
2025-05-05 20:24:01,179 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [65.0, 59.0, 46.0, 60.0, 80.0, 101.0, 39.0, 101.0, 101.0, 68.0]
2025-05-05 20:24:01,183 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 30/100 (estimated time remaining: 3 hours, 50 minutes, 13 seconds)
2025-05-05 20:27:11,653 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:27:14,169 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -43.94655 ± 61.198
2025-05-05 20:27:14,169 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-63.51398, -109.7022, -177.46913, -52.200848, -0.22215195, -28.263681, 19.766851, 17.7492, -67.142494, 21.532913]
2025-05-05 20:27:14,169 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [131.0, 132.0, 184.0, 115.0, 65.0, 97.0, 51.0, 40.0, 80.0, 41.0]
2025-05-05 20:27:14,174 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 31/100 (estimated time remaining: 3 hours, 44 minutes, 47 seconds)
2025-05-05 20:30:41,940 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:30:44,153 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -17.93743 ± 26.590
2025-05-05 20:30:44,153 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [14.8957, -36.09654, -53.64929, -30.329552, -10.020013, 12.862878, -15.160897, 6.0957003, -66.55877, -1.4134599]
2025-05-05 20:30:44,153 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [52.0, 113.0, 91.0, 106.0, 57.0, 69.0, 92.0, 69.0, 119.0, 60.0]
2025-05-05 20:30:44,158 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 32/100 (estimated time remaining: 3 hours, 45 minutes, 11 seconds)
2025-05-05 20:33:38,414 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:33:40,503 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -30.17997 ± 59.832
2025-05-05 20:33:40,503 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-2.9762516, -15.342238, 4.836948, -6.478612, 14.898775, -105.82073, -178.50148, 2.6982477, -31.981655, 16.867247]
2025-05-05 20:33:40,503 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [86.0, 76.0, 45.0, 53.0, 53.0, 118.0, 159.0, 56.0, 83.0, 51.0]
2025-05-05 20:33:40,508 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 33/100 (estimated time remaining: 3 hours, 36 minutes, 18 seconds)
2025-05-05 20:36:55,606 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:36:57,762 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -25.86333 ± 40.796
2025-05-05 20:36:57,762 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-7.944061, 17.429932, -43.128593, -84.40042, 15.976784, -30.024729, 17.7042, -17.511374, -17.684952, -109.05009]
2025-05-05 20:36:57,762 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [66.0, 75.0, 104.0, 108.0, 54.0, 71.0, 49.0, 69.0, 80.0, 127.0]
2025-05-05 20:36:57,767 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 34/100 (estimated time remaining: 3 hours, 33 minutes, 19 seconds)
2025-05-05 20:40:15,137 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:40:20,088 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -78.89061 ± 165.145
2025-05-05 20:40:20,089 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-29.521042, -78.50516, 20.618464, 22.10282, -12.16388, 8.267454, -13.744443, -32.079956, -558.8924, -114.98794]
2025-05-05 20:40:20,089 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [73.0, 118.0, 39.0, 39.0, 89.0, 63.0, 70.0, 78.0, 1000.0, 152.0]
2025-05-05 20:40:20,094 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 35/100 (estimated time remaining: 3 hours, 35 minutes, 21 seconds)
2025-05-05 20:43:18,907 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:43:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -70.80317 ± 129.111
2025-05-05 20:43:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [3.0221508, 28.984581, -138.24455, -14.237772, 21.82088, 17.073563, 1.6844323, -148.23027, -68.53791, -411.36676]
2025-05-05 20:43:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [61.0, 43.0, 157.0, 96.0, 43.0, 53.0, 80.0, 197.0, 142.0, 1000.0]
2025-05-05 20:43:24,284 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 36/100 (estimated time remaining: 3 hours, 30 minutes, 11 seconds)
2025-05-05 20:46:48,753 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:46:59,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -151.44701 ± 200.649
2025-05-05 20:46:59,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [9.062938, -519.2238, -23.835176, -29.266863, -6.653205, 22.151737, -481.43265, -329.75775, -139.86992, -15.645368]
2025-05-05 20:46:59,802 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [54.0, 1000.0, 105.0, 93.0, 96.0, 45.0, 1000.0, 1000.0, 147.0, 92.0]
2025-05-05 20:46:59,807 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 37/100 (estimated time remaining: 3 hours, 28 minutes, 8 seconds)
2025-05-05 20:49:58,738 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:50:00,969 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -16.42890 ± 35.034
2025-05-05 20:50:00,969 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [22.841278, -95.234886, -21.91877, -31.811155, -6.0354524, -18.638813, 29.282469, -0.8616281, -50.694958, 8.7828865]
2025-05-05 20:50:00,969 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [42.0, 153.0, 96.0, 94.0, 102.0, 86.0, 63.0, 56.0, 89.0, 54.0]
2025-05-05 20:50:00,975 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 38/100 (estimated time remaining: 3 hours, 25 minutes, 53 seconds)
2025-05-05 20:53:15,539 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:53:18,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -33.22630 ± 58.768
2025-05-05 20:53:18,281 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [22.505888, -130.46399, -74.97204, -89.32102, 35.382027, -17.194296, 35.66767, 17.137667, -28.381104, -102.623825]
2025-05-05 20:53:18,281 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [39.0, 159.0, 175.0, 188.0, 66.0, 103.0, 47.0, 44.0, 64.0, 135.0]
2025-05-05 20:53:18,286 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 39/100 (estimated time remaining: 3 hours, 22 minutes, 38 seconds)
2025-05-05 20:56:30,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:56:33,224 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -44.52086 ± 75.645
2025-05-05 20:56:33,224 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [14.5923395, 10.002671, -15.364273, -253.11136, -70.313034, -22.245806, -65.41613, -45.932617, -15.995875, 18.575464]
2025-05-05 20:56:33,224 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [45.0, 63.0, 70.0, 192.0, 87.0, 129.0, 83.0, 75.0, 81.0, 52.0]
2025-05-05 20:56:33,230 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 40/100 (estimated time remaining: 3 hours, 17 minutes, 52 seconds)
2025-05-05 20:59:50,510 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 20:59:53,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -48.96840 ± 75.303
2025-05-05 20:59:53,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-71.28735, -196.7417, -20.221943, 28.755568, 23.221565, -177.82309, -18.911058, 9.695732, -53.742332, -12.629333]
2025-05-05 20:59:53,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [121.0, 178.0, 81.0, 46.0, 45.0, 208.0, 80.0, 41.0, 94.0, 74.0]
2025-05-05 20:59:53,127 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 41/100 (estimated time remaining: 3 hours, 17 minutes, 46 seconds)
2025-05-05 21:03:03,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:03:09,091 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -33.92172 ± 64.165
2025-05-05 21:03:09,091 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-46.18668, 2.2745264, -14.69389, -169.8953, 23.441093, -137.7411, 24.695168, -24.602652, -17.185045, 20.676702]
2025-05-05 21:03:09,091 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [146.0, 79.0, 89.0, 1000.0, 101.0, 195.0, 40.0, 95.0, 110.0, 39.0]
2025-05-05 21:03:09,097 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 42/100 (estimated time remaining: 3 hours, 10 minutes, 37 seconds)
2025-05-05 21:06:10,218 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:06:17,992 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -79.39351 ± 194.191
2025-05-05 21:06:17,992 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [12.470413, -22.689175, 6.355782, -628.6963, 19.925478, -206.46944, 13.175372, 20.778605, -20.090828, 11.305022]
2025-05-05 21:06:17,992 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [82.0, 146.0, 51.0, 1000.0, 40.0, 1000.0, 48.0, 41.0, 116.0, 52.0]
2025-05-05 21:06:17,998 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 43/100 (estimated time remaining: 3 hours, 8 minutes, 53 seconds)
2025-05-05 21:09:28,869 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:09:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -88.55608 ± 161.863
2025-05-05 21:09:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-92.65043, -6.941689, -177.0371, -528.28613, -9.479592, 1.134471, 23.850935, -137.05807, -2.7287164, 43.63542]
2025-05-05 21:09:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [134.0, 127.0, 200.0, 1000.0, 100.0, 71.0, 42.0, 177.0, 77.0, 95.0]
2025-05-05 21:09:34,828 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 44/100 (estimated time remaining: 3 hours, 5 minutes, 32 seconds)
2025-05-05 21:12:42,319 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:12:48,126 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -94.90726 ± 217.984
2025-05-05 21:12:48,126 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-5.132798, -3.0390708, -52.698643, -742.28156, -0.9194936, -73.25002, 12.730309, -10.552036, -76.18084, 2.2515447]
2025-05-05 21:12:48,126 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [104.0, 141.0, 106.0, 1000.0, 134.0, 129.0, 42.0, 81.0, 147.0, 96.0]
2025-05-05 21:12:48,132 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 45/100 (estimated time remaining: 3 hours, 1 minute, 58 seconds)
2025-05-05 21:15:59,017 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:16:01,633 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -28.65354 ± 47.572
2025-05-05 21:16:01,633 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-15.209711, -62.156826, -97.15795, -79.29809, -2.3731987, 14.955263, 21.736637, 12.632563, 19.26122, -98.92529]
2025-05-05 21:16:01,633 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [84.0, 163.0, 129.0, 136.0, 112.0, 43.0, 45.0, 47.0, 40.0, 169.0]
2025-05-05 21:16:01,639 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 46/100 (estimated time remaining: 2 hours, 57 minutes, 33 seconds)
2025-05-05 21:19:16,061 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:19:27,304 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -52.81139 ± 65.779
2025-05-05 21:19:27,304 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-30.335735, -223.84169, -60.67398, -96.77744, -11.830662, -48.548664, 25.25098, -24.749176, 1.1656855, -57.77318]
2025-05-05 21:19:27,304 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [127.0, 1000.0, 102.0, 1000.0, 94.0, 113.0, 94.0, 1000.0, 121.0, 88.0]
2025-05-05 21:19:27,311 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 47/100 (estimated time remaining: 2 hours, 56 minutes, 4 seconds)
2025-05-05 21:22:45,072 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:22:47,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -25.00029 ± 46.011
2025-05-05 21:22:47,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [21.295189, -31.914501, 13.265493, 41.996674, 17.708986, -10.165308, -56.65664, -84.63196, -68.72925, -92.17161]
2025-05-05 21:22:47,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [68.0, 105.0, 63.0, 62.0, 77.0, 81.0, 139.0, 145.0, 114.0, 172.0]
2025-05-05 21:22:47,843 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 48/100 (estimated time remaining: 2 hours, 54 minutes, 52 seconds)
2025-05-05 21:26:07,478 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:26:15,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -55.26154 ± 93.790
2025-05-05 21:26:15,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [27.988316, -290.83456, -6.1590505, 2.2957282, -80.89052, 25.224478, -55.964794, -27.015387, -148.4937, 1.2341214]
2025-05-05 21:26:15,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [61.0, 1000.0, 99.0, 66.0, 135.0, 96.0, 174.0, 90.0, 1000.0, 76.0]
2025-05-05 21:26:15,721 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 49/100 (estimated time remaining: 2 hours, 53 minutes, 29 seconds)
2025-05-05 21:29:13,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:29:18,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -10.78862 ± 34.624
2025-05-05 21:29:18,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-64.95223, -10.231947, 16.37086, -34.067574, 24.051966, -73.87764, -11.525114, 32.843513, 12.265557, 1.2364155]
2025-05-05 21:29:18,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [170.0, 78.0, 78.0, 104.0, 37.0, 1000.0, 109.0, 84.0, 78.0, 103.0]
2025-05-05 21:29:18,884 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 50/100 (estimated time remaining: 2 hours, 48 minutes, 25 seconds)
2025-05-05 21:32:27,238 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:32:34,615 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -128.19267 ± 262.206
2025-05-05 21:32:34,615 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-26.3229, 10.179845, -43.740696, -160.64326, 12.657618, -9.986993, 23.869328, -105.14953, -896.6795, -86.11062]
2025-05-05 21:32:34,615 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [162.0, 158.0, 144.0, 255.0, 146.0, 95.0, 222.0, 184.0, 1000.0, 193.0]
2025-05-05 21:32:34,622 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 51/100 (estimated time remaining: 2 hours, 45 minutes, 29 seconds)
2025-05-05 21:35:57,594 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:36:03,337 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -79.62191 ± 95.717
2025-05-05 21:36:03,337 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [2.0218856, 13.905114, -87.07, -234.59204, -87.497986, 15.038, -53.057343, -242.68529, 25.840437, -148.12198]
2025-05-05 21:36:03,337 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [75.0, 79.0, 144.0, 1000.0, 150.0, 72.0, 106.0, 168.0, 37.0, 160.0]
2025-05-05 21:36:03,344 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 52/100 (estimated time remaining: 2 hours, 42 minutes, 41 seconds)
2025-05-05 21:39:00,175 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:39:03,326 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -15.89648 ± 51.547
2025-05-05 21:39:03,326 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [46.256725, 21.934198, 4.3619905, -12.99383, 13.699557, -26.080208, -6.5998435, -153.39555, -42.86281, -3.284999]
2025-05-05 21:39:03,326 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [102.0, 113.0, 142.0, 139.0, 74.0, 133.0, 75.0, 183.0, 127.0, 82.0]
2025-05-05 21:39:03,333 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 53/100 (estimated time remaining: 2 hours, 36 minutes, 4 seconds)
2025-05-05 21:42:23,432 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:42:27,004 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -33.08142 ± 65.961
2025-05-05 21:42:27,005 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [7.1115484, -12.708035, -41.201412, 31.333872, 19.931866, -140.08838, 26.16423, -94.94104, -146.42496, 20.008114]
2025-05-05 21:42:27,005 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [143.0, 118.0, 157.0, 90.0, 89.0, 181.0, 103.0, 147.0, 216.0, 80.0]
2025-05-05 21:42:27,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 54/100 (estimated time remaining: 2 hours, 32 minutes, 10 seconds)
2025-05-05 21:45:36,791 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:45:43,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -68.11633 ± 83.939
2025-05-05 21:45:43,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-23.055214, 0.34373742, -265.576, -119.72157, -5.1070194, -96.98863, 26.359919, -138.73668, -15.879745, -42.802128]
2025-05-05 21:45:43,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [103.0, 124.0, 1000.0, 181.0, 105.0, 166.0, 41.0, 156.0, 134.0, 127.0]
2025-05-05 21:45:43,017 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 55/100 (estimated time remaining: 2 hours, 30 minutes, 54 seconds)
2025-05-05 21:48:52,907 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:48:58,879 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -39.78967 ± 66.894
2025-05-05 21:48:58,879 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [15.077873, 20.91663, -64.10883, 36.015926, -64.56273, -56.13426, -82.38211, -197.19048, 24.647566, -30.176258]
2025-05-05 21:48:58,879 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [107.0, 40.0, 142.0, 100.0, 1000.0, 107.0, 161.0, 193.0, 80.0, 122.0]
2025-05-05 21:48:58,887 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 56/100 (estimated time remaining: 2 hours, 27 minutes, 38 seconds)
2025-05-05 21:52:19,912 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:52:25,987 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -44.38215 ± 69.826
2025-05-05 21:52:25,987 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-233.20847, -78.71699, -65.77304, 1.7819589, -44.56971, 8.497148, -20.710932, -27.017353, -2.0126112, 17.908476]
2025-05-05 21:52:25,987 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 164.0, 132.0, 88.0, 125.0, 112.0, 117.0, 116.0, 107.0, 132.0]
2025-05-05 21:52:25,995 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 57/100 (estimated time remaining: 2 hours, 24 minutes, 7 seconds)
2025-05-05 21:55:56,367 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:56:05,175 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -75.74852 ± 108.539
2025-05-05 21:56:05,175 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-199.48814, -247.07721, 6.630386, 3.5196362, 8.94714, -249.32442, -97.42564, -26.60671, 42.947903, 0.39184746]
2025-05-05 21:56:05,175 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [223.0, 1000.0, 118.0, 94.0, 114.0, 1000.0, 172.0, 81.0, 60.0, 131.0]
2025-05-05 21:56:05,183 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 58/100 (estimated time remaining: 2 hours, 26 minutes, 27 seconds)
2025-05-05 21:59:24,654 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 21:59:28,450 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -23.33760 ± 31.389
2025-05-05 21:59:28,451 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-52.33348, -78.64321, -5.831301, -27.447866, -2.1584094, 18.212727, 6.850866, 1.204066, -25.906677, -67.3227]
2025-05-05 21:59:28,451 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [150.0, 208.0, 103.0, 160.0, 130.0, 116.0, 140.0, 76.0, 144.0, 165.0]
2025-05-05 21:59:28,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 59/100 (estimated time remaining: 2 hours, 23 minutes)
2025-05-05 22:02:52,539 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:03:12,546 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -187.74925 ± 167.392
2025-05-05 22:03:12,547 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-128.24765, -354.08017, -370.33716, -51.495552, -380.53717, -4.3314376, -232.36552, 13.839599, -389.46524, 19.527672]
2025-05-05 22:03:12,547 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [183.0, 1000.0, 1000.0, 144.0, 1000.0, 125.0, 1000.0, 41.0, 1000.0, 1000.0]
2025-05-05 22:03:12,555 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 60/100 (estimated time remaining: 2 hours, 23 minutes, 26 seconds)
2025-05-05 22:06:35,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:06:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -53.14469 ± 48.239
2025-05-05 22:06:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-20.634083, -93.615234, -142.15947, -122.73403, -54.849476, -15.037736, -30.7868, 14.542322, -48.37699, -17.795387]
2025-05-05 22:06:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [103.0, 176.0, 162.0, 153.0, 119.0, 118.0, 145.0, 78.0, 139.0, 162.0]
2025-05-05 22:06:39,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 61/100 (estimated time remaining: 2 hours, 21 minutes, 25 seconds)
2025-05-05 22:10:04,717 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:10:15,401 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -96.75999 ± 153.376
2025-05-05 22:10:15,401 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-25.217888, -308.5031, 19.015076, -451.2807, 28.367842, -164.02318, -6.1646495, -19.986967, -12.711716, -27.094545]
2025-05-05 22:10:15,402 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [88.0, 1000.0, 37.0, 1000.0, 43.0, 1000.0, 80.0, 86.0, 79.0, 119.0]
2025-05-05 22:10:15,410 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 62/100 (estimated time remaining: 2 hours, 19 minutes, 1 second)
2025-05-05 22:13:37,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:13:44,434 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -78.65343 ± 110.395
2025-05-05 22:13:44,434 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-144.98593, -55.532333, -42.812107, -333.09436, 42.512733, 33.445427, -188.35422, -69.118484, -54.64614, 26.051083]
2025-05-05 22:13:44,434 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [139.0, 210.0, 130.0, 1000.0, 108.0, 95.0, 246.0, 114.0, 140.0, 54.0]
2025-05-05 22:13:44,442 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 63/100 (estimated time remaining: 2 hours, 14 minutes, 10 seconds)
2025-05-05 22:17:05,282 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:17:12,527 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -60.80462 ± 158.654
2025-05-05 22:17:12,527 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-12.651895, -526.0609, -20.309246, 28.892408, -40.719913, -24.742083, -68.09004, 4.958947, -8.71882, 59.395306]
2025-05-05 22:17:12,527 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [163.0, 1000.0, 187.0, 137.0, 165.0, 121.0, 229.0, 148.0, 171.0, 204.0]
2025-05-05 22:17:12,535 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 64/100 (estimated time remaining: 2 hours, 11 minutes, 14 seconds)
2025-05-05 22:20:25,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:20:31,085 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -14.31206 ± 56.058
2025-05-05 22:20:31,085 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-124.12676, -44.11665, 26.714722, -68.63325, 41.212883, 25.646935, 36.567284, -59.08986, -28.97335, 51.67739]
2025-05-05 22:20:31,085 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [157.0, 168.0, 47.0, 597.0, 134.0, 37.0, 177.0, 214.0, 151.0, 276.0]
2025-05-05 22:20:31,094 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 65/100 (estimated time remaining: 2 hours, 4 minutes, 37 seconds)
2025-05-05 22:23:40,151 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:23:43,591 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -30.62027 ± 43.373
2025-05-05 22:23:43,591 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-97.876526, 21.667791, 21.407549, -23.925436, -72.71411, -32.656307, -100.140564, 10.091458, -23.452892, -8.603655]
2025-05-05 22:23:43,592 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [163.0, 90.0, 98.0, 155.0, 133.0, 148.0, 151.0, 85.0, 128.0, 127.0]
2025-05-05 22:23:43,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 66/100 (estimated time remaining: 1 hour, 59 minutes, 28 seconds)
2025-05-05 22:27:04,157 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:27:11,138 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -60.24290 ± 88.887
2025-05-05 22:27:11,138 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-8.840508, 7.7605243, -95.47939, 28.013083, -5.019535, 31.472015, -67.32297, -64.98928, -169.91618, -258.10666]
2025-05-05 22:27:11,138 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [169.0, 96.0, 203.0, 117.0, 119.0, 76.0, 279.0, 121.0, 253.0, 1000.0]
2025-05-05 22:27:11,147 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 67/100 (estimated time remaining: 1 hour, 55 minutes, 7 seconds)
2025-05-05 22:30:19,322 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:30:27,100 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -37.24004 ± 102.640
2025-05-05 22:30:27,100 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [78.29152, -53.087555, 64.8945, -107.98774, 47.32273, 26.657236, -227.26115, 51.437416, -75.17726, -177.49016]
2025-05-05 22:30:27,100 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [204.0, 258.0, 60.0, 247.0, 236.0, 107.0, 1000.0, 132.0, 213.0, 277.0]
2025-05-05 22:30:27,109 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 68/100 (estimated time remaining: 1 hour, 50 minutes, 17 seconds)
2025-05-05 22:33:43,201 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:33:52,934 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -96.28162 ± 187.140
2025-05-05 22:33:52,935 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-3.9875085, 21.177237, -73.77749, -84.03879, -520.5599, 41.44522, -2.9311848, -394.65118, 26.116655, 28.390839]
2025-05-05 22:33:52,935 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [210.0, 45.0, 194.0, 209.0, 1000.0, 194.0, 253.0, 1000.0, 155.0, 47.0]
2025-05-05 22:33:52,944 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 69/100 (estimated time remaining: 1 hour, 46 minutes, 42 seconds)
2025-05-05 22:36:52,475 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:37:02,224 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -163.72334 ± 210.778
2025-05-05 22:37:02,225 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-180.27269, -125.23813, -496.425, 18.314543, 33.141037, -171.97823, -46.80227, -619.1076, -27.449652, -21.415413]
2025-05-05 22:37:02,225 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [263.0, 197.0, 1000.0, 119.0, 59.0, 239.0, 132.0, 1000.0, 156.0, 119.0]
2025-05-05 22:37:02,234 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 70/100 (estimated time remaining: 1 hour, 42 minutes, 25 seconds)
2025-05-05 22:40:22,900 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:40:29,047 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -44.54736 ± 127.396
2025-05-05 22:40:29,047 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [25.594595, -33.117786, 8.716941, -412.8518, 7.94278, 21.594578, 11.246498, -37.625557, 41.149994, -78.1238]
2025-05-05 22:40:29,047 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [118.0, 114.0, 110.0, 1000.0, 121.0, 78.0, 119.0, 115.0, 121.0, 208.0]
2025-05-05 22:40:29,057 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 71/100 (estimated time remaining: 1 hour, 40 minutes, 32 seconds)
2025-05-05 22:43:43,013 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:43:52,074 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -61.42552 ± 110.021
2025-05-05 22:43:52,075 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-328.7144, -14.256301, -45.17839, -35.48204, 21.678871, 15.638408, -209.2876, -45.595512, 24.915716, 2.0260456]
2025-05-05 22:43:52,075 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 122.0, 183.0, 158.0, 104.0, 105.0, 1000.0, 158.0, 150.0, 85.0]
2025-05-05 22:43:52,084 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 72/100 (estimated time remaining: 1 hour, 36 minutes, 45 seconds)
2025-05-05 22:46:51,212 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:47:11,418 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -109.93658 ± 129.880
2025-05-05 22:47:11,418 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [11.981133, -113.42312, -220.53883, 26.597145, -5.457281, -381.23495, -235.68954, -41.425373, 15.746536, -155.92155]
2025-05-05 22:47:11,418 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [279.0, 1000.0, 1000.0, 37.0, 103.0, 1000.0, 1000.0, 1000.0, 149.0, 1000.0]
2025-05-05 22:47:11,427 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 73/100 (estimated time remaining: 1 hour, 33 minutes, 44 seconds)
2025-05-05 22:50:23,433 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:50:39,839 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -181.20580 ± 199.058
2025-05-05 22:50:39,839 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-396.44577, 21.132343, -10.443848, -37.506123, -463.73138, -497.05392, -4.008347, -81.93592, -309.09256, -32.97252]
2025-05-05 22:50:39,839 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 386.0, 131.0, 214.0, 1000.0, 1000.0, 204.0, 355.0, 1000.0, 200.0]
2025-05-05 22:50:39,849 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 74/100 (estimated time remaining: 1 hour, 30 minutes, 37 seconds)
2025-05-05 22:53:48,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:53:57,930 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -70.97527 ± 133.460
2025-05-05 22:53:57,931 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [13.367308, 50.562473, -294.8001, 7.759531, -109.53321, -341.4152, 38.801296, 30.187113, -18.201506, -86.48039]
2025-05-05 22:53:57,931 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [150.0, 60.0, 1000.0, 94.0, 259.0, 1000.0, 109.0, 88.0, 210.0, 203.0]
2025-05-05 22:53:57,941 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 75/100 (estimated time remaining: 1 hour, 28 minutes, 1 second)
2025-05-05 22:57:11,336 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 22:57:21,692 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -99.07933 ± 109.355
2025-05-05 22:57:21,692 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-74.868034, -244.46161, -53.460052, -5.805131, 13.123877, -194.56726, 5.3015513, -314.70926, -110.83226, -10.515093]
2025-05-05 22:57:21,692 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [365.0, 1000.0, 174.0, 164.0, 96.0, 259.0, 106.0, 1000.0, 183.0, 126.0]
2025-05-05 22:57:21,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 76/100 (estimated time remaining: 1 hour, 24 minutes, 23 seconds)
2025-05-05 23:00:33,277 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:00:43,825 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -36.70670 ± 151.353
2025-05-05 23:00:43,825 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [38.316372, 20.987535, -31.296204, 43.620506, -475.53723, 89.97889, 1.2819954, -47.298336, 21.588028, -28.708498]
2025-05-05 23:00:43,825 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [229.0, 144.0, 183.0, 92.0, 1000.0, 394.0, 251.0, 198.0, 128.0, 1000.0]
2025-05-05 23:00:43,835 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 77/100 (estimated time remaining: 1 hour, 20 minutes, 56 seconds)
2025-05-05 23:04:04,046 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:04:11,911 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -59.65176 ± 152.003
2025-05-05 23:04:11,911 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-310.362, 35.522095, 16.791235, -77.51356, 32.601204, 37.500698, 21.400549, 23.420832, -398.45947, 22.58075]
2025-05-05 23:04:11,911 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 43.0, 79.0, 210.0, 38.0, 42.0, 48.0, 37.0, 1000.0, 122.0]
2025-05-05 23:04:11,922 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 78/100 (estimated time remaining: 1 hour, 18 minutes, 14 seconds)
2025-05-05 23:07:15,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:07:33,849 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -185.48140 ± 167.738
2025-05-05 23:07:33,849 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-91.36717, -255.56848, -362.0955, -60.16969, 5.1814322, -455.42322, -8.713233, -18.37871, -414.0161, -194.26335]
2025-05-05 23:07:33,849 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [393.0, 1000.0, 1000.0, 271.0, 117.0, 1000.0, 313.0, 180.0, 1000.0, 1000.0]
2025-05-05 23:07:33,859 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 79/100 (estimated time remaining: 1 hour, 14 minutes, 21 seconds)
2025-05-05 23:10:53,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:11:01,166 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -40.46321 ± 78.487
2025-05-05 23:11:01,166 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-108.44728, 28.397476, -17.930595, 12.815765, -19.269981, 14.422658, -238.40723, -83.15731, 20.277382, -13.332965]
2025-05-05 23:11:01,166 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [211.0, 96.0, 181.0, 132.0, 149.0, 199.0, 1000.0, 185.0, 129.0, 210.0]
2025-05-05 23:11:01,176 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 80/100 (estimated time remaining: 1 hour, 11 minutes, 37 seconds)
2025-05-05 23:14:07,375 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:14:20,233 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -35.50403 ± 54.700
2025-05-05 23:14:20,233 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-2.676603, -114.04212, -48.33735, -27.887598, -3.1750343, -132.90677, -40.829308, -50.591934, -1.6426008, 67.04905]
2025-05-05 23:14:20,233 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [169.0, 1000.0, 290.0, 142.0, 205.0, 224.0, 1000.0, 1000.0, 161.0, 101.0]
2025-05-05 23:14:20,244 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 81/100 (estimated time remaining: 1 hour, 7 minutes, 54 seconds)
2025-05-05 23:17:27,370 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:17:33,821 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -40.02130 ± 124.040
2025-05-05 23:17:33,821 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-102.777885, -69.36322, 17.973242, -383.83395, 30.456211, 1.307664, 32.62532, 64.6916, -13.43916, 22.14715]
2025-05-05 23:17:33,821 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [176.0, 214.0, 206.0, 1000.0, 56.0, 130.0, 38.0, 88.0, 188.0, 111.0]
2025-05-05 23:17:33,832 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 82/100 (estimated time remaining: 1 hour, 3 minutes, 57 seconds)
2025-05-05 23:20:58,846 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:21:06,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -84.12416 ± 216.590
2025-05-05 23:21:06,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [56.34669, -11.55974, -20.753174, 7.5138, -172.98618, -707.59265, 36.122063, 21.392159, 10.301507, -60.026047]
2025-05-05 23:21:06,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [127.0, 178.0, 223.0, 169.0, 363.0, 1000.0, 127.0, 169.0, 165.0, 181.0]
2025-05-05 23:21:06,672 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 83/100 (estimated time remaining: 1 hour, 53 seconds)
2025-05-05 23:24:05,246 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:24:09,710 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -34.30075 ± 55.753
2025-05-05 23:24:09,710 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-43.810417, -16.819149, 18.538486, -57.14388, -4.0696287, -142.60829, -127.5242, 7.174082, 18.715025, 4.5404305]
2025-05-05 23:24:09,710 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [201.0, 139.0, 79.0, 261.0, 87.0, 255.0, 263.0, 161.0, 41.0, 145.0]
2025-05-05 23:24:09,721 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 84/100 (estimated time remaining: 56 minutes, 25 seconds)
2025-05-05 23:27:18,826 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:27:31,353 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -257.53638 ± 381.046
2025-05-05 23:27:31,354 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-13.042179, -723.9644, -121.010704, 17.792124, 56.1279, -29.174961, -15.779166, -1.1981962, -1029.0652, -716.0491]
2025-05-05 23:27:31,354 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [183.0, 1000.0, 248.0, 85.0, 154.0, 201.0, 203.0, 80.0, 1000.0, 1000.0]
2025-05-05 23:27:31,365 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 85/100 (estimated time remaining: 52 minutes, 48 seconds)
2025-05-05 23:30:45,878 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:30:53,255 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -99.17181 ± 225.234
2025-05-05 23:30:53,256 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-73.21655, -35.958313, -57.738293, -766.8858, -20.7548, 35.49906, -65.85729, 23.233175, -31.36994, 1.3306559]
2025-05-05 23:30:53,256 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [227.0, 175.0, 210.0, 1000.0, 192.0, 139.0, 208.0, 94.0, 173.0, 151.0]
2025-05-05 23:30:53,267 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 86/100 (estimated time remaining: 49 minutes, 39 seconds)
2025-05-05 23:34:08,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:34:11,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: 18.92511 ± 23.638
2025-05-05 23:34:11,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [5.972402, 20.502111, 26.4005, 62.544666, 14.525703, 8.644704, -28.110744, 39.42235, 0.96561235, 38.38383]
2025-05-05 23:34:11,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [157.0, 140.0, 105.0, 176.0, 148.0, 95.0, 118.0, 114.0, 108.0, 209.0]
2025-05-05 23:34:11,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1124 [INFO]: New best (18.93) for latency SparseU15
2025-05-05 23:34:11,845 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1127 [INFO]: saving network
2025-05-05 23:34:11,848 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-ant/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-05 23:34:11,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 87/100 (estimated time remaining: 46 minutes, 34 seconds)
2025-05-05 23:37:16,094 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:37:20,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -10.21937 ± 51.100
2025-05-05 23:37:20,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [16.035906, -128.63878, 14.865115, 49.161034, -30.981468, -51.694008, 36.021225, 14.296776, -44.93656, 23.67704]
2025-05-05 23:37:20,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [188.0, 212.0, 116.0, 117.0, 130.0, 215.0, 135.0, 71.0, 143.0, 170.0]
2025-05-05 23:37:20,157 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 88/100 (estimated time remaining: 42 minutes, 11 seconds)
2025-05-05 23:40:26,949 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:40:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -61.29066 ± 145.862
2025-05-05 23:40:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [44.84289, -133.21118, -8.084203, 19.91685, 28.463856, 29.410294, -456.2641, 40.846714, -125.43976, -53.387882]
2025-05-05 23:40:34,822 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [131.0, 197.0, 116.0, 127.0, 101.0, 109.0, 522.0, 195.0, 217.0, 1000.0]
2025-05-05 23:40:34,834 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 89/100 (estimated time remaining: 39 minutes, 24 seconds)
2025-05-05 23:44:02,599 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:44:14,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -161.89507 ± 244.758
2025-05-05 23:44:14,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-575.70996, 4.142659, -191.54495, -120.72956, -687.47473, 21.973812, -19.756788, -68.800964, 5.2758346, 13.67383]
2025-05-05 23:44:14,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 269.0, 364.0, 351.0, 1000.0, 39.0, 240.0, 385.0, 202.0, 228.0]
2025-05-05 23:44:14,623 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 90/100 (estimated time remaining: 36 minutes, 47 seconds)
2025-05-05 23:47:06,586 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:47:16,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -118.48338 ± 290.215
2025-05-05 23:47:16,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [22.687712, -296.39642, -935.8046, 39.782448, 26.546793, 37.47197, 30.473614, 2.4317775, -117.78486, 5.7577286]
2025-05-05 23:47:16,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [94.0, 1000.0, 1000.0, 197.0, 138.0, 212.0, 51.0, 120.0, 342.0, 183.0]
2025-05-05 23:47:16,562 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 91/100 (estimated time remaining: 32 minutes, 46 seconds)
2025-05-05 23:50:30,293 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:50:37,513 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -4.27340 ± 60.367
2025-05-05 23:50:37,514 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-98.44729, 24.398254, -6.186311, 47.843987, 60.501057, -107.568214, 23.66866, -68.97864, 47.945087, 34.089436]
2025-05-05 23:50:37,514 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [319.0, 280.0, 289.0, 246.0, 141.0, 364.0, 173.0, 209.0, 192.0, 417.0]
2025-05-05 23:50:37,526 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 92/100 (estimated time remaining: 29 minutes, 34 seconds)
2025-05-05 23:53:43,357 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:53:59,419 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -226.03647 ± 351.599
2025-05-05 23:53:59,419 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-995.84265, -504.71988, -23.636787, 28.379478, 56.256184, -66.68518, -685.3587, 42.531994, 28.043371, -139.3325]
2025-05-05 23:53:59,419 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 1000.0, 218.0, 38.0, 370.0, 434.0, 1000.0, 145.0, 46.0, 1000.0]
2025-05-05 23:53:59,432 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 93/100 (estimated time remaining: 26 minutes, 38 seconds)
2025-05-05 23:57:12,352 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-05 23:57:23,488 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -140.11089 ± 306.220
2025-05-05 23:57:23,489 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-86.83092, -149.16846, -0.28026497, 28.214428, 46.62567, -1034.3688, -147.26144, 24.266388, 12.116966, -94.4225]
2025-05-05 23:57:23,489 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [227.0, 484.0, 90.0, 39.0, 180.0, 1000.0, 1000.0, 123.0, 152.0, 457.0]
2025-05-05 23:57:23,499 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 94/100 (estimated time remaining: 23 minutes, 32 seconds)
2025-05-06 00:00:31,466 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:00:48,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -290.97345 ± 490.551
2025-05-06 00:00:48,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-85.86994, 22.562342, -830.60706, 12.341297, -9.74905, -181.43425, -1578.6218, -23.644056, -86.15476, -148.55737]
2025-05-06 00:00:48,877 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [288.0, 227.0, 1000.0, 592.0, 187.0, 1000.0, 1000.0, 191.0, 283.0, 1000.0]
2025-05-06 00:00:48,889 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 95/100 (estimated time remaining: 19 minutes, 53 seconds)
2025-05-06 00:03:58,965 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:04:19,361 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -98.90659 ± 178.359
2025-05-06 00:04:19,362 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-4.7115426, -95.51302, -76.80611, -583.12335, 63.866295, -39.472267, 47.728703, 18.250034, -130.95023, -188.33446]
2025-05-06 00:04:19,362 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [432.0, 1000.0, 935.0, 1000.0, 340.0, 804.0, 201.0, 37.0, 1000.0, 1000.0]
2025-05-06 00:04:19,374 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 96/100 (estimated time remaining: 17 minutes, 2 seconds)
2025-05-06 00:07:41,584 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:07:52,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -258.43256 ± 414.623
2025-05-06 00:07:52,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-1225.9163, -267.6171, 16.591898, 15.390534, 41.853954, -30.654436, -67.05749, -888.15985, -88.65131, -90.105286]
2025-05-06 00:07:52,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 348.0, 39.0, 101.0, 112.0, 334.0, 221.0, 1000.0, 158.0, 434.0]
2025-05-06 00:07:52,612 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 97/100 (estimated time remaining: 13 minutes, 48 seconds)
2025-05-06 00:10:52,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:11:05,589 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -253.34651 ± 410.649
2025-05-06 00:11:05,589 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-80.55468, 23.19043, -1122.5007, -122.11555, 28.065826, -280.87665, 34.121243, -981.09796, -49.961597, 18.264654]
2025-05-06 00:11:05,589 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [336.0, 102.0, 1000.0, 199.0, 239.0, 1000.0, 39.0, 1000.0, 379.0, 205.0]
2025-05-06 00:11:05,602 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 98/100 (estimated time remaining: 10 minutes, 15 seconds)
2025-05-06 00:14:14,998 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:14:36,181 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -347.22028 ± 364.393
2025-05-06 00:14:36,181 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-28.219738, -50.92588, -158.95842, -997.6694, -957.8255, -203.05962, -458.2518, -66.76744, -579.2261, 28.700815]
2025-05-06 00:14:36,181 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [904.0, 315.0, 327.0, 1000.0, 1000.0, 1000.0, 1000.0, 351.0, 1000.0, 39.0]
2025-05-06 00:14:36,194 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 99/100 (estimated time remaining: 6 minutes, 53 seconds)
2025-05-06 00:17:48,841 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:18:02,402 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -198.58005 ± 369.816
2025-05-06 00:18:02,403 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-203.55537, -58.405304, -1100.7623, 66.95871, -695.61487, -107.72038, 67.41145, 2.5353856, 22.311054, 21.04111]
2025-05-06 00:18:02,403 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [1000.0, 242.0, 1000.0, 160.0, 1000.0, 425.0, 254.0, 175.0, 263.0, 38.0]
2025-05-06 00:18:02,415 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1097 [INFO]: Iteration 100/100 (estimated time remaining: 3 minutes, 26 seconds)
2025-05-06 00:21:13,852 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:21:27,934 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1119 [DEBUG]: Total Reward: -77.54247 ± 73.855
2025-05-06 00:21:27,935 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1120 [DEBUG]: All rewards: [-38.773453, -49.812775, -220.78882, -49.602654, -117.10466, -69.20118, -158.1251, 19.654917, 29.61955, -121.29039]
2025-05-06 00:21:27,935 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1121 [DEBUG]: All trajectory lengths: [530.0, 174.0, 1000.0, 266.0, 443.0, 1000.0, 895.0, 88.0, 64.0, 324.0]
2025-05-06 00:21:27,948 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-ant):1149 [DEBUG]: Training session finished
