2025-05-06 17:59:08,784 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1006 [DEBUG]: logdir: _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32
2025-05-06 17:59:08,784 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1007 [DEBUG]: trainer_prefix: benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32
2025-05-06 17:59:08,784 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1008 [DEBUG]: args.trainer_eval_latencies: {'SparseU15': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x7a93da3c4d00>}
2025-05-06 17:59:08,784 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1009 [DEBUG]: using device: cpu
2025-05-06 17:59:08,791 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1031 [INFO]: Creating new trainer
2025-05-06 17:59:08,798 baseline-sac-noisy-walker2d:105 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=209, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=6, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(6,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2., 2., 2., 2.]]), shift: tensor([[-1., -1., -1., -1., -1., -1.]]))
)
2025-05-06 17:59:08,798 baseline-sac-noisy-walker2d:106 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=215, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-05-06 17:59:09,045 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1092 [DEBUG]: Starting training session...
2025-05-06 17:59:09,045 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 1/100
2025-05-06 18:01:55,796 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:01:56,911 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 29.54799 ± 9.984
2025-05-06 18:01:56,912 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [34.67847, 43.11751, 20.370731, 50.426495, 29.471903, 28.002321, 18.372427, 28.384007, 22.431656, 20.224407]
2025-05-06 18:01:56,912 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [56.0, 67.0, 32.0, 70.0, 59.0, 54.0, 30.0, 48.0, 31.0, 32.0]
2025-05-06 18:01:56,912 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (29.55) for latency SparseU15
2025-05-06 18:01:56,912 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:01:56,916 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:01:56,922 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 2/100 (estimated time remaining: 4 hours, 36 minutes, 59 seconds)
2025-05-06 18:04:52,131 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:04:53,100 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 28.06901 ± 6.212
2025-05-06 18:04:53,101 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [30.725798, 33.54988, 33.723923, 20.067856, 25.322424, 30.701391, 28.815891, 17.60561, 37.83913, 22.338186]
2025-05-06 18:04:53,101 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [46.0, 47.0, 48.0, 30.0, 39.0, 45.0, 50.0, 29.0, 48.0, 33.0]
2025-05-06 18:04:53,102 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 3/100 (estimated time remaining: 4 hours, 40 minutes, 58 seconds)
2025-05-06 18:07:52,532 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:07:54,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 78.25462 ± 64.447
2025-05-06 18:07:54,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [75.23557, 15.15312, 59.065266, 101.38608, 55.64931, 251.9511, 41.40217, 80.34191, 11.620798, 90.74082]
2025-05-06 18:07:54,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [87.0, 32.0, 90.0, 125.0, 62.0, 174.0, 57.0, 154.0, 22.0, 91.0]
2025-05-06 18:07:54,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (78.25) for latency SparseU15
2025-05-06 18:07:54,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:07:54,632 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:07:54,638 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 4/100 (estimated time remaining: 4 hours, 43 minutes, 14 seconds)
2025-05-06 18:10:48,452 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:10:49,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 32.06873 ± 7.567
2025-05-06 18:10:49,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [32.793076, 25.375769, 24.54085, 16.211798, 37.91489, 41.400967, 35.020668, 40.80041, 31.10435, 35.524467]
2025-05-06 18:10:49,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [42.0, 36.0, 36.0, 31.0, 45.0, 46.0, 41.0, 44.0, 40.0, 42.0]
2025-05-06 18:10:49,399 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 5/100 (estimated time remaining: 4 hours, 40 minutes, 8 seconds)
2025-05-06 18:13:45,595 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:13:48,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 85.33461 ± 93.509
2025-05-06 18:13:48,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [47.22067, 49.94626, 36.22739, 43.15573, 27.894892, 48.685337, 273.0905, 269.9945, 21.362034, 35.7687]
2025-05-06 18:13:48,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [101.0, 119.0, 99.0, 94.0, 75.0, 91.0, 211.0, 181.0, 137.0, 93.0]
2025-05-06 18:13:48,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (85.33) for latency SparseU15
2025-05-06 18:13:48,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:13:48,432 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:13:48,439 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 6/100 (estimated time remaining: 4 hours, 38 minutes, 28 seconds)
2025-05-06 18:16:44,871 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:16:47,036 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 93.56932 ± 85.417
2025-05-06 18:16:47,037 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [86.670815, 20.307121, 197.07034, 16.263142, 102.61257, 291.46542, 110.03868, 21.321075, 71.40729, 18.536818]
2025-05-06 18:16:47,037 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [145.0, 32.0, 127.0, 28.0, 133.0, 177.0, 102.0, 34.0, 106.0, 35.0]
2025-05-06 18:16:47,037 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (93.57) for latency SparseU15
2025-05-06 18:16:47,037 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:16:47,041 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:16:47,048 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 7/100 (estimated time remaining: 4 hours, 38 minutes, 54 seconds)
2025-05-06 18:19:44,228 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:19:47,293 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 153.57352 ± 135.369
2025-05-06 18:19:47,293 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [280.7898, 17.891447, 49.366795, 61.570942, 23.47802, 350.91583, 392.96304, 201.26807, 115.36654, 42.124756]
2025-05-06 18:19:47,293 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [196.0, 32.0, 63.0, 143.0, 32.0, 194.0, 257.0, 126.0, 116.0, 136.0]
2025-05-06 18:19:47,294 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (153.57) for latency SparseU15
2025-05-06 18:19:47,294 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:19:47,297 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:19:47,305 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 8/100 (estimated time remaining: 4 hours, 37 minutes, 12 seconds)
2025-05-06 18:22:45,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:22:48,606 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 153.90198 ± 102.273
2025-05-06 18:22:48,606 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [263.62994, 18.707027, 233.72243, 16.452446, 260.2472, 106.29628, 212.64702, 151.14673, 263.5479, 12.622791]
2025-05-06 18:22:48,606 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [160.0, 32.0, 123.0, 28.0, 171.0, 200.0, 125.0, 95.0, 150.0, 24.0]
2025-05-06 18:22:48,607 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (153.90) for latency SparseU15
2025-05-06 18:22:48,607 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:22:48,610 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:22:48,618 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 9/100 (estimated time remaining: 4 hours, 34 minutes, 9 seconds)
2025-05-06 18:25:45,715 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:25:48,552 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 50.23323 ± 16.263
2025-05-06 18:25:48,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [80.41243, 52.13, 64.72657, 56.632042, 29.98883, 39.631588, 29.709862, 60.796574, 57.392754, 30.911642]
2025-05-06 18:25:48,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [194.0, 106.0, 67.0, 156.0, 192.0, 105.0, 129.0, 142.0, 68.0, 42.0]
2025-05-06 18:25:48,555 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 10/100 (estimated time remaining: 4 hours, 32 minutes, 44 seconds)
2025-05-06 18:28:45,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:28:47,146 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 55.65294 ± 22.700
2025-05-06 18:28:47,146 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [46.86518, 53.549797, 78.92481, 55.160507, 59.29964, 107.80437, 49.17034, 18.680038, 36.689217, 50.385487]
2025-05-06 18:28:47,147 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [58.0, 60.0, 91.0, 63.0, 62.0, 86.0, 54.0, 29.0, 50.0, 56.0]
2025-05-06 18:28:47,149 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 11/100 (estimated time remaining: 4 hours, 29 minutes, 36 seconds)
2025-05-06 18:31:44,633 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:31:48,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 198.78592 ± 121.241
2025-05-06 18:31:48,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [107.97389, 384.30844, 315.08612, 66.42059, 373.19284, 221.01411, 236.78918, 91.565926, 31.723679, 159.78455]
2025-05-06 18:31:48,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [198.0, 223.0, 199.0, 102.0, 343.0, 141.0, 137.0, 120.0, 40.0, 199.0]
2025-05-06 18:31:48,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (198.79) for latency SparseU15
2025-05-06 18:31:48,702 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 18:31:48,707 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 18:31:48,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 12/100 (estimated time remaining: 4 hours, 27 minutes, 29 seconds)
2025-05-06 18:34:46,891 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:34:49,654 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 148.00679 ± 91.799
2025-05-06 18:34:49,654 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [19.725674, 194.62471, 222.15533, 18.782911, 267.98737, 158.44162, 209.50702, 135.87105, 236.22464, 16.747574]
2025-05-06 18:34:49,654 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [34.0, 118.0, 140.0, 33.0, 146.0, 273.0, 131.0, 103.0, 163.0, 28.0]
2025-05-06 18:34:49,656 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 13/100 (estimated time remaining: 4 hours, 24 minutes, 41 seconds)
2025-05-06 18:37:47,015 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:37:49,853 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 171.26709 ± 82.340
2025-05-06 18:37:49,853 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [238.76851, 22.77704, 140.74649, 95.116585, 214.37407, 237.62463, 266.29895, 241.12872, 55.352867, 200.4831]
2025-05-06 18:37:49,853 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [137.0, 34.0, 96.0, 136.0, 170.0, 133.0, 163.0, 146.0, 64.0, 126.0]
2025-05-06 18:37:49,856 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 14/100 (estimated time remaining: 4 hours, 21 minutes, 21 seconds)
2025-05-06 18:40:48,908 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:40:51,210 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 156.23465 ± 99.031
2025-05-06 18:40:51,211 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [310.90005, 183.01021, 163.09566, 16.630096, 174.55525, 21.057436, 239.92532, 181.44742, 251.65851, 20.066576]
2025-05-06 18:40:51,211 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [171.0, 100.0, 111.0, 27.0, 115.0, 34.0, 152.0, 103.0, 136.0, 31.0]
2025-05-06 18:40:51,213 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 15/100 (estimated time remaining: 4 hours, 18 minutes, 45 seconds)
2025-05-06 18:43:49,882 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:43:52,964 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 193.52652 ± 87.747
2025-05-06 18:43:52,964 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [266.57724, 25.800528, 250.28503, 80.02757, 190.86913, 147.61977, 151.99213, 310.9761, 287.91602, 223.20166]
2025-05-06 18:43:52,964 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [151.0, 36.0, 127.0, 142.0, 118.0, 94.0, 185.0, 171.0, 159.0, 122.0]
2025-05-06 18:43:52,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 16/100 (estimated time remaining: 4 hours, 16 minutes, 38 seconds)
2025-05-06 18:46:50,983 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:46:54,251 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 133.16864 ± 129.686
2025-05-06 18:46:54,252 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [15.953216, 45.555458, 91.20044, 449.0322, 99.447365, 68.17437, 262.06082, 209.19803, 12.952158, 78.11229]
2025-05-06 18:46:54,252 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [28.0, 114.0, 74.0, 316.0, 155.0, 66.0, 335.0, 117.0, 25.0, 139.0]
2025-05-06 18:46:54,254 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 17/100 (estimated time remaining: 4 hours, 13 minutes, 33 seconds)
2025-05-06 18:49:50,781 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:49:53,585 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 149.66403 ± 88.631
2025-05-06 18:49:53,585 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [135.81522, 121.87161, 76.649796, 272.19925, 114.53321, 18.587986, 43.28043, 213.70377, 209.52753, 290.47144]
2025-05-06 18:49:53,585 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [112.0, 151.0, 66.0, 176.0, 130.0, 30.0, 62.0, 166.0, 117.0, 179.0]
2025-05-06 18:49:53,588 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 18/100 (estimated time remaining: 4 hours, 10 minutes, 5 seconds)
2025-05-06 18:52:52,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:52:54,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 120.88413 ± 89.723
2025-05-06 18:52:54,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [17.471012, 41.522644, 196.45903, 188.4322, 211.96516, 21.259775, 60.72772, 76.319336, 103.217674, 291.4668]
2025-05-06 18:52:54,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [31.0, 60.0, 105.0, 116.0, 120.0, 32.0, 164.0, 148.0, 96.0, 183.0]
2025-05-06 18:52:54,612 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 19/100 (estimated time remaining: 4 hours, 7 minutes, 18 seconds)
2025-05-06 18:55:56,011 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:55:58,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 158.54297 ± 114.758
2025-05-06 18:55:58,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [20.55286, 389.35397, 145.39401, 168.31982, 16.039589, 217.05786, 245.58206, 16.949886, 125.85817, 240.32153]
2025-05-06 18:55:58,428 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 239.0, 97.0, 97.0, 28.0, 117.0, 147.0, 30.0, 96.0, 141.0]
2025-05-06 18:55:58,432 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 20/100 (estimated time remaining: 4 hours, 4 minutes, 56 seconds)
2025-05-06 18:58:54,718 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 18:58:58,134 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 188.05266 ± 78.982
2025-05-06 18:58:58,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [198.42162, 235.5496, 261.94543, 245.1006, 195.24286, 191.08844, 266.27057, 60.477776, 19.278929, 207.1508]
2025-05-06 18:58:58,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [113.0, 136.0, 159.0, 148.0, 107.0, 322.0, 159.0, 140.0, 29.0, 123.0]
2025-05-06 18:58:58,138 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 21/100 (estimated time remaining: 4 hours, 1 minute, 22 seconds)
2025-05-06 19:01:59,842 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:02:02,148 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 150.15646 ± 77.679
2025-05-06 19:02:02,148 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [122.3527, 197.64795, 179.5682, 208.67645, 251.86705, 14.709399, 17.63006, 207.28207, 109.06072, 192.7702]
2025-05-06 19:02:02,148 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [82.0, 132.0, 106.0, 119.0, 136.0, 28.0, 32.0, 116.0, 106.0, 122.0]
2025-05-06 19:02:02,152 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 22/100 (estimated time remaining: 3 hours, 59 minutes, 4 seconds)
2025-05-06 19:04:59,595 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:05:03,365 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 178.44542 ± 94.871
2025-05-06 19:05:03,365 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [21.596956, 126.051186, 296.2055, 228.3736, 294.5417, 81.94199, 52.54118, 223.27032, 242.93703, 216.9947]
2025-05-06 19:05:03,365 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 199.0, 189.0, 174.0, 152.0, 166.0, 115.0, 124.0, 299.0, 133.0]
2025-05-06 19:05:03,369 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 23/100 (estimated time remaining: 3 hours, 56 minutes, 32 seconds)
2025-05-06 19:07:59,270 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:08:02,739 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 191.25333 ± 133.844
2025-05-06 19:08:02,739 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [386.97812, 59.066753, 188.22076, 15.677624, 262.55875, 97.37199, 255.616, 234.85944, 396.01727, 16.166464]
2025-05-06 19:08:02,739 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [228.0, 140.0, 121.0, 31.0, 147.0, 227.0, 145.0, 141.0, 255.0, 26.0]
2025-05-06 19:08:02,743 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 24/100 (estimated time remaining: 3 hours, 53 minutes, 5 seconds)
2025-05-06 19:11:03,190 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:11:05,858 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 155.88553 ± 68.709
2025-05-06 19:11:05,858 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [198.63623, 218.71623, 207.13066, 65.53399, 212.84534, 105.229515, 132.91788, 262.16122, 57.46034, 98.22402]
2025-05-06 19:11:05,858 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [119.0, 119.0, 113.0, 146.0, 116.0, 137.0, 89.0, 142.0, 63.0, 91.0]
2025-05-06 19:11:05,862 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 25/100 (estimated time remaining: 3 hours, 49 minutes, 52 seconds)
2025-05-06 19:14:01,831 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:14:04,668 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 176.81094 ± 101.530
2025-05-06 19:14:04,669 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [18.07435, 275.98685, 321.54373, 121.92103, 94.26042, 227.55836, 190.57524, 215.06288, 277.60455, 25.522131]
2025-05-06 19:14:04,669 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [28.0, 156.0, 218.0, 142.0, 92.0, 150.0, 100.0, 124.0, 148.0, 35.0]
2025-05-06 19:14:04,673 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 26/100 (estimated time remaining: 3 hours, 46 minutes, 38 seconds)
2025-05-06 19:17:01,564 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:17:03,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 119.43718 ± 101.897
2025-05-06 19:17:03,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [12.773997, 279.6969, 97.028564, 14.507984, 40.68008, 42.952248, 127.09415, 275.5973, 60.816948, 243.2237]
2025-05-06 19:17:03,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [30.0, 182.0, 87.0, 25.0, 50.0, 53.0, 116.0, 135.0, 67.0, 138.0]
2025-05-06 19:17:03,684 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 27/100 (estimated time remaining: 3 hours, 42 minutes, 22 seconds)
2025-05-06 19:20:04,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:20:07,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 257.17953 ± 73.416
2025-05-06 19:20:07,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [305.5288, 231.62189, 252.4748, 414.6728, 291.69604, 224.86636, 210.79402, 163.45026, 160.24228, 316.44812]
2025-05-06 19:20:07,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [158.0, 124.0, 131.0, 205.0, 203.0, 123.0, 114.0, 112.0, 127.0, 197.0]
2025-05-06 19:20:07,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1124 [INFO]: New best (257.18) for latency SparseU15
2025-05-06 19:20:07,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1127 [INFO]: saving network
2025-05-06 19:20:07,840 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-walker2d/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 19:20:07,850 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 28/100 (estimated time remaining: 3 hours, 40 minutes, 5 seconds)
2025-05-06 19:23:17,892 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:23:21,659 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 213.61519 ± 50.359
2025-05-06 19:23:21,660 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [195.09198, 161.3615, 186.33853, 249.22394, 191.96223, 297.72086, 261.1686, 117.27596, 238.3382, 237.66988]
2025-05-06 19:23:21,660 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [125.0, 102.0, 269.0, 140.0, 116.0, 163.0, 158.0, 203.0, 168.0, 132.0]
2025-05-06 19:23:21,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 29/100 (estimated time remaining: 3 hours, 40 minutes, 32 seconds)
2025-05-06 19:26:32,072 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:26:34,795 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 144.08224 ± 84.327
2025-05-06 19:26:34,795 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [248.44492, 107.68132, 171.20972, 141.50826, 141.31857, 15.601153, 235.50426, 99.48078, 265.0499, 15.023587]
2025-05-06 19:26:34,795 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [135.0, 84.0, 112.0, 104.0, 108.0, 27.0, 118.0, 88.0, 338.0, 25.0]
2025-05-06 19:26:34,799 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 30/100 (estimated time remaining: 3 hours, 39 minutes, 50 seconds)
2025-05-06 19:29:45,183 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:29:48,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 178.75787 ± 91.627
2025-05-06 19:29:48,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [183.83488, 220.80284, 98.31949, 291.24588, 232.4078, 21.531166, 240.04318, 249.04663, 22.704782, 227.642]
2025-05-06 19:29:48,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [137.0, 128.0, 153.0, 167.0, 128.0, 32.0, 139.0, 128.0, 33.0, 148.0]
2025-05-06 19:29:48,026 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 31/100 (estimated time remaining: 3 hours, 40 minutes, 6 seconds)
2025-05-06 19:32:57,832 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:33:00,932 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 169.17642 ± 80.474
2025-05-06 19:33:00,932 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [189.1615, 12.464557, 191.02596, 130.56934, 138.16258, 310.79132, 241.81964, 243.46042, 99.67984, 134.62915]
2025-05-06 19:33:00,932 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [113.0, 27.0, 134.0, 180.0, 98.0, 206.0, 187.0, 149.0, 101.0, 110.0]
2025-05-06 19:33:00,937 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 32/100 (estimated time remaining: 3 hours, 40 minutes, 10 seconds)
2025-05-06 19:36:13,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:36:15,516 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 101.83400 ± 93.915
2025-05-06 19:36:15,516 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [103.99734, 20.233, 185.20236, 274.1832, 20.123487, 17.380772, 138.93877, 224.94156, 20.903198, 12.436338]
2025-05-06 19:36:15,516 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [123.0, 30.0, 115.0, 163.0, 31.0, 28.0, 151.0, 260.0, 32.0, 24.0]
2025-05-06 19:36:15,521 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 33/100 (estimated time remaining: 3 hours, 39 minutes, 20 seconds)
2025-05-06 19:39:23,038 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:39:26,018 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 182.55136 ± 133.412
2025-05-06 19:39:26,018 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [12.193849, 23.263237, 242.39987, 16.155966, 256.69974, 421.1212, 85.475655, 203.23795, 275.91275, 289.0535]
2025-05-06 19:39:26,018 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [23.0, 34.0, 138.0, 28.0, 145.0, 309.0, 126.0, 124.0, 154.0, 166.0]
2025-05-06 19:39:26,023 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 34/100 (estimated time remaining: 3 hours, 35 minutes, 22 seconds)
2025-05-06 19:42:36,642 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:42:40,056 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 215.57280 ± 83.812
2025-05-06 19:42:40,056 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [302.10355, 157.02307, 323.27768, 279.2163, 235.80269, 21.965736, 152.70888, 197.59608, 241.6156, 244.41849]
2025-05-06 19:42:40,056 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [175.0, 94.0, 198.0, 191.0, 141.0, 31.0, 89.0, 182.0, 172.0, 160.0]
2025-05-06 19:42:40,061 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 35/100 (estimated time remaining: 3 hours, 32 minutes, 21 seconds)
2025-05-06 19:45:50,261 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:45:52,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 117.08423 ± 100.890
2025-05-06 19:45:52,003 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [14.495285, 278.82852, 192.56316, 24.46849, 182.5892, 232.37456, 22.987862, 19.869007, 186.31139, 16.35481]
2025-05-06 19:45:52,003 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [24.0, 135.0, 104.0, 35.0, 107.0, 118.0, 34.0, 32.0, 107.0, 33.0]
2025-05-06 19:45:52,008 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 36/100 (estimated time remaining: 3 hours, 28 minutes, 51 seconds)
2025-05-06 19:49:01,494 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:49:04,397 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 142.21681 ± 91.123
2025-05-06 19:49:04,397 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [104.14285, 209.58777, 225.83585, 20.081518, 11.220604, 171.35123, 258.96082, 17.335985, 181.85585, 221.79561]
2025-05-06 19:49:04,397 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [178.0, 141.0, 122.0, 36.0, 25.0, 147.0, 149.0, 34.0, 264.0, 122.0]
2025-05-06 19:49:04,402 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 37/100 (estimated time remaining: 3 hours, 25 minutes, 32 seconds)
2025-05-06 19:52:14,182 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:52:16,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 210.38029 ± 27.612
2025-05-06 19:52:16,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [224.0796, 204.9924, 213.43213, 220.78903, 207.0319, 192.22565, 190.87437, 160.98485, 214.92021, 274.4728]
2025-05-06 19:52:16,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [120.0, 109.0, 106.0, 123.0, 110.0, 104.0, 108.0, 95.0, 128.0, 163.0]
2025-05-06 19:52:16,963 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 38/100 (estimated time remaining: 3 hours, 21 minutes, 54 seconds)
2025-05-06 19:55:27,683 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:55:30,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 198.79851 ± 103.177
2025-05-06 19:55:30,867 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [11.088118, 216.95387, 313.47363, 20.6077, 179.81946, 292.26804, 151.53366, 270.50223, 281.98083, 249.75755]
2025-05-06 19:55:30,867 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [24.0, 188.0, 207.0, 30.0, 107.0, 161.0, 143.0, 157.0, 176.0, 140.0]
2025-05-06 19:55:30,872 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 39/100 (estimated time remaining: 3 hours, 19 minutes, 24 seconds)
2025-05-06 19:58:40,085 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 19:58:42,927 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 198.53975 ± 92.543
2025-05-06 19:58:42,927 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [265.93848, 196.04507, 212.00893, 266.6562, 71.063385, 155.89622, 212.11795, 235.14433, 353.08142, 17.445402]
2025-05-06 19:58:42,927 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [151.0, 123.0, 116.0, 142.0, 75.0, 101.0, 136.0, 121.0, 195.0, 31.0]
2025-05-06 19:58:42,933 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 40/100 (estimated time remaining: 3 hours, 15 minutes, 47 seconds)
2025-05-06 20:01:54,276 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:01:57,369 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 219.31804 ± 76.803
2025-05-06 20:01:57,369 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [262.9034, 246.31708, 284.46036, 222.79352, 19.663984, 326.7899, 215.1029, 210.1503, 205.93486, 199.06425]
2025-05-06 20:01:57,369 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [152.0, 137.0, 165.0, 123.0, 33.0, 201.0, 128.0, 112.0, 125.0, 111.0]
2025-05-06 20:01:57,375 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 41/100 (estimated time remaining: 3 hours, 13 minutes, 4 seconds)
2025-05-06 20:05:06,023 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:05:09,700 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 252.20149 ± 88.843
2025-05-06 20:05:09,700 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [267.69235, 271.0415, 256.68604, 354.41864, 226.00732, 21.398941, 277.19736, 199.72629, 334.98306, 312.86325]
2025-05-06 20:05:09,700 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [168.0, 161.0, 148.0, 207.0, 121.0, 32.0, 167.0, 116.0, 214.0, 198.0]
2025-05-06 20:05:09,706 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 42/100 (estimated time remaining: 3 hours, 9 minutes, 50 seconds)
2025-05-06 20:08:19,380 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:08:22,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 186.97897 ± 93.105
2025-05-06 20:08:22,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [233.3038, 175.83315, 215.25249, 212.05109, 271.97903, 23.934357, 190.94041, 213.11087, 12.328627, 321.0559]
2025-05-06 20:08:22,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [140.0, 98.0, 119.0, 120.0, 165.0, 34.0, 111.0, 119.0, 23.0, 218.0]
2025-05-06 20:08:22,125 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 43/100 (estimated time remaining: 3 hours, 6 minutes, 35 seconds)
2025-05-06 20:11:33,548 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:11:36,652 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 211.80014 ± 106.821
2025-05-06 20:11:36,652 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [230.25285, 251.77873, 253.03853, 211.24217, 250.19797, 21.54596, 14.593048, 351.99255, 330.82153, 202.53818]
2025-05-06 20:11:36,652 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [131.0, 139.0, 152.0, 117.0, 136.0, 34.0, 27.0, 215.0, 216.0, 120.0]
2025-05-06 20:11:36,659 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 44/100 (estimated time remaining: 3 hours, 3 minutes, 29 seconds)
2025-05-06 20:14:44,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:14:46,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 193.03320 ± 88.278
2025-05-06 20:14:46,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [217.59991, 264.77997, 240.47458, 22.566645, 14.9873085, 237.98291, 251.4252, 233.77477, 212.45592, 234.28477]
2025-05-06 20:14:46,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [121.0, 150.0, 136.0, 33.0, 25.0, 137.0, 143.0, 135.0, 119.0, 127.0]
2025-05-06 20:14:46,809 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 45/100 (estimated time remaining: 2 hours, 59 minutes, 55 seconds)
2025-05-06 20:17:56,590 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:17:59,076 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 182.86295 ± 88.816
2025-05-06 20:17:59,076 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [216.74327, 20.214697, 185.77328, 213.28929, 9.847964, 216.94391, 196.0104, 215.24046, 267.3221, 287.24402]
2025-05-06 20:17:59,076 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [114.0, 30.0, 103.0, 113.0, 20.0, 117.0, 106.0, 118.0, 149.0, 167.0]
2025-05-06 20:17:59,083 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 46/100 (estimated time remaining: 2 hours, 56 minutes, 18 seconds)
2025-05-06 20:21:08,174 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:21:11,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 234.52808 ± 50.968
2025-05-06 20:21:11,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [242.89662, 188.48836, 237.28781, 274.64316, 153.4242, 258.82648, 299.03174, 165.02803, 309.78473, 215.86964]
2025-05-06 20:21:11,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [135.0, 111.0, 139.0, 160.0, 95.0, 146.0, 175.0, 91.0, 194.0, 118.0]
2025-05-06 20:21:11,466 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 47/100 (estimated time remaining: 2 hours, 53 minutes, 7 seconds)
2025-05-06 20:24:21,960 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:24:24,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 169.69696 ± 80.128
2025-05-06 20:24:24,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [20.622528, 284.31485, 218.07045, 22.416502, 187.62236, 228.04695, 195.5962, 187.31082, 179.07181, 173.89705]
2025-05-06 20:24:24,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [29.0, 156.0, 119.0, 31.0, 106.0, 122.0, 106.0, 105.0, 99.0, 99.0]
2025-05-06 20:24:24,287 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 48/100 (estimated time remaining: 2 hours, 49 minutes, 58 seconds)
2025-05-06 20:27:36,737 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:27:39,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 175.59157 ± 85.437
2025-05-06 20:27:39,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [245.65135, 153.3576, 249.43839, 251.81804, 238.08174, 205.33144, 163.517, 17.019901, 213.60075, 18.099485]
2025-05-06 20:27:39,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [137.0, 93.0, 141.0, 138.0, 137.0, 110.0, 95.0, 29.0, 119.0, 31.0]
2025-05-06 20:27:39,202 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 49/100 (estimated time remaining: 2 hours, 46 minutes, 50 seconds)
2025-05-06 20:30:47,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:30:49,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 207.23633 ± 82.634
2025-05-06 20:30:49,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [225.52676, 274.01855, 221.2541, 192.15173, 358.91904, 224.75941, 192.9732, 17.408752, 216.63455, 148.71721]
2025-05-06 20:30:49,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [120.0, 148.0, 121.0, 105.0, 240.0, 126.0, 103.0, 29.0, 119.0, 97.0]
2025-05-06 20:30:49,974 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 50/100 (estimated time remaining: 2 hours, 43 minutes, 44 seconds)
2025-05-06 20:33:58,266 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:34:00,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 164.93253 ± 81.375
2025-05-06 20:34:00,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [16.477959, 163.123, 203.08975, 278.72733, 154.35281, 216.8446, 228.70259, 214.3558, 151.0676, 22.583921]
2025-05-06 20:34:00,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [26.0, 95.0, 109.0, 153.0, 94.0, 118.0, 123.0, 119.0, 91.0, 33.0]
2025-05-06 20:34:00,584 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 51/100 (estimated time remaining: 2 hours, 40 minutes, 15 seconds)
2025-05-06 20:37:10,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:37:13,237 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 198.28761 ± 73.929
2025-05-06 20:37:13,237 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [264.68677, 242.99916, 234.54805, 248.74048, 11.680939, 165.5526, 257.3823, 190.03842, 231.07579, 136.17159]
2025-05-06 20:37:13,237 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [152.0, 138.0, 134.0, 136.0, 23.0, 94.0, 146.0, 101.0, 129.0, 86.0]
2025-05-06 20:37:13,244 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 52/100 (estimated time remaining: 2 hours, 37 minutes, 5 seconds)
2025-05-06 20:40:25,038 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:40:27,472 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 173.40108 ± 83.399
2025-05-06 20:40:27,472 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [193.59706, 20.12396, 206.44151, 225.36913, 202.5861, 15.984161, 281.6171, 172.19363, 175.52803, 240.57005]
2025-05-06 20:40:27,472 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [108.0, 31.0, 115.0, 127.0, 113.0, 34.0, 162.0, 99.0, 100.0, 133.0]
2025-05-06 20:40:27,480 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 53/100 (estimated time remaining: 2 hours, 34 minutes, 6 seconds)
2025-05-06 20:43:35,893 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:43:37,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 151.03159 ± 73.444
2025-05-06 20:43:37,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [245.35155, 9.642756, 194.49756, 20.161451, 213.41006, 153.18436, 150.3206, 186.14778, 157.04312, 180.55675]
2025-05-06 20:43:37,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [132.0, 21.0, 105.0, 32.0, 117.0, 88.0, 95.0, 104.0, 94.0, 97.0]
2025-05-06 20:43:37,998 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 54/100 (estimated time remaining: 2 hours, 30 minutes, 12 seconds)
2025-05-06 20:46:47,999 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:46:51,473 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 246.77681 ± 41.749
2025-05-06 20:46:51,473 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [186.3316, 297.10248, 278.35168, 207.63126, 188.03226, 256.47302, 227.43356, 259.23996, 314.6495, 252.52264]
2025-05-06 20:46:51,473 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [123.0, 183.0, 168.0, 112.0, 108.0, 146.0, 126.0, 148.0, 185.0, 150.0]
2025-05-06 20:46:51,480 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 55/100 (estimated time remaining: 2 hours, 27 minutes, 25 seconds)
2025-05-06 20:50:03,110 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:50:05,359 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 156.95901 ± 94.406
2025-05-06 20:50:05,359 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [187.81815, 216.62878, 238.85341, 164.0491, 18.817863, 234.02243, 253.93086, 12.987974, 221.61693, 20.864653]
2025-05-06 20:50:05,359 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [112.0, 125.0, 127.0, 97.0, 31.0, 125.0, 141.0, 25.0, 125.0, 33.0]
2025-05-06 20:50:05,366 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 56/100 (estimated time remaining: 2 hours, 24 minutes, 43 seconds)
2025-05-06 20:53:15,363 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:53:17,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 192.87431 ± 34.669
2025-05-06 20:53:17,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [184.4455, 192.13887, 202.86966, 204.41684, 144.17245, 219.44412, 192.75554, 267.11026, 183.6614, 137.72849]
2025-05-06 20:53:17,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [103.0, 106.0, 110.0, 112.0, 89.0, 119.0, 105.0, 154.0, 103.0, 92.0]
2025-05-06 20:53:17,976 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 57/100 (estimated time remaining: 2 hours, 21 minutes, 29 seconds)
2025-05-06 20:56:25,852 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:56:28,347 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 175.48575 ± 86.058
2025-05-06 20:56:28,347 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [20.111032, 231.16258, 167.33858, 242.4364, 183.81894, 221.08932, 259.9017, 265.35303, 142.5407, 21.105207]
2025-05-06 20:56:28,347 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 133.0, 93.0, 137.0, 101.0, 118.0, 155.0, 150.0, 89.0, 32.0]
2025-05-06 20:56:28,355 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 58/100 (estimated time remaining: 2 hours, 17 minutes, 43 seconds)
2025-05-06 20:59:25,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 20:59:27,955 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 166.08562 ± 79.245
2025-05-06 20:59:27,955 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [22.314623, 233.4209, 183.98201, 220.6592, 182.90602, 184.94582, 223.44778, 149.30447, 12.738511, 247.13681]
2025-05-06 20:59:27,955 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 130.0, 103.0, 120.0, 103.0, 106.0, 120.0, 91.0, 25.0, 138.0]
2025-05-06 20:59:27,963 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 59/100 (estimated time remaining: 2 hours, 12 minutes, 59 seconds)
2025-05-06 21:02:25,367 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:02:28,151 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 186.55954 ± 119.868
2025-05-06 21:02:28,152 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [16.53465, 258.60626, 19.000984, 222.68495, 227.50514, 17.811972, 380.6808, 277.98865, 249.23862, 195.54324]
2025-05-06 21:02:28,152 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 152.0, 28.0, 127.0, 129.0, 33.0, 262.0, 153.0, 140.0, 107.0]
2025-05-06 21:02:28,160 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 60/100 (estimated time remaining: 2 hours, 8 minutes)
2025-05-06 21:05:25,014 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:05:27,694 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 201.72313 ± 24.696
2025-05-06 21:05:27,694 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [248.79199, 190.30524, 188.50537, 185.80602, 211.4474, 161.76181, 204.76456, 215.9846, 178.46959, 231.39478]
2025-05-06 21:05:27,694 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [145.0, 108.0, 102.0, 105.0, 117.0, 93.0, 112.0, 118.0, 104.0, 126.0]
2025-05-06 21:05:27,703 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 61/100 (estimated time remaining: 2 hours, 2 minutes, 58 seconds)
2025-05-06 21:08:24,835 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:08:27,312 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 175.88870 ± 67.480
2025-05-06 21:08:27,312 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [163.01874, 240.56169, 144.29044, 223.47023, 20.894691, 223.34143, 138.51129, 197.79666, 139.28029, 267.7215]
2025-05-06 21:08:27,312 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [94.0, 133.0, 95.0, 121.0, 30.0, 121.0, 88.0, 118.0, 95.0, 152.0]
2025-05-06 21:08:27,321 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 62/100 (estimated time remaining: 1 hour, 58 minutes, 12 seconds)
2025-05-06 21:11:21,348 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:11:23,956 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 196.06339 ± 41.843
2025-05-06 21:11:23,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [278.57034, 177.61472, 151.84825, 157.58456, 214.08176, 240.90173, 192.77583, 148.02919, 166.09909, 233.12839]
2025-05-06 21:11:23,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [162.0, 98.0, 89.0, 88.0, 117.0, 131.0, 106.0, 88.0, 94.0, 129.0]
2025-05-06 21:11:23,965 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 63/100 (estimated time remaining: 1 hour, 53 minutes, 26 seconds)
2025-05-06 21:14:22,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:14:25,894 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 232.43298 ± 66.477
2025-05-06 21:14:25,895 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [178.14401, 252.85063, 331.0569, 226.00816, 177.64792, 213.83267, 174.54118, 215.34282, 177.41345, 377.49203]
2025-05-06 21:14:25,895 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [98.0, 144.0, 215.0, 128.0, 98.0, 112.0, 99.0, 122.0, 101.0, 263.0]
2025-05-06 21:14:25,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 64/100 (estimated time remaining: 1 hour, 50 minutes, 44 seconds)
2025-05-06 21:17:19,222 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:17:21,605 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 171.43228 ± 84.816
2025-05-06 21:17:21,605 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [221.21397, 15.176678, 250.10515, 18.353733, 155.56947, 219.21652, 206.78687, 226.6166, 259.3138, 141.96988]
2025-05-06 21:17:21,605 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [119.0, 27.0, 138.0, 32.0, 96.0, 117.0, 115.0, 127.0, 143.0, 90.0]
2025-05-06 21:17:21,614 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 65/100 (estimated time remaining: 1 hour, 47 minutes, 12 seconds)
2025-05-06 21:20:19,204 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:20:21,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 170.71043 ± 33.921
2025-05-06 21:20:21,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [171.21848, 219.03943, 229.14072, 183.66382, 152.49438, 131.68579, 199.59447, 146.00955, 141.22641, 133.0312]
2025-05-06 21:20:21,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [102.0, 123.0, 127.0, 99.0, 84.0, 82.0, 113.0, 90.0, 86.0, 86.0]
2025-05-06 21:20:21,552 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 66/100 (estimated time remaining: 1 hour, 44 minutes, 16 seconds)
2025-05-06 21:23:18,142 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:23:20,770 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 187.55339 ± 77.443
2025-05-06 21:23:20,770 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [146.74384, 17.403873, 245.55284, 152.42155, 258.77594, 140.57828, 247.09146, 217.09393, 153.28499, 296.58734]
2025-05-06 21:23:20,770 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [94.0, 29.0, 133.0, 89.0, 139.0, 101.0, 135.0, 118.0, 98.0, 174.0]
2025-05-06 21:23:20,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 67/100 (estimated time remaining: 1 hour, 41 minutes, 15 seconds)
2025-05-06 21:26:14,957 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:26:17,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 199.63248 ± 39.157
2025-05-06 21:26:17,612 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [165.44424, 234.52844, 246.69614, 217.74551, 269.1429, 192.60439, 148.91586, 166.10828, 198.21484, 156.92406]
2025-05-06 21:26:17,612 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [95.0, 134.0, 134.0, 117.0, 152.0, 107.0, 89.0, 92.0, 112.0, 91.0]
2025-05-06 21:26:17,621 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 68/100 (estimated time remaining: 1 hour, 38 minutes, 18 seconds)
2025-05-06 21:29:14,353 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:29:16,753 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 176.64722 ± 58.276
2025-05-06 21:29:16,753 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [235.88489, 189.7221, 164.35118, 200.07661, 178.04749, 201.51218, 20.468796, 177.96306, 157.10506, 241.34071]
2025-05-06 21:29:16,753 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [129.0, 105.0, 98.0, 113.0, 99.0, 111.0, 32.0, 96.0, 95.0, 136.0]
2025-05-06 21:29:16,762 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 69/100 (estimated time remaining: 1 hour, 35 minutes, 1 second)
2025-05-06 21:32:13,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:32:15,936 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 167.05086 ± 81.340
2025-05-06 21:32:15,936 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [140.69391, 13.078088, 17.36432, 185.86629, 205.00743, 227.2686, 202.7435, 252.45322, 189.1473, 236.88582]
2025-05-06 21:32:15,936 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [95.0, 26.0, 29.0, 102.0, 112.0, 124.0, 109.0, 146.0, 111.0, 130.0]
2025-05-06 21:32:15,946 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 70/100 (estimated time remaining: 1 hour, 32 minutes, 24 seconds)
2025-05-06 21:35:12,791 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:35:15,467 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 202.71176 ± 28.078
2025-05-06 21:35:15,467 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [169.64957, 236.66277, 204.81108, 195.22025, 245.09105, 237.51456, 165.90512, 191.33641, 210.2877, 170.63928]
2025-05-06 21:35:15,467 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [101.0, 130.0, 114.0, 106.0, 133.0, 128.0, 97.0, 102.0, 118.0, 98.0]
2025-05-06 21:35:15,476 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 71/100 (estimated time remaining: 1 hour, 29 minutes, 23 seconds)
2025-05-06 21:38:10,518 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:38:12,533 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 141.59125 ± 47.619
2025-05-06 21:38:12,533 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [146.91455, 117.733376, 194.33424, 124.460724, 170.97163, 170.79193, 176.60681, 21.19456, 118.8983, 174.0064]
2025-05-06 21:38:12,533 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [87.0, 83.0, 107.0, 79.0, 99.0, 95.0, 101.0, 32.0, 77.0, 99.0]
2025-05-06 21:38:12,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 72/100 (estimated time remaining: 1 hour, 26 minutes, 12 seconds)
2025-05-06 21:41:09,050 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:41:11,260 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 156.68489 ± 85.587
2025-05-06 21:41:11,260 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [224.166, 122.68805, 18.176224, 211.89767, 304.65793, 15.758865, 184.26497, 119.844635, 176.34111, 189.05347]
2025-05-06 21:41:11,260 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [127.0, 78.0, 29.0, 112.0, 181.0, 27.0, 100.0, 77.0, 102.0, 102.0]
2025-05-06 21:41:11,270 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 73/100 (estimated time remaining: 1 hour, 23 minutes, 24 seconds)
2025-05-06 21:44:06,130 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:44:08,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 188.78732 ± 54.945
2025-05-06 21:44:08,720 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [233.32898, 248.88326, 207.75928, 49.997066, 151.63844, 234.93077, 208.22296, 186.87999, 205.66269, 160.5698]
2025-05-06 21:44:08,720 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [135.0, 141.0, 110.0, 65.0, 90.0, 126.0, 113.0, 103.0, 113.0, 99.0]
2025-05-06 21:44:08,729 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 74/100 (estimated time remaining: 1 hour, 20 minutes, 16 seconds)
2025-05-06 21:47:06,865 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:47:09,684 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 198.42970 ± 100.803
2025-05-06 21:47:09,684 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [247.93066, 127.29233, 285.87424, 15.032943, 223.80356, 270.12772, 297.61884, 20.063665, 262.67072, 233.88242]
2025-05-06 21:47:09,685 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [137.0, 87.0, 177.0, 28.0, 125.0, 150.0, 171.0, 32.0, 148.0, 127.0]
2025-05-06 21:47:09,694 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 75/100 (estimated time remaining: 1 hour, 17 minutes, 27 seconds)
2025-05-06 21:50:04,676 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:50:06,481 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 123.48738 ± 87.654
2025-05-06 21:50:06,481 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [189.81091, 151.97118, 233.25043, 209.44557, 21.389174, 22.168972, 14.144826, 198.43498, 17.968115, 176.2897]
2025-05-06 21:50:06,481 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [102.0, 89.0, 131.0, 111.0, 32.0, 32.0, 26.0, 108.0, 30.0, 102.0]
2025-05-06 21:50:06,491 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 76/100 (estimated time remaining: 1 hour, 14 minutes, 15 seconds)
2025-05-06 21:53:03,449 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:53:06,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 189.93398 ± 95.802
2025-05-06 21:53:06,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [250.75038, 198.7322, 232.85905, 280.89178, 191.60565, 18.023558, 322.8592, 190.69739, 15.006591, 197.91391]
2025-05-06 21:53:06,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [143.0, 111.0, 128.0, 163.0, 104.0, 28.0, 200.0, 103.0, 28.0, 107.0]
2025-05-06 21:53:06,117 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 77/100 (estimated time remaining: 1 hour, 11 minutes, 29 seconds)
2025-05-06 21:56:02,408 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:56:04,928 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 186.03734 ± 65.680
2025-05-06 21:56:04,929 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [192.36377, 244.82896, 241.63908, 153.13628, 205.75926, 265.09073, 140.07469, 194.22931, 23.377157, 199.87415]
2025-05-06 21:56:04,929 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [104.0, 136.0, 133.0, 92.0, 113.0, 151.0, 87.0, 106.0, 33.0, 107.0]
2025-05-06 21:56:04,939 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 78/100 (estimated time remaining: 1 hour, 8 minutes, 30 seconds)
2025-05-06 21:59:00,713 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 21:59:03,363 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 196.95598 ± 38.865
2025-05-06 21:59:03,363 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [233.7579, 189.09863, 129.71996, 233.72362, 179.50212, 188.64003, 229.24536, 134.91312, 203.99121, 246.96777]
2025-05-06 21:59:03,363 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [126.0, 108.0, 82.0, 130.0, 101.0, 104.0, 123.0, 91.0, 113.0, 140.0]
2025-05-06 21:59:03,374 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 79/100 (estimated time remaining: 1 hour, 5 minutes, 36 seconds)
2025-05-06 22:02:00,140 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:02:02,608 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 184.02879 ± 21.669
2025-05-06 22:02:02,608 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [170.38713, 136.08763, 199.5385, 192.02415, 194.51602, 176.79132, 167.87207, 180.75084, 211.36292, 210.95726]
2025-05-06 22:02:02,608 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [95.0, 82.0, 119.0, 107.0, 107.0, 99.0, 96.0, 104.0, 113.0, 118.0]
2025-05-06 22:02:02,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 80/100 (estimated time remaining: 1 hour, 2 minutes, 30 seconds)
2025-05-06 22:04:57,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:05:00,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 173.98253 ± 67.329
2025-05-06 22:05:00,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [221.1667, 282.3224, 159.78789, 117.1246, 194.37526, 171.35733, 199.33203, 164.46332, 15.429122, 214.46677]
2025-05-06 22:05:00,145 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [125.0, 173.0, 92.0, 81.0, 108.0, 98.0, 110.0, 93.0, 31.0, 116.0]
2025-05-06 22:05:00,155 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 81/100 (estimated time remaining: 59 minutes, 34 seconds)
2025-05-06 22:07:55,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:07:58,054 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 209.26590 ± 76.886
2025-05-06 22:07:58,054 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [250.0064, 138.61497, 245.44212, 160.42894, 245.0759, 296.44125, 22.854818, 224.79631, 234.43758, 274.56064]
2025-05-06 22:07:58,054 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [141.0, 83.0, 134.0, 89.0, 137.0, 178.0, 32.0, 124.0, 135.0, 159.0]
2025-05-06 22:07:58,065 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 82/100 (estimated time remaining: 56 minutes, 29 seconds)
2025-05-06 22:11:00,248 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:11:03,087 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 202.06424 ± 76.168
2025-05-06 22:11:03,087 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [260.11282, 196.73132, 256.2245, 222.3818, 176.43353, 293.637, 179.12833, 18.304445, 146.22136, 271.46735]
2025-05-06 22:11:03,087 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [152.0, 109.0, 145.0, 130.0, 97.0, 178.0, 102.0, 27.0, 90.0, 156.0]
2025-05-06 22:11:03,098 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 83/100 (estimated time remaining: 53 minutes, 53 seconds)
2025-05-06 22:14:00,949 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:14:03,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 191.13138 ± 81.808
2025-05-06 22:14:03,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [21.882057, 145.281, 154.82448, 220.99387, 191.03772, 160.40689, 164.98126, 242.15576, 260.93222, 348.8183]
2025-05-06 22:14:03,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [33.0, 88.0, 89.0, 120.0, 110.0, 95.0, 97.0, 136.0, 139.0, 231.0]
2025-05-06 22:14:03,698 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 84/100 (estimated time remaining: 51 minutes, 1 second)
2025-05-06 22:17:03,263 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:17:05,445 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 148.75594 ± 103.659
2025-05-06 22:17:05,445 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [19.08798, 172.56958, 12.670023, 167.00125, 175.87836, 172.94012, 267.89532, 21.737558, 130.9014, 346.87766]
2025-05-06 22:17:05,445 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [30.0, 98.0, 23.0, 96.0, 102.0, 99.0, 143.0, 34.0, 82.0, 211.0]
2025-05-06 22:17:05,456 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 85/100 (estimated time remaining: 48 minutes, 9 seconds)
2025-05-06 22:20:05,022 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:20:07,265 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 164.46512 ± 81.096
2025-05-06 22:20:07,265 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [189.70334, 183.03615, 16.725945, 161.19223, 232.84294, 199.85283, 170.9193, 186.63503, 286.3495, 17.393784]
2025-05-06 22:20:07,265 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [96.0, 104.0, 27.0, 94.0, 123.0, 107.0, 99.0, 96.0, 162.0, 29.0]
2025-05-06 22:20:07,277 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 86/100 (estimated time remaining: 45 minutes, 21 seconds)
2025-05-06 22:23:07,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:23:10,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 181.85892 ± 93.333
2025-05-06 22:23:10,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [254.35011, 205.48749, 16.627048, 160.16837, 144.1999, 242.53357, 279.1939, 213.06819, 285.89075, 17.06998]
2025-05-06 22:23:10,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [145.0, 119.0, 27.0, 103.0, 89.0, 134.0, 167.0, 113.0, 168.0, 27.0]
2025-05-06 22:23:10,318 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 87/100 (estimated time remaining: 42 minutes, 34 seconds)
2025-05-06 22:26:08,087 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:26:10,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 133.91028 ± 102.506
2025-05-06 22:26:10,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [181.3276, 290.87122, 17.282017, 18.439478, 13.146187, 248.18867, 222.07382, 151.56189, 17.600245, 178.61163]
2025-05-06 22:26:10,108 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [106.0, 171.0, 27.0, 30.0, 26.0, 138.0, 129.0, 93.0, 29.0, 101.0]
2025-05-06 22:26:10,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 88/100 (estimated time remaining: 39 minutes, 18 seconds)
2025-05-06 22:29:07,920 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:29:10,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 201.65750 ± 70.836
2025-05-06 22:29:10,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [185.74867, 224.0896, 149.41695, 189.56372, 237.27472, 220.78656, 247.3645, 264.9492, 19.838354, 277.54288]
2025-05-06 22:29:10,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [103.0, 126.0, 86.0, 106.0, 128.0, 120.0, 136.0, 154.0, 31.0, 157.0]
2025-05-06 22:29:10,673 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 89/100 (estimated time remaining: 36 minutes, 16 seconds)
2025-05-06 22:32:10,404 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:32:13,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 236.87325 ± 119.695
2025-05-06 22:32:13,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [348.99142, 174.22586, 227.89851, 320.0331, 304.06406, 22.62775, 362.2833, 20.401062, 283.53488, 304.67224]
2025-05-06 22:32:13,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [210.0, 100.0, 119.0, 186.0, 170.0, 32.0, 227.0, 31.0, 152.0, 170.0]
2025-05-06 22:32:13,781 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 90/100 (estimated time remaining: 33 minutes, 18 seconds)
2025-05-06 22:35:12,924 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:35:15,768 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 206.72739 ± 72.168
2025-05-06 22:35:15,768 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [227.28554, 17.225328, 216.7852, 221.07413, 262.57022, 212.99538, 231.23578, 172.34706, 308.7442, 197.01118]
2025-05-06 22:35:15,768 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [125.0, 28.0, 118.0, 121.0, 150.0, 113.0, 131.0, 99.0, 187.0, 115.0]
2025-05-06 22:35:15,780 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 91/100 (estimated time remaining: 30 minutes, 17 seconds)
2025-05-06 22:38:14,612 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:38:17,707 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 227.45602 ± 41.182
2025-05-06 22:38:17,707 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [185.1331, 304.01566, 229.08766, 211.33243, 262.33408, 191.44232, 158.55664, 265.14938, 245.6549, 221.85435]
2025-05-06 22:38:17,707 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [103.0, 184.0, 124.0, 117.0, 149.0, 105.0, 95.0, 151.0, 138.0, 124.0]
2025-05-06 22:38:17,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 92/100 (estimated time remaining: 27 minutes, 13 seconds)
2025-05-06 22:41:16,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:41:19,436 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 216.15230 ± 75.916
2025-05-06 22:41:19,436 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [297.78827, 17.634481, 292.5179, 225.02463, 220.86014, 212.60603, 166.98137, 240.38597, 220.70412, 267.0199]
2025-05-06 22:41:19,436 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [177.0, 32.0, 170.0, 121.0, 120.0, 114.0, 92.0, 126.0, 117.0, 152.0]
2025-05-06 22:41:19,448 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 93/100 (estimated time remaining: 24 minutes, 14 seconds)
2025-05-06 22:44:16,787 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:44:19,624 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 206.76640 ± 38.994
2025-05-06 22:44:19,624 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [243.6148, 231.94794, 200.34808, 190.798, 164.95009, 266.3609, 216.95099, 242.93948, 175.12022, 134.6336]
2025-05-06 22:44:19,624 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [139.0, 131.0, 112.0, 104.0, 97.0, 154.0, 124.0, 137.0, 103.0, 86.0]
2025-05-06 22:44:19,636 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 94/100 (estimated time remaining: 21 minutes, 12 seconds)
2025-05-06 22:47:10,170 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:47:12,288 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 174.49988 ± 59.993
2025-05-06 22:47:12,288 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [186.82262, 14.741023, 183.62514, 194.92146, 200.91763, 127.92266, 182.94511, 201.36667, 204.90933, 246.82715]
2025-05-06 22:47:12,288 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [104.0, 25.0, 104.0, 107.0, 109.0, 84.0, 102.0, 114.0, 108.0, 139.0]
2025-05-06 22:47:12,299 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 95/100 (estimated time remaining: 17 minutes, 58 seconds)
2025-05-06 22:49:43,284 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:49:45,515 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 181.11053 ± 93.066
2025-05-06 22:49:45,515 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [240.09851, 276.57306, 215.34125, 185.23863, 186.51587, 204.72206, 146.95091, 19.96945, 317.44522, 18.250475]
2025-05-06 22:49:45,515 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [122.0, 151.0, 117.0, 103.0, 104.0, 109.0, 92.0, 31.0, 187.0, 30.0]
2025-05-06 22:49:45,525 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 96/100 (estimated time remaining: 14 minutes, 29 seconds)
2025-05-06 22:52:16,291 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:52:18,496 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 178.84518 ± 64.744
2025-05-06 22:52:18,496 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [162.1533, 213.10419, 205.41394, 229.23135, 179.4051, 266.78308, 166.60387, 128.59259, 217.44923, 19.715092]
2025-05-06 22:52:18,496 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [94.0, 119.0, 115.0, 124.0, 103.0, 148.0, 103.0, 84.0, 124.0, 31.0]
2025-05-06 22:52:18,507 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 97/100 (estimated time remaining: 11 minutes, 12 seconds)
2025-05-06 22:54:48,895 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:54:51,354 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 206.38728 ± 69.805
2025-05-06 22:54:51,354 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [270.36472, 238.38693, 21.377401, 179.49423, 162.86862, 243.37212, 253.63686, 217.47221, 260.65146, 216.2481]
2025-05-06 22:54:51,355 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [155.0, 128.0, 33.0, 107.0, 97.0, 130.0, 134.0, 119.0, 142.0, 116.0]
2025-05-06 22:54:51,365 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 98/100 (estimated time remaining: 8 minutes, 7 seconds)
2025-05-06 22:57:20,420 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:57:22,368 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 156.49931 ± 73.769
2025-05-06 22:57:22,368 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [180.88266, 145.09026, 16.077991, 190.70325, 154.79506, 224.57896, 20.316917, 202.9821, 196.51607, 233.04984]
2025-05-06 22:57:22,368 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [100.0, 98.0, 29.0, 106.0, 91.0, 118.0, 31.0, 112.0, 106.0, 131.0]
2025-05-06 22:57:22,379 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 99/100 (estimated time remaining: 5 minutes, 13 seconds)
2025-05-06 22:59:55,407 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 22:59:58,020 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 217.34602 ± 76.986
2025-05-06 22:59:58,020 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [186.8072, 13.802217, 234.49423, 210.33786, 277.3478, 268.74402, 316.4362, 238.7202, 210.7807, 215.9898]
2025-05-06 22:59:58,020 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [110.0, 24.0, 124.0, 121.0, 160.0, 139.0, 185.0, 130.0, 114.0, 125.0]
2025-05-06 22:59:58,031 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1097 [INFO]: Iteration 100/100 (estimated time remaining: 2 minutes, 33 seconds)
2025-05-06 23:02:28,523 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 23:02:31,425 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1119 [DEBUG]: Total Reward: 250.37019 ± 40.331
2025-05-06 23:02:31,425 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1120 [DEBUG]: All rewards: [181.1844, 217.64386, 272.18924, 268.03696, 239.90106, 202.76793, 251.04977, 310.9688, 310.13474, 249.82527]
2025-05-06 23:02:31,425 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1121 [DEBUG]: All trajectory lengths: [102.0, 115.0, 148.0, 141.0, 132.0, 112.0, 130.0, 177.0, 178.0, 130.0]
2025-05-06 23:02:31,436 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-walker2d):1149 [DEBUG]: Training session finished
