2025-05-11 15:35:45,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1108 [DEBUG]: logdir: _logs/benchmark-v3-tc4/noisy-hopper/ExtremeClogL1U23-sac-aug-mem2
2025-05-11 15:35:45,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1109 [DEBUG]: trainer_prefix: benchmark-v3-tc4/noisy-hopper/ExtremeClogL1U23-sac-aug-mem2
2025-05-11 15:35:45,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1110 [DEBUG]: args.trainer_eval_latencies: {'ExtremeClogL1U23': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x7506289cf3d0>}
2025-05-11 15:35:45,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1111 [DEBUG]: using device: cpu
2025-05-11 15:35:45,512 baseline-sac-noisy-hopper:77 [WARNING]: args.memorize_actions != args.horizon: 2 != 24
2025-05-11 15:35:45,520 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1133 [INFO]: Creating new trainer
2025-05-11 15:35:45,529 baseline-sac-noisy-hopper:111 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=17, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=3, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(3,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=3, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(3,))
  )
  (tanh_refit): NNTanhRefit(scale: tensor([[2., 2., 2.]]), shift: tensor([[-1., -1., -1.]]))
)
2025-05-11 15:35:45,529 baseline-sac-noisy-hopper:112 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=20, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-05-11 15:35:45,698 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1194 [DEBUG]: Starting training session...
2025-05-11 15:35:45,698 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 1/100
2025-05-11 15:38:08,815 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:38:09,070 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 41.27399 ± 2.079
2025-05-11 15:38:09,070 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [39.827675, 41.118412, 37.827446, 41.417187, 41.6364, 39.1902, 41.592426, 41.036884, 45.802204, 43.291103]
2025-05-11 15:38:09,070 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [23.0, 24.0, 22.0, 24.0, 24.0, 23.0, 24.0, 24.0, 27.0, 25.0]
2025-05-11 15:38:09,071 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1226 [INFO]: New best (41.27) for latency ExtremeClogL1U23
2025-05-11 15:38:09,071 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1229 [INFO]: saving network
2025-05-11 15:38:09,075 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc4/noisy-hopper/ExtremeClogL1U23-sac-aug-mem2/checkpoints/best_ExtremeClogL1U23.pkl
2025-05-11 15:38:09,081 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 2/100 (estimated time remaining: 3 hours, 56 minutes, 34 seconds)
2025-05-11 15:40:45,966 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:40:49,458 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 320.49457 ± 225.057
2025-05-11 15:40:49,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [263.10883, 694.56323, 542.3814, 405.35428, 127.75375, 32.982388, 104.62033, 107.644554, 298.4009, 628.1363]
2025-05-11 15:40:49,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [220.0, 583.0, 499.0, 371.0, 110.0, 36.0, 93.0, 104.0, 264.0, 578.0]
2025-05-11 15:40:49,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1226 [INFO]: New best (320.49) for latency ExtremeClogL1U23
2025-05-11 15:40:49,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1229 [INFO]: saving network
2025-05-11 15:40:49,464 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc4/noisy-hopper/ExtremeClogL1U23-sac-aug-mem2/checkpoints/best_ExtremeClogL1U23.pkl
2025-05-11 15:40:49,470 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 3/100 (estimated time remaining: 4 hours, 8 minutes, 4 seconds)
2025-05-11 15:43:29,023 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:43:31,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 153.32329 ± 130.819
2025-05-11 15:43:31,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [20.460457, 84.987175, 50.67185, 50.738007, 328.7524, 84.88059, 418.92847, 59.40937, 168.39171, 266.01276]
2025-05-11 15:43:31,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [25.0, 113.0, 49.0, 81.0, 360.0, 89.0, 434.0, 80.0, 182.0, 285.0]
2025-05-11 15:43:31,004 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 4/100 (estimated time remaining: 4 hours, 10 minutes, 44 seconds)
2025-05-11 15:46:08,509 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:46:14,301 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 420.62982 ± 344.773
2025-05-11 15:46:14,301 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [985.32086, 252.62206, 74.158905, 152.2919, 765.3479, 623.8927, 891.53107, 101.483406, 337.64133, 22.008291]
2025-05-11 15:46:14,301 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [987.0, 247.0, 100.0, 158.0, 797.0, 631.0, 902.0, 105.0, 342.0, 26.0]
2025-05-11 15:46:14,302 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1226 [INFO]: New best (420.63) for latency ExtremeClogL1U23
2025-05-11 15:46:14,302 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1229 [INFO]: saving network
2025-05-11 15:46:14,306 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc4/noisy-hopper/ExtremeClogL1U23-sac-aug-mem2/checkpoints/best_ExtremeClogL1U23.pkl
2025-05-11 15:46:14,313 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 5/100 (estimated time remaining: 4 hours, 11 minutes, 26 seconds)
2025-05-11 15:48:54,342 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:48:55,579 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 176.74136 ± 98.867
2025-05-11 15:48:55,580 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [124.76172, 234.07152, 29.599897, 178.33493, 249.39908, 238.83376, 19.066723, 321.51456, 272.46686, 99.36468]
2025-05-11 15:48:55,580 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [104.0, 109.0, 32.0, 148.0, 141.0, 124.0, 29.0, 212.0, 142.0, 100.0]
2025-05-11 15:48:55,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 6/100 (estimated time remaining: 4 hours, 10 minutes, 7 seconds)
2025-05-11 15:51:34,595 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:51:37,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 195.25095 ± 129.220
2025-05-11 15:51:37,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [259.7498, 128.02104, 148.89981, 32.127235, 351.98062, 226.98676, 460.07828, 208.46988, 93.838524, 42.357475]
2025-05-11 15:51:37,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [253.0, 113.0, 174.0, 39.0, 367.0, 236.0, 470.0, 237.0, 111.0, 50.0]
2025-05-11 15:51:37,004 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 7/100 (estimated time remaining: 4 hours, 13 minutes, 8 seconds)
2025-05-11 15:54:20,384 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:54:23,125 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 306.54669 ± 169.891
2025-05-11 15:54:23,125 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [98.77763, 426.9894, 444.51047, 40.435764, 276.8319, 137.99562, 616.60455, 430.65118, 302.16037, 290.51028]
2025-05-11 15:54:23,125 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [76.0, 373.0, 298.0, 28.0, 230.0, 112.0, 552.0, 368.0, 157.0, 129.0]
2025-05-11 15:54:23,127 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 8/100 (estimated time remaining: 4 hours, 12 minutes, 14 seconds)
2025-05-11 15:57:03,755 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:57:04,764 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 186.59091 ± 109.061
2025-05-11 15:57:04,764 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [50.531265, 68.00982, 287.4027, 289.6316, 158.8185, 13.969486, 129.5005, 294.27325, 282.0834, 291.68863]
2025-05-11 15:57:04,764 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [46.0, 61.0, 140.0, 131.0, 118.0, 19.0, 129.0, 137.0, 122.0, 136.0]
2025-05-11 15:57:04,766 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 9/100 (estimated time remaining: 4 hours, 9 minutes, 33 seconds)
2025-05-11 15:59:45,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 15:59:46,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 225.83760 ± 70.975
2025-05-11 15:59:46,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [109.311424, 274.35806, 281.11304, 276.57236, 242.73251, 224.16946, 199.35071, 292.7105, 276.41516, 81.64259]
2025-05-11 15:59:46,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [58.0, 140.0, 125.0, 120.0, 109.0, 125.0, 96.0, 126.0, 121.0, 46.0]
2025-05-11 15:59:46,973 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 10/100 (estimated time remaining: 4 hours, 6 minutes, 30 seconds)
2025-05-11 16:02:25,913 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:02:27,154 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 237.67075 ± 90.226
2025-05-11 16:02:27,154 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.48093, 300.02643, 293.36505, 205.88953, 170.76929, 281.28278, 305.09656, 172.63922, 330.7904, 22.367067]
2025-05-11 16:02:27,154 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [136.0, 140.0, 133.0, 105.0, 95.0, 122.0, 141.0, 95.0, 178.0, 20.0]
2025-05-11 16:02:27,157 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 11/100 (estimated time remaining: 4 hours, 3 minutes, 28 seconds)
2025-05-11 16:05:07,811 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:05:08,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 219.21980 ± 91.129
2025-05-11 16:05:08,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [65.0801, 288.33102, 114.52544, 273.34146, 276.49228, 284.47348, 95.50676, 181.63507, 330.49854, 282.31375]
2025-05-11 16:05:08,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [44.0, 133.0, 65.0, 125.0, 126.0, 131.0, 55.0, 98.0, 169.0, 129.0]
2025-05-11 16:05:08,947 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 12/100 (estimated time remaining: 4 hours, 52 seconds)
2025-05-11 16:07:49,257 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:07:50,557 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 242.02280 ± 104.165
2025-05-11 16:07:50,558 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [386.55173, 151.02283, 298.30765, 73.64689, 319.9509, 58.605347, 251.02686, 294.6234, 294.7138, 291.77853]
2025-05-11 16:07:50,558 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [233.0, 93.0, 134.0, 46.0, 141.0, 34.0, 130.0, 134.0, 136.0, 133.0]
2025-05-11 16:07:50,560 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 13/100 (estimated time remaining: 3 hours, 56 minutes, 50 seconds)
2025-05-11 16:10:30,948 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:10:32,308 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 272.39597 ± 54.254
2025-05-11 16:10:32,308 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [265.7782, 298.68314, 309.43524, 283.37048, 288.19736, 289.32568, 112.99697, 286.84525, 300.28873, 289.03848]
2025-05-11 16:10:32,308 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [125.0, 138.0, 156.0, 128.0, 131.0, 132.0, 66.0, 134.0, 140.0, 131.0]
2025-05-11 16:10:32,310 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 14/100 (estimated time remaining: 3 hours, 54 minutes, 11 seconds)
2025-05-11 16:13:15,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:13:16,405 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 239.67781 ± 86.700
2025-05-11 16:13:16,406 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [104.31101, 350.71167, 145.49776, 276.27237, 275.80463, 272.34628, 275.36493, 305.11142, 89.59362, 301.7645]
2025-05-11 16:13:16,406 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [74.0, 187.0, 80.0, 124.0, 123.0, 124.0, 157.0, 137.0, 55.0, 128.0]
2025-05-11 16:13:16,409 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 15/100 (estimated time remaining: 3 hours, 52 minutes, 2 seconds)
2025-05-11 16:15:57,073 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:15:58,388 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 229.32498 ± 116.799
2025-05-11 16:15:58,389 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [51.83713, 261.6808, 262.90414, 386.61972, 281.53958, 232.53891, 348.67535, 117.65344, 28.911304, 320.8896]
2025-05-11 16:15:58,389 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [30.0, 147.0, 158.0, 223.0, 126.0, 112.0, 173.0, 68.0, 24.0, 150.0]
2025-05-11 16:15:58,392 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 16/100 (estimated time remaining: 3 hours, 49 minutes, 50 seconds)
2025-05-11 16:18:40,444 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:18:41,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 241.61389 ± 73.159
2025-05-11 16:18:41,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [214.33669, 264.1729, 312.77298, 143.578, 306.05273, 297.47058, 294.84207, 98.24762, 301.21747, 183.44783]
2025-05-11 16:18:41,719 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [108.0, 129.0, 158.0, 82.0, 141.0, 136.0, 134.0, 57.0, 139.0, 103.0]
2025-05-11 16:18:41,722 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 17/100 (estimated time remaining: 3 hours, 47 minutes, 34 seconds)
2025-05-11 16:21:23,104 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:21:24,099 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 187.70349 ± 113.433
2025-05-11 16:21:24,099 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [286.78796, 295.53497, 17.442118, 308.93198, 118.44425, 94.33758, 297.862, 287.0349, 25.902964, 144.7562]
2025-05-11 16:21:24,099 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [130.0, 136.0, 18.0, 135.0, 67.0, 69.0, 145.0, 130.0, 21.0, 79.0]
2025-05-11 16:21:24,103 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 18/100 (estimated time remaining: 3 hours, 45 minutes, 4 seconds)
2025-05-11 16:24:01,724 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:24:02,862 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 236.89571 ± 96.596
2025-05-11 16:24:02,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [27.887333, 280.53738, 293.5356, 286.74283, 287.08365, 297.98248, 289.9726, 237.2289, 67.03013, 300.95618]
2025-05-11 16:24:02,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [33.0, 124.0, 133.0, 129.0, 129.0, 135.0, 132.0, 116.0, 46.0, 148.0]
2025-05-11 16:24:02,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 19/100 (estimated time remaining: 3 hours, 41 minutes, 33 seconds)
2025-05-11 16:26:38,209 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:26:39,494 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 271.70312 ± 47.000
2025-05-11 16:26:39,495 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [299.1234, 313.68967, 288.89038, 207.19695, 276.4651, 300.4548, 296.46805, 296.552, 157.84555, 280.34555]
2025-05-11 16:26:39,495 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [138.0, 143.0, 130.0, 106.0, 122.0, 138.0, 137.0, 136.0, 89.0, 131.0]
2025-05-11 16:26:39,498 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 20/100 (estimated time remaining: 3 hours, 36 minutes, 50 seconds)
2025-05-11 16:29:14,290 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:29:15,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 245.90536 ± 117.326
2025-05-11 16:29:15,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [107.70536, 21.308617, 344.386, 288.99265, 287.8556, 289.45178, 103.76641, 406.59882, 302.29968, 306.68903]
2025-05-11 16:29:15,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [60.0, 20.0, 175.0, 132.0, 130.0, 130.0, 58.0, 238.0, 142.0, 142.0]
2025-05-11 16:29:15,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 21/100 (estimated time remaining: 3 hours, 32 minutes, 34 seconds)
2025-05-11 16:31:54,580 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:31:55,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 257.16595 ± 111.523
2025-05-11 16:31:55,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [401.35022, 24.597582, 315.46667, 232.50311, 78.983604, 297.1429, 292.4462, 275.81668, 305.62497, 347.7275]
2025-05-11 16:31:55,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [236.0, 30.0, 159.0, 112.0, 55.0, 133.0, 134.0, 122.0, 141.0, 171.0]
2025-05-11 16:31:55,985 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 22/100 (estimated time remaining: 3 hours, 29 minutes, 9 seconds)
2025-05-11 16:34:37,453 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:34:38,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 257.78979 ± 99.392
2025-05-11 16:34:38,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [329.69022, 308.41342, 90.611206, 101.54912, 340.60803, 311.43393, 318.33325, 298.72147, 132.19029, 346.3473]
2025-05-11 16:34:38,844 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [157.0, 145.0, 53.0, 73.0, 169.0, 139.0, 148.0, 149.0, 75.0, 180.0]
2025-05-11 16:34:38,848 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 23/100 (estimated time remaining: 3 hours, 26 minutes, 38 seconds)
2025-05-11 16:37:21,397 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:37:22,735 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 241.86926 ± 122.702
2025-05-11 16:37:22,735 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [356.82568, 325.97427, 22.605083, 60.817562, 331.38013, 313.86862, 298.28552, 89.38027, 326.47507, 293.0803]
2025-05-11 16:37:22,735 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [174.0, 159.0, 25.0, 81.0, 158.0, 147.0, 138.0, 54.0, 162.0, 134.0]
2025-05-11 16:37:22,739 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 24/100 (estimated time remaining: 3 hours, 25 minutes, 18 seconds)
2025-05-11 16:40:03,060 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:40:04,511 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 289.95853 ± 67.247
2025-05-11 16:40:04,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [315.5438, 311.69318, 305.4984, 89.777435, 304.24478, 315.81604, 325.03268, 296.0921, 324.00845, 311.8784]
2025-05-11 16:40:04,512 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [146.0, 148.0, 133.0, 54.0, 132.0, 145.0, 150.0, 134.0, 155.0, 146.0]
2025-05-11 16:40:04,516 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 25/100 (estimated time remaining: 3 hours, 23 minutes, 56 seconds)
2025-05-11 16:43:03,747 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:43:05,075 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 194.76208 ± 103.232
2025-05-11 16:43:05,075 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [288.3017, 34.777977, 269.4282, 285.15976, 309.86087, 294.5534, 125.15429, 165.09949, 145.07285, 30.21229]
2025-05-11 16:43:05,075 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [133.0, 43.0, 117.0, 129.0, 148.0, 135.0, 95.0, 85.0, 82.0, 33.0]
2025-05-11 16:43:05,081 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 26/100 (estimated time remaining: 3 hours, 27 minutes, 22 seconds)
2025-05-11 16:46:26,555 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:46:28,213 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 261.90692 ± 74.938
2025-05-11 16:46:28,213 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [305.7829, 301.77313, 286.95087, 289.7043, 116.64429, 108.845215, 308.33887, 297.669, 309.23886, 294.12173]
2025-05-11 16:46:28,214 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [140.0, 141.0, 131.0, 132.0, 69.0, 68.0, 145.0, 138.0, 147.0, 136.0]
2025-05-11 16:46:28,219 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 27/100 (estimated time remaining: 3 hours, 35 minutes, 9 seconds)
2025-05-11 16:49:49,291 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:49:50,830 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 244.17513 ± 91.195
2025-05-11 16:49:50,830 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.8628, 305.86252, 140.07751, 304.93848, 294.01358, 301.29044, 210.00577, 293.78857, 279.91364, 16.997894]
2025-05-11 16:49:50,830 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [136.0, 144.0, 81.0, 134.0, 134.0, 140.0, 107.0, 131.0, 135.0, 15.0]
2025-05-11 16:49:50,836 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 28/100 (estimated time remaining: 3 hours, 41 minutes, 55 seconds)
2025-05-11 16:52:53,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:52:54,493 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 274.73535 ± 77.576
2025-05-11 16:52:54,493 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [300.49274, 284.27097, 311.94196, 54.216747, 238.7911, 339.29733, 299.88904, 302.84125, 320.90768, 294.70462]
2025-05-11 16:52:54,493 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [138.0, 134.0, 150.0, 31.0, 120.0, 184.0, 139.0, 140.0, 154.0, 135.0]
2025-05-11 16:52:54,497 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 29/100 (estimated time remaining: 3 hours, 43 minutes, 37 seconds)
2025-05-11 16:55:35,163 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:55:36,468 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 258.51581 ± 87.801
2025-05-11 16:55:36,468 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [304.63077, 329.76166, 295.6229, 291.74948, 292.34113, 200.42258, 282.40552, 274.12946, 301.06427, 13.030494]
2025-05-11 16:55:36,468 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [141.0, 161.0, 135.0, 133.0, 133.0, 106.0, 131.0, 120.0, 140.0, 15.0]
2025-05-11 16:55:36,473 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 30/100 (estimated time remaining: 3 hours, 40 minutes, 33 seconds)
2025-05-11 16:58:17,115 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 16:58:18,586 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 288.16653 ± 38.120
2025-05-11 16:58:18,586 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [301.61398, 305.0455, 261.04926, 321.1021, 302.0777, 303.16684, 295.43823, 182.47908, 309.72644, 299.96606]
2025-05-11 16:58:18,586 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [141.0, 143.0, 123.0, 159.0, 139.0, 140.0, 136.0, 103.0, 146.0, 138.0]
2025-05-11 16:58:18,591 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 31/100 (estimated time remaining: 3 hours, 33 minutes, 9 seconds)
2025-05-11 17:01:01,980 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:01:03,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 234.58170 ± 103.730
2025-05-11 17:01:03,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.93372, 148.95473, 335.12592, 290.45258, 293.3998, 293.2832, 293.15884, 297.9805, 35.97106, 62.557]
2025-05-11 17:01:03,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 76.0, 175.0, 132.0, 133.0, 133.0, 133.0, 137.0, 44.0, 36.0]
2025-05-11 17:01:03,200 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 32/100 (estimated time remaining: 3 hours, 21 minutes, 14 seconds)
2025-05-11 17:03:45,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:03:46,747 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 252.78499 ± 85.627
2025-05-11 17:03:46,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [300.44617, 308.11996, 301.0533, 289.9254, 281.7792, 298.35602, 159.8664, 258.40628, 301.447, 28.449896]
2025-05-11 17:03:46,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [139.0, 144.0, 139.0, 132.0, 139.0, 139.0, 88.0, 135.0, 140.0, 31.0]
2025-05-11 17:03:46,753 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 33/100 (estimated time remaining: 3 hours, 9 minutes, 28 seconds)
2025-05-11 17:06:29,284 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:06:30,594 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 243.14531 ± 100.340
2025-05-11 17:06:30,595 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [292.81284, 110.24175, 320.04608, 292.92834, 123.618515, 207.06924, 405.9062, 86.72373, 293.55856, 298.54785]
2025-05-11 17:06:30,595 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 62.0, 153.0, 132.0, 67.0, 105.0, 244.0, 53.0, 128.0, 136.0]
2025-05-11 17:06:30,600 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 34/100 (estimated time remaining: 3 hours, 2 minutes, 15 seconds)
2025-05-11 17:09:12,687 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:09:13,748 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 210.71494 ± 94.756
2025-05-11 17:09:13,749 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [279.03043, 293.68533, 64.838326, 276.10452, 145.70493, 125.477234, 277.49536, 294.2916, 58.579643, 291.94226]
2025-05-11 17:09:13,749 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [121.0, 136.0, 37.0, 119.0, 79.0, 79.0, 123.0, 135.0, 34.0, 132.0]
2025-05-11 17:09:13,754 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 35/100 (estimated time remaining: 2 hours, 59 minutes, 48 seconds)
2025-05-11 17:11:55,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:11:57,270 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 280.04871 ± 59.562
2025-05-11 17:11:57,271 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [332.06573, 298.86093, 296.0306, 295.88464, 293.5826, 117.26653, 294.88483, 299.18195, 238.6008, 334.12836]
2025-05-11 17:11:57,271 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [169.0, 138.0, 136.0, 136.0, 134.0, 66.0, 143.0, 139.0, 128.0, 173.0]
2025-05-11 17:11:57,276 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 36/100 (estimated time remaining: 2 hours, 57 minutes, 22 seconds)
2025-05-11 17:14:41,582 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:14:42,950 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 268.33115 ± 63.893
2025-05-11 17:14:42,950 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [286.16028, 293.97614, 289.52878, 292.38034, 292.32883, 85.08948, 236.92944, 312.0101, 303.65015, 291.25815]
2025-05-11 17:14:42,951 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [126.0, 135.0, 132.0, 133.0, 133.0, 48.0, 138.0, 157.0, 142.0, 134.0]
2025-05-11 17:14:42,956 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 37/100 (estimated time remaining: 2 hours, 54 minutes, 52 seconds)
2025-05-11 17:17:25,349 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:17:26,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 204.74379 ± 118.261
2025-05-11 17:17:26,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [53.9664, 326.79388, 229.72672, 239.36409, 316.30804, 293.06464, 93.27909, 39.955338, 86.879715, 368.09988]
2025-05-11 17:17:26,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [31.0, 163.0, 116.0, 115.0, 155.0, 133.0, 54.0, 45.0, 62.0, 158.0]
2025-05-11 17:17:26,463 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 38/100 (estimated time remaining: 2 hours, 52 minutes, 8 seconds)
2025-05-11 17:20:08,376 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:20:09,560 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 218.13042 ± 92.608
2025-05-11 17:20:09,561 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [286.65182, 75.69172, 226.64374, 327.61734, 178.45541, 286.05725, 71.822464, 293.16702, 305.6142, 129.58334]
2025-05-11 17:20:09,561 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [136.0, 46.0, 155.0, 153.0, 102.0, 127.0, 43.0, 128.0, 143.0, 72.0]
2025-05-11 17:20:09,566 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 39/100 (estimated time remaining: 2 hours, 49 minutes, 15 seconds)
2025-05-11 17:22:51,906 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:22:53,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 283.15732 ± 34.255
2025-05-11 17:22:53,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [295.2112, 299.2491, 297.0179, 286.3725, 296.22495, 308.68594, 299.34183, 192.93785, 308.7237, 247.80843]
2025-05-11 17:22:53,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 139.0, 136.0, 130.0, 136.0, 144.0, 138.0, 102.0, 151.0, 122.0]
2025-05-11 17:22:53,338 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 40/100 (estimated time remaining: 2 hours, 46 minutes, 38 seconds)
2025-05-11 17:25:36,886 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:25:38,273 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 280.23627 ± 56.104
2025-05-11 17:25:38,273 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [285.29773, 300.34763, 330.47266, 303.68362, 293.51675, 293.55652, 295.23166, 284.65182, 299.67484, 115.92941]
2025-05-11 17:25:38,273 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [129.0, 141.0, 150.0, 137.0, 135.0, 134.0, 139.0, 128.0, 136.0, 67.0]
2025-05-11 17:25:38,280 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 41/100 (estimated time remaining: 2 hours, 44 minutes, 12 seconds)
2025-05-11 17:28:21,130 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:28:22,294 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 224.26694 ± 100.548
2025-05-11 17:28:22,294 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [291.37, 299.10406, 289.48404, 98.2794, 29.360306, 263.38116, 291.8861, 293.44672, 293.32986, 93.02784]
2025-05-11 17:28:22,294 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [133.0, 138.0, 132.0, 56.0, 35.0, 138.0, 134.0, 134.0, 135.0, 53.0]
2025-05-11 17:28:22,300 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 42/100 (estimated time remaining: 2 hours, 41 minutes, 8 seconds)
2025-05-11 17:31:03,519 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:31:04,845 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 270.25787 ± 76.501
2025-05-11 17:31:04,846 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [298.5785, 298.3441, 292.36435, 292.89755, 286.38318, 292.5629, 291.42035, 299.32446, 309.269, 41.434364]
2025-05-11 17:31:04,846 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 137.0, 132.0, 132.0, 129.0, 133.0, 133.0, 139.0, 145.0, 24.0]
2025-05-11 17:31:04,852 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 43/100 (estimated time remaining: 2 hours, 38 minutes, 13 seconds)
2025-05-11 17:33:47,382 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:33:48,616 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 239.36040 ± 109.241
2025-05-11 17:33:48,616 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [263.06335, 285.8377, 290.9864, 13.168382, 31.552866, 307.3063, 297.79633, 298.27618, 307.66785, 297.94876]
2025-05-11 17:33:48,616 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [127.0, 126.0, 132.0, 16.0, 38.0, 151.0, 137.0, 137.0, 146.0, 137.0]
2025-05-11 17:33:48,622 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 44/100 (estimated time remaining: 2 hours, 35 minutes, 37 seconds)
2025-05-11 17:36:31,129 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:36:32,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 296.09717 ± 7.739
2025-05-11 17:36:32,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.73566, 313.44162, 292.85608, 287.84195, 293.74136, 288.15247, 292.2379, 296.55768, 307.46133, 293.94565]
2025-05-11 17:36:32,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 146.0, 132.0, 130.0, 133.0, 131.0, 133.0, 136.0, 144.0, 134.0]
2025-05-11 17:36:32,587 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 45/100 (estimated time remaining: 2 hours, 32 minutes, 55 seconds)
2025-05-11 17:39:15,283 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:39:16,631 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 256.39810 ± 87.840
2025-05-11 17:39:16,631 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [295.98724, 302.03088, 293.3165, 281.8843, 128.75534, 45.826653, 339.3727, 283.75684, 298.74747, 294.30307]
2025-05-11 17:39:16,631 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 140.0, 132.0, 129.0, 77.0, 46.0, 180.0, 141.0, 137.0, 134.0]
2025-05-11 17:39:16,638 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 46/100 (estimated time remaining: 2 hours, 30 minutes, 1 second)
2025-05-11 17:41:59,892 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:42:01,149 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 247.44600 ± 72.252
2025-05-11 17:42:01,150 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.0833, 164.55849, 281.77963, 124.134605, 130.25479, 323.56854, 276.44897, 297.1381, 284.2541, 298.23956]
2025-05-11 17:42:01,150 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 91.0, 123.0, 71.0, 69.0, 165.0, 124.0, 137.0, 129.0, 137.0]
2025-05-11 17:42:01,156 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 47/100 (estimated time remaining: 2 hours, 27 minutes, 23 seconds)
2025-05-11 17:44:44,734 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:44:45,817 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 212.42175 ± 103.270
2025-05-11 17:44:45,818 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [150.1966, 66.87147, 281.63422, 293.93683, 14.257973, 288.85028, 298.25064, 295.44632, 144.12158, 290.65158]
2025-05-11 17:44:45,818 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [83.0, 42.0, 122.0, 134.0, 16.0, 144.0, 133.0, 135.0, 76.0, 127.0]
2025-05-11 17:44:45,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 48/100 (estimated time remaining: 2 hours, 25 minutes, 2 seconds)
2025-05-11 17:47:24,858 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:47:25,956 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 202.15628 ± 100.257
2025-05-11 17:47:25,956 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [294.47397, 265.6813, 31.00548, 292.9648, 153.80312, 65.184135, 291.57214, 94.42164, 293.67374, 238.7825]
2025-05-11 17:47:25,956 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 128.0, 36.0, 131.0, 83.0, 72.0, 136.0, 54.0, 137.0, 114.0]
2025-05-11 17:47:25,962 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 49/100 (estimated time remaining: 2 hours, 21 minutes, 40 seconds)
2025-05-11 17:50:05,489 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:50:06,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 207.19370 ± 95.880
2025-05-11 17:50:06,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [34.275883, 273.88538, 285.55673, 292.30536, 284.2299, 287.12943, 108.88385, 62.269333, 241.49199, 201.90909]
2025-05-11 17:50:06,609 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [41.0, 128.0, 129.0, 133.0, 128.0, 129.0, 70.0, 38.0, 123.0, 129.0]
2025-05-11 17:50:06,616 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 50/100 (estimated time remaining: 2 hours, 18 minutes, 23 seconds)
2025-05-11 17:52:48,281 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:52:49,718 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 282.14178 ± 61.087
2025-05-11 17:52:49,718 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [314.48102, 311.6533, 102.51782, 275.72736, 292.86493, 301.54022, 294.0397, 319.3077, 310.4483, 298.83734]
2025-05-11 17:52:49,718 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [152.0, 147.0, 60.0, 120.0, 133.0, 140.0, 135.0, 162.0, 147.0, 138.0]
2025-05-11 17:52:49,726 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 51/100 (estimated time remaining: 2 hours, 15 minutes, 30 seconds)
2025-05-11 17:55:30,869 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:55:32,127 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 238.80017 ± 114.503
2025-05-11 17:55:32,127 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [180.75786, 369.35425, 29.52779, 290.91705, 24.079092, 287.78094, 301.65442, 303.01266, 299.28793, 301.6296]
2025-05-11 17:55:32,127 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [96.0, 211.0, 24.0, 132.0, 19.0, 127.0, 140.0, 140.0, 140.0, 137.0]
2025-05-11 17:55:32,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 52/100 (estimated time remaining: 2 hours, 12 minutes, 27 seconds)
2025-05-11 17:58:12,313 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 17:58:13,598 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 243.95464 ± 95.435
2025-05-11 17:58:13,598 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [277.05392, 313.5455, 290.5734, 325.9273, 284.31195, 299.6873, 303.9092, 50.033382, 71.06569, 223.4386]
2025-05-11 17:58:13,598 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [125.0, 148.0, 132.0, 189.0, 126.0, 137.0, 141.0, 29.0, 61.0, 110.0]
2025-05-11 17:58:13,605 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 53/100 (estimated time remaining: 2 hours, 9 minutes, 14 seconds)
2025-05-11 18:00:55,785 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:00:57,217 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 288.52454 ± 34.656
2025-05-11 18:00:57,217 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [289.63336, 311.36984, 327.26443, 273.89038, 242.54387, 217.47862, 335.70035, 300.84528, 302.97128, 283.54813]
2025-05-11 18:00:57,217 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [132.0, 146.0, 147.0, 119.0, 110.0, 114.0, 154.0, 139.0, 141.0, 134.0]
2025-05-11 18:00:57,225 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 54/100 (estimated time remaining: 2 hours, 7 minutes, 5 seconds)
2025-05-11 18:03:38,676 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:03:40,110 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 293.61383 ± 5.973
2025-05-11 18:03:40,111 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [292.5074, 298.661, 289.13824, 284.81628, 288.6001, 304.70496, 294.64127, 299.63165, 296.1924, 287.24475]
2025-05-11 18:03:40,111 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 137.0, 131.0, 127.0, 130.0, 140.0, 134.0, 139.0, 136.0, 130.0]
2025-05-11 18:03:40,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 55/100 (estimated time remaining: 2 hours, 4 minutes, 44 seconds)
2025-05-11 18:06:19,979 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:06:21,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 235.88147 ± 99.073
2025-05-11 18:06:21,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [297.13998, 291.46942, 55.796146, 304.43448, 291.64075, 299.57474, 299.72552, 104.59479, 316.9013, 97.537575]
2025-05-11 18:06:21,195 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 133.0, 32.0, 146.0, 133.0, 137.0, 138.0, 61.0, 161.0, 55.0]
2025-05-11 18:06:21,203 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 56/100 (estimated time remaining: 2 hours, 1 minute, 43 seconds)
2025-05-11 18:09:02,869 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:09:04,198 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 255.79634 ± 85.459
2025-05-11 18:09:04,199 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [138.50783, 43.19765, 282.816, 289.45724, 293.02496, 298.44254, 299.83746, 305.4348, 307.5837, 299.66122]
2025-05-11 18:09:04,199 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [90.0, 55.0, 126.0, 133.0, 134.0, 137.0, 138.0, 143.0, 144.0, 140.0]
2025-05-11 18:09:04,207 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 57/100 (estimated time remaining: 1 hour, 59 minutes, 6 seconds)
2025-05-11 18:11:45,141 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:11:46,312 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 214.64709 ± 91.882
2025-05-11 18:11:46,313 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [298.6448, 303.6865, 297.2274, 290.1316, 295.36325, 132.87505, 216.56982, 36.70976, 127.82428, 147.43852]
2025-05-11 18:11:46,313 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 141.0, 137.0, 133.0, 135.0, 73.0, 124.0, 40.0, 82.0, 98.0]
2025-05-11 18:11:46,321 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 58/100 (estimated time remaining: 1 hour, 56 minutes, 29 seconds)
2025-05-11 18:14:28,804 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:14:30,190 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 276.01175 ± 55.107
2025-05-11 18:14:30,191 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [298.9846, 312.10315, 259.35617, 116.29587, 285.46808, 284.93362, 292.315, 302.81528, 308.3853, 299.46048]
2025-05-11 18:14:30,191 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [138.0, 153.0, 123.0, 64.0, 131.0, 126.0, 133.0, 141.0, 145.0, 138.0]
2025-05-11 18:14:30,199 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 59/100 (estimated time remaining: 1 hour, 53 minutes, 48 seconds)
2025-05-11 18:17:09,681 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:17:10,814 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 211.72746 ± 108.701
2025-05-11 18:17:10,814 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [33.84754, 300.02783, 293.59583, 116.401855, 32.448307, 291.33237, 160.36627, 282.68903, 305.96396, 300.60168]
2025-05-11 18:17:10,814 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [36.0, 142.0, 135.0, 75.0, 36.0, 140.0, 88.0, 125.0, 143.0, 140.0]
2025-05-11 18:17:10,823 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 60/100 (estimated time remaining: 1 hour, 50 minutes, 47 seconds)
2025-05-11 18:19:53,561 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:19:54,949 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 267.49759 ± 77.335
2025-05-11 18:19:54,950 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [300.58563, 140.61983, 332.24832, 300.66315, 297.87064, 297.0963, 300.81052, 301.33078, 312.90668, 90.844055]
2025-05-11 18:19:54,950 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [155.0, 79.0, 167.0, 138.0, 137.0, 136.0, 139.0, 139.0, 148.0, 56.0]
2025-05-11 18:19:54,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 61/100 (estimated time remaining: 1 hour, 48 minutes, 30 seconds)
2025-05-11 18:22:39,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:22:40,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 217.08755 ± 100.702
2025-05-11 18:22:40,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [230.80862, 107.98892, 264.19638, 291.8006, 292.58328, 281.9129, 22.05457, 299.5977, 302.4354, 77.49723]
2025-05-11 18:22:40,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [116.0, 57.0, 115.0, 133.0, 133.0, 151.0, 27.0, 136.0, 141.0, 44.0]
2025-05-11 18:22:40,590 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 62/100 (estimated time remaining: 1 hour, 46 minutes, 7 seconds)
2025-05-11 18:25:24,645 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:25:26,123 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 260.52185 ± 83.236
2025-05-11 18:25:26,123 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [250.58998, 109.403, 325.87372, 90.5654, 317.96423, 298.44016, 299.66544, 291.68533, 332.32886, 288.7023]
2025-05-11 18:25:26,123 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [117.0, 73.0, 167.0, 51.0, 167.0, 141.0, 138.0, 134.0, 181.0, 132.0]
2025-05-11 18:25:26,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 63/100 (estimated time remaining: 1 hour, 43 minutes, 50 seconds)
2025-05-11 18:28:08,105 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:28:09,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 270.48767 ± 64.294
2025-05-11 18:28:09,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [270.56564, 288.2615, 289.8967, 292.13754, 297.51855, 79.42418, 285.6652, 300.1843, 303.1567, 298.06638]
2025-05-11 18:28:09,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [122.0, 126.0, 137.0, 132.0, 136.0, 46.0, 125.0, 137.0, 141.0, 138.0]
2025-05-11 18:28:09,437 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 64/100 (estimated time remaining: 1 hour, 41 minutes, 2 seconds)
2025-05-11 18:30:54,154 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:30:55,513 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 259.74310 ± 72.445
2025-05-11 18:30:55,513 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [204.50964, 290.3786, 290.57867, 283.29883, 290.6201, 137.75053, 363.02188, 296.19217, 310.21774, 130.86285]
2025-05-11 18:30:55,513 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [111.0, 137.0, 133.0, 124.0, 133.0, 83.0, 192.0, 136.0, 147.0, 73.0]
2025-05-11 18:30:55,521 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 65/100 (estimated time remaining: 1 hour, 38 minutes, 57 seconds)
2025-05-11 18:33:37,588 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:33:39,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 278.93335 ± 57.153
2025-05-11 18:33:39,013 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [304.72882, 305.95123, 236.47678, 286.5843, 123.48004, 338.86035, 298.84756, 301.85855, 304.26895, 288.27698]
2025-05-11 18:33:39,013 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [142.0, 143.0, 112.0, 128.0, 67.0, 176.0, 138.0, 140.0, 142.0, 138.0]
2025-05-11 18:33:39,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 66/100 (estimated time remaining: 1 hour, 36 minutes, 8 seconds)
2025-05-11 18:36:21,943 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:36:23,231 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 240.07402 ± 102.409
2025-05-11 18:36:23,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [298.58078, 307.91135, 47.21803, 294.2933, 116.858986, 315.57693, 313.5134, 295.05545, 93.88763, 317.84436]
2025-05-11 18:36:23,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 145.0, 44.0, 143.0, 66.0, 160.0, 153.0, 136.0, 57.0, 155.0]
2025-05-11 18:36:23,241 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 67/100 (estimated time remaining: 1 hour, 33 minutes, 14 seconds)
2025-05-11 18:39:30,112 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:39:31,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 190.61786 ± 128.448
2025-05-11 18:39:31,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [70.97635, 27.235216, 368.91443, 286.62415, 292.6088, 187.66606, 268.2897, 34.248817, 329.74274, 39.872494]
2025-05-11 18:39:31,331 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [40.0, 31.0, 199.0, 129.0, 134.0, 96.0, 128.0, 39.0, 172.0, 23.0]
2025-05-11 18:39:31,342 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 68/100 (estimated time remaining: 1 hour, 32 minutes, 58 seconds)
2025-05-11 18:42:52,050 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:42:53,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 231.73877 ± 111.391
2025-05-11 18:42:53,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [293.21432, 298.10153, 287.7224, 298.11386, 324.19052, 328.0303, 296.44135, 86.48409, 75.24225, 29.847233]
2025-05-11 18:42:53,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 138.0, 131.0, 137.0, 161.0, 166.0, 137.0, 51.0, 43.0, 35.0]
2025-05-11 18:42:53,468 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 69/100 (estimated time remaining: 1 hour, 34 minutes, 17 seconds)
2025-05-11 18:46:15,355 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:46:16,843 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 254.02898 ± 89.566
2025-05-11 18:46:16,843 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [296.6543, 301.31046, 289.8669, 289.17514, 269.39526, 313.5337, 111.467155, 318.07666, 304.0173, 46.79303]
2025-05-11 18:46:16,843 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [138.0, 140.0, 132.0, 131.0, 129.0, 152.0, 63.0, 161.0, 143.0, 27.0]
2025-05-11 18:46:16,855 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 70/100 (estimated time remaining: 1 hour, 35 minutes, 12 seconds)
2025-05-11 18:48:54,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:48:55,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 244.10593 ± 99.617
2025-05-11 18:48:55,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [308.6823, 296.60083, 92.870224, 306.60712, 290.58868, 300.0186, 17.24026, 306.8158, 307.10205, 214.53354]
2025-05-11 18:48:55,971 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [151.0, 136.0, 53.0, 149.0, 126.0, 138.0, 17.0, 144.0, 144.0, 113.0]
2025-05-11 18:48:55,980 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 71/100 (estimated time remaining: 1 hour, 31 minutes, 41 seconds)
2025-05-11 18:51:31,770 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:51:32,993 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 253.67204 ± 80.743
2025-05-11 18:51:32,993 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [295.98788, 301.34677, 292.975, 290.8224, 279.66055, 190.36877, 293.4915, 300.48026, 30.748482, 260.83853]
2025-05-11 18:51:32,993 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 141.0, 134.0, 136.0, 122.0, 104.0, 134.0, 139.0, 32.0, 128.0]
2025-05-11 18:51:33,002 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 72/100 (estimated time remaining: 1 hour, 27 minutes, 56 seconds)
2025-05-11 18:54:09,325 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:54:10,437 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 230.59561 ± 102.994
2025-05-11 18:54:10,437 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [301.57977, 305.85767, 14.394098, 147.60303, 302.81177, 300.0728, 246.46126, 296.4831, 306.30548, 84.38727]
2025-05-11 18:54:10,437 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [139.0, 141.0, 16.0, 81.0, 139.0, 138.0, 119.0, 135.0, 143.0, 50.0]
2025-05-11 18:54:10,446 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 73/100 (estimated time remaining: 1 hour, 22 minutes, 2 seconds)
2025-05-11 18:56:47,742 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:56:48,933 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 247.93588 ± 98.293
2025-05-11 18:56:48,933 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [311.1297, 304.15152, 272.0786, 293.44333, 55.494118, 296.78903, 49.145016, 301.9297, 300.81836, 294.3794]
2025-05-11 18:56:48,933 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [144.0, 141.0, 122.0, 134.0, 32.0, 141.0, 36.0, 149.0, 138.0, 134.0]
2025-05-11 18:56:48,943 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 74/100 (estimated time remaining: 1 hour, 15 minutes, 11 seconds)
2025-05-11 18:59:31,304 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 18:59:32,644 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 262.07849 ± 67.942
2025-05-11 18:59:32,644 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [269.92487, 271.43637, 285.95267, 298.1781, 157.0928, 277.29767, 326.6114, 111.20202, 288.49344, 334.59564]
2025-05-11 18:59:32,644 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [116.0, 125.0, 130.0, 146.0, 85.0, 128.0, 167.0, 64.0, 127.0, 167.0]
2025-05-11 18:59:32,655 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 75/100 (estimated time remaining: 1 hour, 8 minutes, 58 seconds)
2025-05-11 19:02:10,599 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:02:11,754 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 235.76218 ± 90.492
2025-05-11 19:02:11,754 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [293.02524, 90.57731, 287.29163, 62.738422, 281.59842, 292.76236, 150.8978, 308.11902, 300.73068, 289.88104]
2025-05-11 19:02:11,754 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 56.0, 130.0, 36.0, 131.0, 140.0, 87.0, 159.0, 139.0, 134.0]
2025-05-11 19:02:11,764 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 76/100 (estimated time remaining: 1 hour, 6 minutes, 18 seconds)
2025-05-11 19:04:48,453 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:04:49,693 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 250.92465 ± 68.198
2025-05-11 19:04:49,693 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [288.0962, 295.1342, 289.89325, 140.08269, 102.764366, 295.63382, 233.85628, 296.16632, 262.3915, 305.22797]
2025-05-11 19:04:49,693 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [131.0, 136.0, 132.0, 80.0, 79.0, 136.0, 118.0, 138.0, 128.0, 156.0]
2025-05-11 19:04:49,703 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 77/100 (estimated time remaining: 1 hour, 3 minutes, 44 seconds)
2025-05-11 19:07:26,633 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:07:27,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 247.72336 ± 85.884
2025-05-11 19:07:27,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [298.743, 295.26547, 292.88733, 239.96646, 303.32715, 291.54315, 293.28644, 290.65594, 137.80899, 33.749454]
2025-05-11 19:07:27,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [137.0, 137.0, 133.0, 115.0, 143.0, 133.0, 133.0, 131.0, 82.0, 37.0]
2025-05-11 19:07:27,834 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 78/100 (estimated time remaining: 1 hour, 1 minute, 7 seconds)
2025-05-11 19:10:03,352 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:10:04,571 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 252.28264 ± 93.654
2025-05-11 19:10:04,571 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [158.20512, 323.41626, 62.96159, 298.37775, 367.27036, 130.28455, 283.28763, 303.48434, 299.50842, 296.0305]
2025-05-11 19:10:04,571 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [87.0, 161.0, 36.0, 137.0, 164.0, 81.0, 125.0, 141.0, 139.0, 137.0]
2025-05-11 19:10:04,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 79/100 (estimated time remaining: 58 minutes, 20 seconds)
2025-05-11 19:12:39,645 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:12:40,796 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 240.34860 ± 81.736
2025-05-11 19:12:40,796 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [55.524147, 296.11307, 290.55875, 282.4025, 300.76672, 294.66397, 156.78818, 278.4738, 154.51443, 293.68073]
2025-05-11 19:12:40,796 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [32.0, 137.0, 131.0, 128.0, 143.0, 134.0, 94.0, 122.0, 86.0, 130.0]
2025-05-11 19:12:40,807 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 80/100 (estimated time remaining: 55 minutes, 10 seconds)
2025-05-11 19:15:17,020 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:15:18,397 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 294.51547 ± 10.647
2025-05-11 19:15:18,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [289.5304, 306.33203, 289.34525, 289.06442, 293.58746, 302.27728, 269.674, 299.8568, 308.68307, 296.80426]
2025-05-11 19:15:18,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [132.0, 143.0, 131.0, 131.0, 134.0, 140.0, 125.0, 138.0, 145.0, 137.0]
2025-05-11 19:15:18,409 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 81/100 (estimated time remaining: 52 minutes, 26 seconds)
2025-05-11 19:17:53,843 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:17:55,103 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 270.01367 ± 69.856
2025-05-11 19:17:55,103 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [291.38498, 287.0514, 293.88797, 283.63943, 292.2297, 291.56937, 301.3776, 61.710674, 285.9067, 311.37878]
2025-05-11 19:17:55,103 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [132.0, 136.0, 134.0, 128.0, 132.0, 132.0, 140.0, 36.0, 127.0, 150.0]
2025-05-11 19:17:55,113 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 82/100 (estimated time remaining: 49 minutes, 44 seconds)
2025-05-11 19:20:31,136 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:20:32,286 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 227.85410 ± 110.539
2025-05-11 19:20:32,286 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [75.6461, 318.43088, 296.61227, 286.55988, 291.34653, 295.63867, 77.68051, 323.6956, 29.803905, 283.1266]
2025-05-11 19:20:32,286 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [55.0, 160.0, 136.0, 129.0, 132.0, 135.0, 52.0, 175.0, 33.0, 123.0]
2025-05-11 19:20:32,296 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 83/100 (estimated time remaining: 47 minutes, 4 seconds)
2025-05-11 19:23:09,120 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:23:10,370 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 242.67493 ± 98.228
2025-05-11 19:23:10,371 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [299.81122, 301.77887, 300.71027, 289.25012, 85.18419, 304.01715, 119.39757, 319.6739, 78.11344, 328.81244]
2025-05-11 19:23:10,371 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [139.0, 141.0, 139.0, 131.0, 47.0, 151.0, 95.0, 162.0, 45.0, 177.0]
2025-05-11 19:23:10,381 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 84/100 (estimated time remaining: 44 minutes, 31 seconds)
2025-05-11 19:25:48,131 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:25:49,519 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 295.24643 ± 6.712
2025-05-11 19:25:49,519 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [290.51236, 300.1734, 289.2894, 288.34286, 286.9692, 309.08148, 294.86227, 293.077, 302.22464, 297.93173]
2025-05-11 19:25:49,519 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [132.0, 139.0, 132.0, 131.0, 130.0, 150.0, 135.0, 135.0, 140.0, 139.0]
2025-05-11 19:25:49,529 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 85/100 (estimated time remaining: 42 minutes, 3 seconds)
2025-05-11 19:28:28,133 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:28:29,408 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 268.80438 ± 71.774
2025-05-11 19:28:29,408 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [293.19156, 299.53973, 54.48926, 293.34085, 294.00076, 289.54907, 295.4483, 298.4496, 296.4987, 273.536]
2025-05-11 19:28:29,408 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 138.0, 52.0, 134.0, 134.0, 131.0, 135.0, 138.0, 137.0, 122.0]
2025-05-11 19:28:29,419 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 86/100 (estimated time remaining: 39 minutes, 33 seconds)
2025-05-11 19:31:05,932 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:31:07,229 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 260.79697 ± 71.205
2025-05-11 19:31:07,229 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [192.15213, 69.53471, 292.76993, 292.17712, 293.44504, 299.703, 274.07565, 299.7301, 312.68738, 281.6948]
2025-05-11 19:31:07,229 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [142.0, 44.0, 133.0, 132.0, 133.0, 136.0, 144.0, 139.0, 145.0, 131.0]
2025-05-11 19:31:07,240 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 87/100 (estimated time remaining: 36 minutes, 57 seconds)
2025-05-11 19:33:44,343 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:33:45,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 243.16467 ± 105.438
2025-05-11 19:33:45,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [366.5497, 301.41028, 292.3117, 310.39542, 86.36592, 31.388203, 293.87665, 155.06146, 300.5895, 293.6978]
2025-05-11 19:33:45,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [192.0, 139.0, 133.0, 140.0, 48.0, 35.0, 135.0, 91.0, 140.0, 134.0]
2025-05-11 19:33:45,562 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 88/100 (estimated time remaining: 34 minutes, 22 seconds)
2025-05-11 19:36:23,175 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:36:23,897 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 143.48486 ± 105.066
2025-05-11 19:36:23,897 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [279.1907, 92.02182, 68.039116, 30.72391, 79.118095, 210.17798, 284.5735, 86.38822, 291.95895, 12.6562605]
2025-05-11 19:36:23,897 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [121.0, 53.0, 39.0, 32.0, 44.0, 109.0, 124.0, 57.0, 130.0, 15.0]
2025-05-11 19:36:23,908 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 89/100 (estimated time remaining: 31 minutes, 44 seconds)
2025-05-11 19:39:00,442 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:39:01,717 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 273.99191 ± 61.939
2025-05-11 19:39:01,717 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [291.86612, 306.8088, 293.4778, 280.47025, 290.41608, 294.77298, 295.26794, 298.41995, 89.174065, 299.24506]
2025-05-11 19:39:01,717 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [133.0, 143.0, 134.0, 122.0, 131.0, 134.0, 135.0, 137.0, 52.0, 135.0]
2025-05-11 19:39:01,728 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 90/100 (estimated time remaining: 29 minutes, 2 seconds)
2025-05-11 19:41:38,409 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:41:39,328 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 177.21707 ± 114.098
2025-05-11 19:41:39,329 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [288.30426, 303.8555, 51.425083, 42.48127, 56.351017, 275.33154, 47.55584, 130.60951, 286.42606, 289.83054]
2025-05-11 19:41:39,329 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 140.0, 51.0, 41.0, 47.0, 123.0, 32.0, 82.0, 134.0, 130.0]
2025-05-11 19:41:39,340 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 91/100 (estimated time remaining: 26 minutes, 19 seconds)
2025-05-11 19:44:16,371 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:44:17,550 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 245.87332 ± 76.619
2025-05-11 19:44:17,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [174.25839, 300.2438, 293.0846, 296.33646, 297.52438, 140.20839, 275.3004, 296.73166, 299.44806, 85.5972]
2025-05-11 19:44:17,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [93.0, 139.0, 134.0, 136.0, 137.0, 81.0, 121.0, 137.0, 140.0, 52.0]
2025-05-11 19:44:17,562 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 92/100 (estimated time remaining: 23 minutes, 42 seconds)
2025-05-11 19:46:54,647 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:46:55,820 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 239.61406 ± 100.456
2025-05-11 19:46:55,820 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [300.19977, 81.26582, 295.833, 293.97287, 33.116726, 167.58011, 323.95178, 297.3768, 309.57645, 293.26743]
2025-05-11 19:46:55,820 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [139.0, 54.0, 135.0, 134.0, 25.0, 107.0, 144.0, 136.0, 145.0, 134.0]
2025-05-11 19:46:55,832 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 93/100 (estimated time remaining: 21 minutes, 4 seconds)
2025-05-11 19:49:34,706 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:49:36,001 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 276.18512 ± 52.795
2025-05-11 19:49:36,001 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [304.21207, 298.2944, 289.76657, 292.3321, 289.47748, 292.53342, 292.97946, 118.38225, 288.1178, 295.7561]
2025-05-11 19:49:36,001 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [142.0, 138.0, 132.0, 133.0, 132.0, 134.0, 134.0, 66.0, 128.0, 136.0]
2025-05-11 19:49:36,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 94/100 (estimated time remaining: 18 minutes, 28 seconds)
2025-05-11 19:52:15,308 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:52:16,828 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 300.72861 ± 21.810
2025-05-11 19:52:16,828 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [296.56226, 306.40152, 307.39987, 288.5412, 294.76913, 293.88263, 268.4892, 358.57626, 298.23083, 294.43307]
2025-05-11 19:52:16,828 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [134.0, 138.0, 141.0, 131.0, 134.0, 134.0, 126.0, 200.0, 137.0, 135.0]
2025-05-11 19:52:16,841 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 95/100 (estimated time remaining: 15 minutes, 54 seconds)
2025-05-11 19:54:58,641 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:54:59,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 231.75883 ± 97.769
2025-05-11 19:54:59,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [301.42838, 108.535, 37.48759, 288.8479, 286.3209, 295.50958, 300.36835, 293.9992, 295.33755, 109.75402]
2025-05-11 19:54:59,824 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [139.0, 64.0, 39.0, 130.0, 129.0, 135.0, 139.0, 134.0, 136.0, 62.0]
2025-05-11 19:54:59,837 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 96/100 (estimated time remaining: 13 minutes, 20 seconds)
2025-05-11 19:57:41,166 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 19:57:42,458 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 255.43857 ± 72.431
2025-05-11 19:57:42,458 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [296.70752, 298.14026, 291.7108, 287.43478, 291.92337, 268.52963, 301.58136, 103.29228, 120.27816, 294.78748]
2025-05-11 19:57:42,459 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 137.0, 132.0, 130.0, 133.0, 124.0, 146.0, 64.0, 70.0, 134.0]
2025-05-11 19:57:42,471 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 97/100 (estimated time remaining: 10 minutes, 43 seconds)
2025-05-11 20:00:24,409 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 20:00:25,749 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 263.59079 ± 82.725
2025-05-11 20:00:25,749 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [100.66129, 295.6637, 293.9691, 295.61566, 295.13736, 289.0337, 296.56168, 114.59467, 273.22745, 381.44327]
2025-05-11 20:00:25,749 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [59.0, 132.0, 134.0, 134.0, 135.0, 129.0, 135.0, 65.0, 128.0, 196.0]
2025-05-11 20:00:25,762 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 98/100 (estimated time remaining: 8 minutes, 5 seconds)
2025-05-11 20:03:06,519 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 20:03:07,815 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 255.59322 ± 77.924
2025-05-11 20:03:07,815 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [291.56158, 73.33443, 308.736, 283.03366, 132.04019, 291.4817, 301.70984, 289.81302, 299.6492, 284.57245]
2025-05-11 20:03:07,815 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [133.0, 50.0, 154.0, 127.0, 72.0, 134.0, 140.0, 132.0, 139.0, 131.0]
2025-05-11 20:03:07,828 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 99/100 (estimated time remaining: 5 minutes, 24 seconds)
2025-05-11 20:05:49,273 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 20:05:50,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 279.89120 ± 30.549
2025-05-11 20:05:50,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [295.6647, 294.76138, 289.7959, 275.06387, 284.00876, 287.43167, 297.61118, 289.82083, 190.22281, 294.5309]
2025-05-11 20:05:50,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [136.0, 135.0, 131.0, 120.0, 128.0, 130.0, 137.0, 132.0, 103.0, 141.0]
2025-05-11 20:05:50,671 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1199 [INFO]: Iteration 100/100 (estimated time remaining: 2 minutes, 42 seconds)
2025-05-11 20:08:31,931 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1214 [DEBUG]: Evaluating for latency ExtremeClogL1U23...
2025-05-11 20:08:32,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1221 [DEBUG]: Total Reward: 192.84827 ± 104.708
2025-05-11 20:08:32,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1222 [DEBUG]: All rewards: [293.77454, 138.56573, 281.37003, 90.687454, 20.650875, 37.635326, 285.81107, 284.30908, 221.06487, 274.6137]
2025-05-11 20:08:32,945 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1223 [DEBUG]: All trajectory lengths: [135.0, 77.0, 125.0, 53.0, 24.0, 40.0, 129.0, 131.0, 117.0, 124.0]
2025-05-11 20:08:32,959 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-hopper):1251 [DEBUG]: Training session finished
