2025-05-06 00:21:30,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1006 [DEBUG]: logdir: _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32
2025-05-06 00:21:30,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1007 [DEBUG]: trainer_prefix: benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32
2025-05-06 00:21:30,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1008 [DEBUG]: args.trainer_eval_latencies: {'SparseU15': <latency_env.delayed_mdp.HiddenMarkovianDelay object at 0x76df6fdc7d00>}
2025-05-06 00:21:30,121 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1009 [DEBUG]: using device: cpu
2025-05-06 00:21:30,137 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1031 [INFO]: Creating new trainer
2025-05-06 00:21:30,147 baseline-sac-noisy-humanoid:105 [DEBUG]: pi network:
NNGaussianPolicy(
  (common_head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=920, out_features=256, bias=True)
    (2): ReLU()
    (3): Linear(in_features=256, out_features=256, bias=True)
    (4): ReLU()
  )
  (mu_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (log_std_head): Sequential(
    (0): Linear(in_features=256, out_features=17, bias=True)
    (1): Unflatten(dim=1, unflattened_size=(17,))
  )
  (tanh_refit): NNTanhRefit(
    scale: tensor([[0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000,
             0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000, 0.8000]]), shift: tensor([[-0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000, -0.4000,
             -0.4000]])
  )
)
2025-05-06 00:21:30,147 baseline-sac-noisy-humanoid:106 [DEBUG]: q network:
NNLayerConcat2(
  dim: -1
  (next): Sequential(
    (0): Linear(in_features=937, out_features=256, bias=True)
    (1): ReLU()
    (2): Linear(in_features=256, out_features=256, bias=True)
    (3): ReLU()
    (4): Linear(in_features=256, out_features=1, bias=True)
    (5): NNLayerSqueeze(dim: -1)
  )
  (init_left): Flatten(start_dim=1, end_dim=-1)
  (init_right): Flatten(start_dim=1, end_dim=-1)
)
2025-05-06 00:21:31,189 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1092 [DEBUG]: Starting training session...
2025-05-06 00:21:31,189 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 1/100
2025-05-06 00:25:26,655 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:25:27,603 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 163.98361 ± 17.866
2025-05-06 00:25:27,603 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [160.9391, 160.93646, 145.38402, 185.75835, 125.29761, 166.0207, 184.30382, 156.24554, 180.28352, 174.66692]
2025-05-06 00:25:27,603 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 31.0, 28.0, 36.0, 24.0, 32.0, 36.0, 30.0, 35.0, 34.0]
2025-05-06 00:25:27,603 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (163.98) for latency SparseU15
2025-05-06 00:25:27,603 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 00:25:27,607 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 00:25:27,615 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 2/100 (estimated time remaining: 6 hours, 30 minutes, 6 seconds)
2025-05-06 00:29:38,178 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:29:39,105 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 161.38358 ± 13.005
2025-05-06 00:29:39,106 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [152.01244, 140.54517, 174.62692, 150.94331, 170.99603, 150.91663, 160.62695, 187.09143, 160.20834, 165.86845]
2025-05-06 00:29:39,106 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 27.0, 34.0, 29.0, 33.0, 29.0, 31.0, 36.0, 31.0, 32.0]
2025-05-06 00:29:39,107 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 3/100 (estimated time remaining: 6 hours, 38 minutes, 27 seconds)
2025-05-06 00:33:48,986 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:33:49,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.76671 ± 19.901
2025-05-06 00:33:49,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.29448, 151.71062, 140.29399, 150.10016, 161.47021, 194.48251, 190.33803, 166.30357, 174.14476, 202.52867]
2025-05-06 00:33:49,968 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 29.0, 27.0, 29.0, 31.0, 38.0, 37.0, 32.0, 34.0, 39.0]
2025-05-06 00:33:49,969 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (168.77) for latency SparseU15
2025-05-06 00:33:49,969 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 00:33:49,973 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 00:33:49,982 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 4/100 (estimated time remaining: 6 hours, 38 minutes, 7 seconds)
2025-05-06 00:38:08,638 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:38:09,573 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 161.51604 ± 22.055
2025-05-06 00:38:09,573 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [146.52168, 185.1489, 181.11153, 145.53615, 195.20142, 130.27989, 179.75047, 161.17499, 129.92976, 160.5055]
2025-05-06 00:38:09,574 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [28.0, 36.0, 35.0, 28.0, 38.0, 25.0, 35.0, 31.0, 25.0, 31.0]
2025-05-06 00:38:09,575 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 5/100 (estimated time remaining: 6 hours, 39 minutes, 21 seconds)
2025-05-06 00:42:28,642 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:42:29,758 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 190.64615 ± 39.097
2025-05-06 00:42:29,758 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [293.67615, 160.67259, 196.2094, 165.96725, 181.29314, 205.09697, 146.04797, 208.27042, 177.92513, 171.30232]
2025-05-06 00:42:29,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [56.0, 31.0, 38.0, 32.0, 35.0, 40.0, 28.0, 40.0, 35.0, 33.0]
2025-05-06 00:42:29,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (190.65) for latency SparseU15
2025-05-06 00:42:29,759 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 00:42:29,763 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 00:42:29,773 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 6/100 (estimated time remaining: 6 hours, 38 minutes, 33 seconds)
2025-05-06 00:46:44,571 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:46:45,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 172.49586 ± 20.203
2025-05-06 00:46:45,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.26474, 171.06367, 196.70238, 166.60387, 189.90427, 165.84833, 171.846, 199.8493, 175.78688, 125.08921]
2025-05-06 00:46:45,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 33.0, 39.0, 32.0, 37.0, 32.0, 33.0, 39.0, 34.0, 24.0]
2025-05-06 00:46:45,583 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 7/100 (estimated time remaining: 6 hours, 40 minutes, 25 seconds)
2025-05-06 00:51:02,993 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:51:03,897 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 156.28400 ± 22.790
2025-05-06 00:51:03,898 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [157.0181, 195.88632, 150.14824, 130.67355, 192.60614, 159.87137, 130.55418, 171.01137, 134.61002, 140.46065]
2025-05-06 00:51:03,898 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 38.0, 29.0, 25.0, 38.0, 31.0, 25.0, 33.0, 26.0, 27.0]
2025-05-06 00:51:03,899 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 8/100 (estimated time remaining: 6 hours, 38 minutes, 17 seconds)
2025-05-06 00:55:22,885 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:55:23,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.24252 ± 29.056
2025-05-06 00:55:23,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [202.3658, 150.58005, 129.75432, 160.88994, 224.30272, 181.66884, 139.90845, 140.7618, 190.69574, 161.4976]
2025-05-06 00:55:23,866 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [39.0, 29.0, 25.0, 31.0, 44.0, 35.0, 27.0, 27.0, 38.0, 31.0]
2025-05-06 00:55:23,868 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 9/100 (estimated time remaining: 6 hours, 36 minutes, 47 seconds)
2025-05-06 00:59:44,813 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 00:59:45,808 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 171.11551 ± 44.065
2025-05-06 00:59:45,809 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [136.45078, 184.67326, 160.76686, 179.27437, 130.04555, 150.49208, 292.77332, 145.93767, 154.29558, 176.44563]
2025-05-06 00:59:45,809 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [26.0, 36.0, 31.0, 35.0, 25.0, 29.0, 56.0, 28.0, 30.0, 34.0]
2025-05-06 00:59:45,811 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 10/100 (estimated time remaining: 6 hours, 33 minutes, 11 seconds)
2025-05-06 01:04:08,610 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:04:09,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 166.85538 ± 15.486
2025-05-06 01:04:09,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.74207, 176.15735, 175.94551, 186.63448, 151.00363, 145.36542, 165.26979, 194.98248, 159.90204, 150.5512]
2025-05-06 01:04:09,577 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 34.0, 34.0, 36.0, 29.0, 28.0, 32.0, 37.0, 31.0, 29.0]
2025-05-06 01:04:09,579 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 11/100 (estimated time remaining: 6 hours, 29 minutes, 56 seconds)
2025-05-06 01:08:30,684 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:08:31,758 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 183.55463 ± 52.152
2025-05-06 01:08:31,758 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [192.90381, 140.76318, 191.93056, 157.07797, 207.65498, 175.3841, 135.0893, 324.40533, 149.61853, 160.71848]
2025-05-06 01:08:31,758 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [37.0, 27.0, 37.0, 30.0, 41.0, 34.0, 26.0, 61.0, 29.0, 31.0]
2025-05-06 01:08:31,761 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 12/100 (estimated time remaining: 6 hours, 27 minutes, 29 seconds)
2025-05-06 01:12:53,698 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:12:54,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 195.23203 ± 48.984
2025-05-06 01:12:54,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [320.81207, 170.32217, 200.52258, 180.90569, 175.494, 234.03934, 201.13545, 155.68843, 177.99364, 135.40685]
2025-05-06 01:12:54,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [61.0, 33.0, 39.0, 35.0, 34.0, 47.0, 39.0, 30.0, 35.0, 26.0]
2025-05-06 01:12:54,848 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (195.23) for latency SparseU15
2025-05-06 01:12:54,848 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 01:12:54,852 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 01:12:54,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 13/100 (estimated time remaining: 6 hours, 24 minutes, 32 seconds)
2025-05-06 01:17:17,203 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:17:18,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 158.05803 ± 15.091
2025-05-06 01:17:18,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [168.72418, 167.12599, 180.5796, 151.33856, 171.12175, 135.0906, 156.20929, 155.23935, 164.89043, 130.26064]
2025-05-06 01:17:18,119 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 32.0, 35.0, 29.0, 33.0, 26.0, 30.0, 30.0, 32.0, 25.0]
2025-05-06 01:17:18,122 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 14/100 (estimated time remaining: 6 hours, 21 minutes, 8 seconds)
2025-05-06 01:21:40,716 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:21:41,724 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 172.51233 ± 27.854
2025-05-06 01:21:41,725 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [146.68321, 129.8664, 221.98555, 176.88689, 165.79512, 175.36479, 180.67752, 167.28082, 144.67186, 215.9112]
2025-05-06 01:21:41,725 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [28.0, 25.0, 43.0, 34.0, 32.0, 34.0, 35.0, 32.0, 28.0, 42.0]
2025-05-06 01:21:41,727 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 15/100 (estimated time remaining: 6 hours, 17 minutes, 13 seconds)
2025-05-06 01:26:04,332 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:26:05,315 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 169.04402 ± 23.697
2025-05-06 01:26:05,315 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.0185, 161.04639, 176.24277, 141.01369, 181.2099, 198.74855, 210.65251, 129.86235, 175.02174, 149.62375]
2025-05-06 01:26:05,315 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 31.0, 34.0, 27.0, 35.0, 39.0, 41.0, 25.0, 34.0, 29.0]
2025-05-06 01:26:05,318 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 16/100 (estimated time remaining: 6 hours, 12 minutes, 47 seconds)
2025-05-06 01:30:27,616 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:30:28,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 166.36234 ± 18.089
2025-05-06 01:30:28,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [181.6667, 190.6239, 184.84386, 150.95537, 176.41554, 164.13058, 146.4331, 172.05714, 129.83255, 166.66455]
2025-05-06 01:30:28,581 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 37.0, 36.0, 29.0, 34.0, 32.0, 28.0, 33.0, 25.0, 32.0]
2025-05-06 01:30:28,584 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 17/100 (estimated time remaining: 6 hours, 8 minutes, 42 seconds)
2025-05-06 01:34:50,701 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:34:51,734 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 175.81149 ± 21.710
2025-05-06 01:34:51,734 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [170.3564, 145.68205, 195.82335, 170.92004, 183.84682, 197.9758, 216.2219, 171.48715, 159.74977, 146.05182]
2025-05-06 01:34:51,734 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 28.0, 38.0, 33.0, 36.0, 39.0, 42.0, 33.0, 31.0, 28.0]
2025-05-06 01:34:51,737 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 18/100 (estimated time remaining: 6 hours, 4 minutes, 20 seconds)
2025-05-06 01:39:13,773 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:39:14,776 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 170.94717 ± 23.834
2025-05-06 01:39:14,777 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [211.48035, 130.61394, 199.72867, 161.9331, 192.50267, 181.18379, 165.43166, 145.99883, 155.18925, 165.40945]
2025-05-06 01:39:14,777 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [41.0, 25.0, 39.0, 31.0, 37.0, 35.0, 32.0, 28.0, 30.0, 32.0]
2025-05-06 01:39:14,780 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 19/100 (estimated time remaining: 5 hours, 59 minutes, 53 seconds)
2025-05-06 01:43:36,676 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:43:37,731 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 178.19472 ± 19.852
2025-05-06 01:43:37,732 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [168.48404, 189.92384, 154.9358, 140.70503, 210.675, 185.59084, 176.82518, 166.48575, 192.70448, 195.61723]
2025-05-06 01:43:37,732 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 37.0, 30.0, 27.0, 41.0, 36.0, 34.0, 32.0, 38.0, 38.0]
2025-05-06 01:43:37,735 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 20/100 (estimated time remaining: 5 hours, 55 minutes, 19 seconds)
2025-05-06 01:47:59,742 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:48:00,896 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 194.92126 ± 47.174
2025-05-06 01:48:00,896 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [176.79222, 229.05212, 170.45435, 194.83545, 144.74013, 156.37447, 185.526, 320.92422, 183.51965, 186.994]
2025-05-06 01:48:00,896 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [34.0, 44.0, 33.0, 38.0, 28.0, 30.0, 36.0, 62.0, 36.0, 36.0]
2025-05-06 01:48:00,900 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 21/100 (estimated time remaining: 5 hours, 50 minutes, 49 seconds)
2025-05-06 01:52:24,221 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:52:25,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 177.24117 ± 19.187
2025-05-06 01:52:25,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.19316, 212.21042, 191.29597, 184.89236, 199.84421, 151.1359, 180.46709, 156.50737, 158.84502, 175.02016]
2025-05-06 01:52:25,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 42.0, 37.0, 36.0, 39.0, 29.0, 35.0, 30.0, 31.0, 34.0]
2025-05-06 01:52:25,276 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 22/100 (estimated time remaining: 5 hours, 46 minutes, 43 seconds)
2025-05-06 01:56:48,252 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 01:56:49,178 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 159.27927 ± 16.327
2025-05-06 01:56:49,179 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [157.6155, 175.49028, 161.3139, 125.36097, 176.6042, 155.63362, 156.03178, 161.92946, 140.12787, 182.6851]
2025-05-06 01:56:49,179 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 34.0, 31.0, 24.0, 34.0, 30.0, 30.0, 31.0, 27.0, 35.0]
2025-05-06 01:56:49,182 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 23/100 (estimated time remaining: 5 hours, 42 minutes, 32 seconds)
2025-05-06 02:01:13,020 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:01:13,985 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 165.09444 ± 22.840
2025-05-06 02:01:13,985 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.39876, 146.72156, 125.30782, 165.56532, 189.91655, 129.64508, 191.13808, 175.66736, 178.14397, 186.43997]
2025-05-06 02:01:13,985 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 28.0, 24.0, 32.0, 37.0, 25.0, 37.0, 34.0, 35.0, 36.0]
2025-05-06 02:01:13,989 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 24/100 (estimated time remaining: 5 hours, 38 minutes, 35 seconds)
2025-05-06 02:05:37,556 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:05:38,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 213.91928 ± 103.071
2025-05-06 02:05:38,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [171.52242, 520.98773, 187.05087, 161.60179, 204.68825, 165.40224, 181.5368, 187.41248, 187.51823, 171.47197]
2025-05-06 02:05:38,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 97.0, 36.0, 31.0, 40.0, 32.0, 35.0, 36.0, 37.0, 33.0]
2025-05-06 02:05:38,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (213.92) for latency SparseU15
2025-05-06 02:05:38,801 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 02:05:38,805 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 02:05:38,818 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 25/100 (estimated time remaining: 5 hours, 34 minutes, 40 seconds)
2025-05-06 02:10:03,409 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:10:04,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 178.47250 ± 56.383
2025-05-06 02:10:04,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [135.90913, 182.6672, 213.77391, 187.34735, 135.36559, 135.56111, 166.5559, 177.22276, 124.08334, 326.23877]
2025-05-06 02:10:04,457 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [26.0, 35.0, 42.0, 36.0, 26.0, 26.0, 32.0, 34.0, 24.0, 65.0]
2025-05-06 02:10:04,461 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 26/100 (estimated time remaining: 5 hours, 30 minutes, 53 seconds)
2025-05-06 02:14:27,778 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:14:28,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 170.58199 ± 15.891
2025-05-06 02:14:28,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [183.32536, 156.61967, 181.40671, 156.99846, 182.13782, 145.24492, 156.19995, 162.40765, 189.24011, 192.23918]
2025-05-06 02:14:28,769 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 30.0, 35.0, 30.0, 35.0, 28.0, 30.0, 31.0, 37.0, 37.0]
2025-05-06 02:14:28,773 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 27/100 (estimated time remaining: 5 hours, 26 minutes, 27 seconds)
2025-05-06 02:18:52,831 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:18:53,783 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 163.91052 ± 18.947
2025-05-06 02:18:53,783 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [192.78244, 140.49757, 150.72697, 171.83139, 191.1854, 159.96094, 171.40195, 130.47884, 169.31407, 160.92564]
2025-05-06 02:18:53,783 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [37.0, 27.0, 29.0, 33.0, 37.0, 31.0, 33.0, 25.0, 33.0, 31.0]
2025-05-06 02:18:53,788 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 28/100 (estimated time remaining: 5 hours, 22 minutes, 19 seconds)
2025-05-06 02:23:17,897 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:23:19,097 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 203.40649 ± 64.039
2025-05-06 02:23:19,097 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [150.86194, 186.35533, 151.01799, 181.69199, 199.77135, 387.29755, 186.07567, 181.28842, 212.57242, 197.13226]
2025-05-06 02:23:19,097 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 36.0, 29.0, 35.0, 39.0, 77.0, 36.0, 35.0, 42.0, 38.0]
2025-05-06 02:23:19,101 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 29/100 (estimated time remaining: 5 hours, 18 minutes, 1 second)
2025-05-06 02:27:43,259 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:27:44,494 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 208.49466 ± 54.247
2025-05-06 02:27:44,494 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [227.80853, 151.02101, 175.16927, 214.58717, 208.7011, 333.47348, 140.63327, 196.83794, 173.37071, 263.3441]
2025-05-06 02:27:44,494 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [44.0, 29.0, 34.0, 41.0, 40.0, 68.0, 27.0, 38.0, 34.0, 52.0]
2025-05-06 02:27:44,499 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 30/100 (estimated time remaining: 5 hours, 13 minutes, 44 seconds)
2025-05-06 02:32:07,999 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:32:09,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 191.24429 ± 71.947
2025-05-06 02:32:09,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.48888, 196.75352, 180.57375, 395.99408, 151.093, 135.17216, 215.4961, 160.97606, 145.24895, 164.64658]
2025-05-06 02:32:09,135 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 38.0, 35.0, 80.0, 29.0, 26.0, 42.0, 31.0, 28.0, 32.0]
2025-05-06 02:32:09,140 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 31/100 (estimated time remaining: 5 hours, 9 minutes, 5 seconds)
2025-05-06 02:36:32,410 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:36:33,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 174.49026 ± 15.065
2025-05-06 02:36:33,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [161.69592, 155.8506, 184.2831, 203.72092, 171.2029, 170.10344, 179.63177, 187.28098, 180.20854, 150.92455]
2025-05-06 02:36:33,430 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 30.0, 35.0, 40.0, 33.0, 33.0, 35.0, 36.0, 35.0, 29.0]
2025-05-06 02:36:33,434 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 32/100 (estimated time remaining: 5 hours, 4 minutes, 40 seconds)
2025-05-06 02:40:57,537 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:40:58,721 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 198.91257 ± 89.764
2025-05-06 02:40:58,721 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.44954, 160.83589, 199.78627, 145.84782, 135.69981, 207.74185, 150.43494, 182.36784, 460.05804, 179.9036]
2025-05-06 02:40:58,721 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 31.0, 39.0, 28.0, 26.0, 41.0, 29.0, 35.0, 93.0, 35.0]
2025-05-06 02:40:58,726 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 33/100 (estimated time remaining: 5 hours, 19 seconds)
2025-05-06 02:45:22,540 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:45:23,741 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 202.03574 ± 43.584
2025-05-06 02:45:23,741 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [298.6619, 225.0356, 161.17221, 181.10481, 145.71907, 216.03749, 200.38805, 246.25839, 170.88545, 175.09435]
2025-05-06 02:45:23,741 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [59.0, 44.0, 31.0, 35.0, 28.0, 43.0, 40.0, 47.0, 33.0, 34.0]
2025-05-06 02:45:23,747 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 34/100 (estimated time remaining: 4 hours, 55 minutes, 50 seconds)
2025-05-06 02:49:48,290 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:49:49,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 174.57211 ± 23.901
2025-05-06 02:49:49,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [219.84969, 176.55745, 155.51416, 140.78217, 191.25525, 164.46881, 207.73131, 157.30588, 155.33162, 176.92474]
2025-05-06 02:49:49,306 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [42.0, 34.0, 30.0, 27.0, 37.0, 32.0, 40.0, 30.0, 30.0, 34.0]
2025-05-06 02:49:49,311 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 35/100 (estimated time remaining: 4 hours, 51 minutes, 27 seconds)
2025-05-06 02:54:13,816 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:54:14,962 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 191.86215 ± 49.843
2025-05-06 02:54:14,962 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [182.20021, 314.16183, 206.64609, 130.52092, 201.86449, 170.29266, 224.27713, 181.46765, 177.6686, 129.52203]
2025-05-06 02:54:14,962 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 65.0, 40.0, 25.0, 39.0, 33.0, 44.0, 35.0, 35.0, 25.0]
2025-05-06 02:54:14,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 36/100 (estimated time remaining: 4 hours, 47 minutes, 15 seconds)
2025-05-06 02:58:38,392 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 02:58:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 191.01146 ± 78.382
2025-05-06 02:58:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [141.00436, 160.97244, 153.82391, 165.58563, 162.02823, 150.33073, 417.17386, 223.21974, 179.44025, 156.5355]
2025-05-06 02:58:39,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [27.0, 31.0, 30.0, 32.0, 31.0, 29.0, 88.0, 43.0, 35.0, 30.0]
2025-05-06 02:58:39,547 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 37/100 (estimated time remaining: 4 hours, 42 minutes, 54 seconds)
2025-05-06 03:03:03,454 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:03:04,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 166.83237 ± 16.736
2025-05-06 03:03:04,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.1314, 135.02245, 175.77766, 156.38547, 165.51334, 155.38919, 166.65775, 202.12315, 163.90916, 181.41396]
2025-05-06 03:03:04,429 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 26.0, 34.0, 30.0, 32.0, 30.0, 32.0, 39.0, 32.0, 35.0]
2025-05-06 03:03:04,434 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 38/100 (estimated time remaining: 4 hours, 38 minutes, 23 seconds)
2025-05-06 03:07:28,113 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:07:29,066 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 163.57895 ± 21.484
2025-05-06 03:07:29,067 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [145.86646, 180.37395, 155.4062, 140.79797, 159.84071, 160.53926, 172.74734, 155.9859, 218.04681, 146.18481]
2025-05-06 03:07:29,067 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [28.0, 35.0, 30.0, 27.0, 31.0, 31.0, 33.0, 30.0, 43.0, 28.0]
2025-05-06 03:07:29,072 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 39/100 (estimated time remaining: 4 hours, 33 minutes, 54 seconds)
2025-05-06 03:11:52,876 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:11:53,953 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 182.29128 ± 62.067
2025-05-06 03:11:53,953 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [182.21114, 180.7684, 161.82227, 160.9653, 180.01785, 139.94485, 151.1305, 156.67397, 363.63498, 145.74356]
2025-05-06 03:11:53,953 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 35.0, 31.0, 31.0, 35.0, 27.0, 29.0, 30.0, 73.0, 28.0]
2025-05-06 03:11:53,959 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 40/100 (estimated time remaining: 4 hours, 29 minutes, 20 seconds)
2025-05-06 03:16:17,863 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:16:18,869 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 172.56076 ± 32.667
2025-05-06 03:16:18,870 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [196.8383, 211.8161, 150.75224, 151.67334, 125.17489, 140.41183, 211.9188, 160.9538, 154.96336, 221.10481]
2025-05-06 03:16:18,870 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [38.0, 41.0, 29.0, 29.0, 24.0, 27.0, 41.0, 31.0, 30.0, 43.0]
2025-05-06 03:16:18,875 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 41/100 (estimated time remaining: 4 hours, 24 minutes, 46 seconds)
2025-05-06 03:20:42,630 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:20:43,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 159.14120 ± 27.373
2025-05-06 03:20:43,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [141.66106, 145.41393, 179.66444, 125.54604, 222.37297, 155.721, 185.00989, 135.87375, 144.41396, 155.73505]
2025-05-06 03:20:43,553 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [27.0, 28.0, 35.0, 24.0, 43.0, 30.0, 36.0, 26.0, 28.0, 30.0]
2025-05-06 03:20:43,559 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 42/100 (estimated time remaining: 4 hours, 20 minutes, 23 seconds)
2025-05-06 03:25:06,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:25:07,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 172.34656 ± 14.159
2025-05-06 03:25:07,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [181.67126, 166.80548, 155.05597, 181.2272, 181.649, 179.5019, 160.74792, 179.98753, 144.45573, 192.36371]
2025-05-06 03:25:07,803 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 32.0, 30.0, 35.0, 35.0, 35.0, 31.0, 35.0, 28.0, 37.0]
2025-05-06 03:25:07,809 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 43/100 (estimated time remaining: 4 hours, 15 minutes, 51 seconds)
2025-05-06 03:29:32,491 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:29:33,483 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 170.94479 ± 18.361
2025-05-06 03:29:33,483 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [191.78416, 160.75998, 189.23508, 135.66397, 161.64651, 175.62027, 186.1856, 146.04118, 174.90526, 187.60587]
2025-05-06 03:29:33,483 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [37.0, 31.0, 37.0, 26.0, 31.0, 34.0, 36.0, 28.0, 34.0, 36.0]
2025-05-06 03:29:33,489 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 44/100 (estimated time remaining: 4 hours, 11 minutes, 38 seconds)
2025-05-06 03:33:57,645 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:33:58,660 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 174.23221 ± 28.990
2025-05-06 03:33:58,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [146.9461, 156.08446, 151.04305, 197.34088, 181.68336, 189.67699, 231.94492, 196.6681, 130.04547, 160.8888]
2025-05-06 03:33:58,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [28.0, 30.0, 29.0, 38.0, 35.0, 37.0, 45.0, 38.0, 25.0, 31.0]
2025-05-06 03:33:58,667 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 45/100 (estimated time remaining: 4 hours, 7 minutes, 16 seconds)
2025-05-06 03:38:23,257 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:38:24,231 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 166.77399 ± 17.016
2025-05-06 03:38:24,231 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.64565, 184.04663, 195.66457, 141.47197, 183.95644, 155.15707, 156.17456, 176.39278, 145.09137, 167.13878]
2025-05-06 03:38:24,231 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 36.0, 38.0, 27.0, 36.0, 30.0, 30.0, 34.0, 28.0, 32.0]
2025-05-06 03:38:24,237 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 46/100 (estimated time remaining: 4 hours, 2 minutes, 58 seconds)
2025-05-06 03:42:48,029 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:42:49,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.05496 ± 29.730
2025-05-06 03:42:49,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [190.58743, 171.41719, 150.33795, 155.49847, 135.8727, 194.99611, 135.67764, 180.9281, 229.77847, 135.4556]
2025-05-06 03:42:49,010 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [37.0, 33.0, 29.0, 30.0, 26.0, 38.0, 26.0, 35.0, 45.0, 26.0]
2025-05-06 03:42:49,016 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 47/100 (estimated time remaining: 3 hours, 58 minutes, 34 seconds)
2025-05-06 03:47:12,790 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:47:13,723 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 161.18253 ± 18.269
2025-05-06 03:47:13,723 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [161.43274, 193.82698, 165.6974, 140.32515, 146.08107, 155.83994, 180.46292, 162.5128, 129.94847, 175.69783]
2025-05-06 03:47:13,724 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 38.0, 32.0, 27.0, 28.0, 30.0, 35.0, 31.0, 25.0, 34.0]
2025-05-06 03:47:13,730 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 48/100 (estimated time remaining: 3 hours, 54 minutes, 14 seconds)
2025-05-06 03:51:37,834 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:51:38,854 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 173.98729 ± 17.518
2025-05-06 03:51:38,854 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [203.51111, 200.1236, 161.65527, 177.30331, 190.37032, 155.31032, 160.71631, 160.82404, 154.45068, 175.60799]
2025-05-06 03:51:38,854 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [39.0, 39.0, 31.0, 34.0, 37.0, 30.0, 31.0, 31.0, 30.0, 34.0]
2025-05-06 03:51:38,861 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 49/100 (estimated time remaining: 3 hours, 49 minutes, 43 seconds)
2025-05-06 03:56:03,448 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 03:56:04,679 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 208.48666 ± 108.961
2025-05-06 03:56:04,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [188.35255, 150.82706, 155.97627, 166.91711, 165.51877, 175.4691, 531.77185, 208.89369, 180.18213, 160.95813]
2025-05-06 03:56:04,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [36.0, 29.0, 30.0, 32.0, 32.0, 34.0, 103.0, 40.0, 35.0, 31.0]
2025-05-06 03:56:04,686 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 50/100 (estimated time remaining: 3 hours, 45 minutes, 25 seconds)
2025-05-06 04:00:27,988 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:00:29,378 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 230.45016 ± 99.219
2025-05-06 04:00:29,378 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [145.947, 261.63748, 395.72, 186.0327, 160.42172, 440.16217, 171.73438, 191.9606, 199.72826, 151.15741]
2025-05-06 04:00:29,378 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [28.0, 51.0, 82.0, 36.0, 31.0, 88.0, 33.0, 37.0, 39.0, 29.0]
2025-05-06 04:00:29,378 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1124 [INFO]: New best (230.45) for latency SparseU15
2025-05-06 04:00:29,379 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1127 [INFO]: saving network
2025-05-06 04:00:29,383 latency_env.training.utils:544 [DEBUG]: Saving evalcopy of SAC to _logs/benchmark-v3-tc3/noisy-humanoid/SparseU15-sac-aug-mem32/checkpoints/best_SparseU15.pkl
2025-05-06 04:00:29,398 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 51/100 (estimated time remaining: 3 hours, 40 minutes, 51 seconds)
2025-05-06 04:04:54,004 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:04:54,929 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 160.04442 ± 21.629
2025-05-06 04:04:54,930 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.61176, 200.42477, 130.26556, 141.15518, 130.22066, 160.42381, 183.4591, 165.90717, 154.82413, 177.15211]
2025-05-06 04:04:54,930 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 39.0, 25.0, 27.0, 25.0, 31.0, 35.0, 32.0, 30.0, 34.0]
2025-05-06 04:04:54,937 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 52/100 (estimated time remaining: 3 hours, 36 minutes, 34 seconds)
2025-05-06 04:09:17,568 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:09:18,846 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 214.75900 ± 118.373
2025-05-06 04:09:18,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [151.89891, 461.31052, 156.52402, 125.4345, 434.6581, 195.25494, 150.72643, 140.41762, 145.28296, 186.08192]
2025-05-06 04:09:18,847 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 89.0, 30.0, 24.0, 89.0, 38.0, 29.0, 27.0, 28.0, 36.0]
2025-05-06 04:09:18,854 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 53/100 (estimated time remaining: 3 hours, 32 minutes, 1 second)
2025-05-06 04:13:42,961 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:13:43,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 174.38193 ± 23.930
2025-05-06 04:13:43,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [206.71758, 180.09369, 151.01215, 135.219, 160.50493, 195.08803, 161.19733, 162.03334, 177.42952, 214.5238]
2025-05-06 04:13:43,981 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [40.0, 35.0, 29.0, 26.0, 31.0, 38.0, 31.0, 31.0, 35.0, 42.0]
2025-05-06 04:13:43,988 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 54/100 (estimated time remaining: 3 hours, 27 minutes, 36 seconds)
2025-05-06 04:18:12,632 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:18:13,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 169.97629 ± 19.713
2025-05-06 04:18:13,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.26743, 213.27686, 180.7918, 145.25089, 162.48907, 150.02484, 181.99051, 146.02069, 178.00511, 175.64577]
2025-05-06 04:18:13,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 42.0, 35.0, 28.0, 31.0, 29.0, 35.0, 28.0, 35.0, 34.0]
2025-05-06 04:18:13,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 55/100 (estimated time remaining: 3 hours, 23 minutes, 46 seconds)
2025-05-06 04:22:36,057 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:22:37,093 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 176.18463 ± 33.711
2025-05-06 04:22:37,093 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.93883, 161.75092, 145.41135, 176.6746, 130.24681, 166.90205, 151.16626, 217.27168, 230.14977, 225.334]
2025-05-06 04:22:37,093 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 31.0, 28.0, 34.0, 25.0, 33.0, 29.0, 42.0, 45.0, 44.0]
2025-05-06 04:22:37,101 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 56/100 (estimated time remaining: 3 hours, 19 minutes, 9 seconds)
2025-05-06 04:26:57,413 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:26:58,415 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 172.40927 ± 24.932
2025-05-06 04:26:58,415 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.25977, 146.00232, 156.57065, 202.76562, 218.44685, 160.75151, 171.87254, 145.68542, 204.04337, 161.69473]
2025-05-06 04:26:58,415 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 28.0, 30.0, 39.0, 42.0, 31.0, 33.0, 28.0, 40.0, 31.0]
2025-05-06 04:26:58,423 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 57/100 (estimated time remaining: 3 hours, 14 minutes, 6 seconds)
2025-05-06 04:31:18,785 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:31:19,708 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 159.31543 ± 15.914
2025-05-06 04:31:19,708 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [136.16103, 151.54118, 161.58679, 146.53897, 187.68257, 175.3963, 141.34251, 172.774, 170.03992, 150.09096]
2025-05-06 04:31:19,708 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [26.0, 29.0, 31.0, 28.0, 36.0, 34.0, 27.0, 33.0, 33.0, 29.0]
2025-05-06 04:31:19,716 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 58/100 (estimated time remaining: 3 hours, 9 minutes, 19 seconds)
2025-05-06 04:35:40,620 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:35:41,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 177.45792 ± 23.151
2025-05-06 04:35:41,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [171.4186, 175.97574, 167.0009, 196.33269, 144.55537, 218.79884, 172.53987, 140.90654, 183.84822, 203.20233]
2025-05-06 04:35:41,657 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 34.0, 32.0, 38.0, 28.0, 43.0, 33.0, 27.0, 36.0, 39.0]
2025-05-06 04:35:41,665 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 59/100 (estimated time remaining: 3 hours, 4 minutes, 28 seconds)
2025-05-06 04:40:01,808 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:40:02,982 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 199.29993 ± 88.809
2025-05-06 04:40:02,982 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [162.25783, 459.10202, 210.38034, 190.69911, 166.15915, 181.18887, 171.57713, 150.75743, 134.98273, 165.89458]
2025-05-06 04:40:02,982 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [31.0, 91.0, 41.0, 37.0, 32.0, 35.0, 33.0, 29.0, 26.0, 32.0]
2025-05-06 04:40:02,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 60/100 (estimated time remaining: 2 hours, 58 minutes, 56 seconds)
2025-05-06 04:44:24,382 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:44:25,484 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 188.91866 ± 76.471
2025-05-06 04:44:25,485 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [411.63937, 160.82802, 156.06812, 171.4469, 156.2655, 164.81256, 206.75438, 175.57475, 155.56477, 130.23209]
2025-05-06 04:44:25,485 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [81.0, 31.0, 30.0, 33.0, 30.0, 32.0, 40.0, 34.0, 30.0, 25.0]
2025-05-06 04:44:25,493 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 61/100 (estimated time remaining: 2 hours, 54 minutes, 27 seconds)
2025-05-06 04:48:46,498 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:48:47,706 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 205.32852 ± 64.738
2025-05-06 04:48:47,706 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.68039, 220.25996, 145.90689, 224.88684, 206.2216, 154.67476, 190.04192, 176.42328, 182.42783, 384.7617]
2025-05-06 04:48:47,706 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 43.0, 28.0, 44.0, 40.0, 30.0, 37.0, 34.0, 36.0, 75.0]
2025-05-06 04:48:47,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 62/100 (estimated time remaining: 2 hours, 50 minutes, 12 seconds)
2025-05-06 04:53:15,412 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:53:16,542 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 193.14384 ± 88.717
2025-05-06 04:53:16,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.66687, 135.6358, 186.33034, 180.59026, 160.98766, 204.74011, 451.43115, 175.15794, 139.18225, 140.71611]
2025-05-06 04:53:16,543 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 26.0, 36.0, 35.0, 31.0, 40.0, 88.0, 34.0, 27.0, 27.0]
2025-05-06 04:53:16,551 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 63/100 (estimated time remaining: 2 hours, 46 minutes, 47 seconds)
2025-05-06 04:57:36,650 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 04:57:37,610 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 165.89407 ± 24.294
2025-05-06 04:57:37,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [151.17542, 180.52635, 177.01695, 140.19943, 150.53935, 145.05298, 175.67868, 136.13431, 185.06831, 217.54895]
2025-05-06 04:57:37,611 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 35.0, 34.0, 27.0, 29.0, 28.0, 34.0, 26.0, 36.0, 42.0]
2025-05-06 04:57:37,619 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 64/100 (estimated time remaining: 2 hours, 42 minutes, 18 seconds)
2025-05-06 05:01:58,979 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:01:59,958 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 170.48512 ± 22.992
2025-05-06 05:01:59,958 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.98705, 181.02821, 191.47424, 166.94525, 176.6692, 169.83379, 130.89082, 219.75847, 149.96329, 161.30093]
2025-05-06 05:01:59,958 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 35.0, 37.0, 32.0, 34.0, 33.0, 25.0, 42.0, 29.0, 31.0]
2025-05-06 05:01:59,967 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 65/100 (estimated time remaining: 2 hours, 38 minutes, 2 seconds)
2025-05-06 05:06:20,249 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:06:21,412 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 198.56253 ± 68.863
2025-05-06 05:06:21,412 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [157.25685, 160.67651, 150.5874, 234.01439, 221.89774, 140.19919, 195.95639, 383.9747, 150.34413, 190.718]
2025-05-06 05:06:21,412 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 31.0, 29.0, 45.0, 43.0, 27.0, 38.0, 72.0, 29.0, 37.0]
2025-05-06 05:06:21,421 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 66/100 (estimated time remaining: 2 hours, 33 minutes, 31 seconds)
2025-05-06 05:10:42,953 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:10:44,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 216.96208 ± 102.251
2025-05-06 05:10:44,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.24922, 176.27335, 171.31046, 420.29126, 160.8571, 170.76959, 416.51404, 160.8419, 201.1949, 125.31893]
2025-05-06 05:10:44,232 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 34.0, 33.0, 84.0, 31.0, 33.0, 78.0, 31.0, 39.0, 24.0]
2025-05-06 05:10:44,241 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 67/100 (estimated time remaining: 2 hours, 29 minutes, 12 seconds)
2025-05-06 05:15:04,690 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:15:05,720 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 177.16904 ± 20.301
2025-05-06 05:15:05,720 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [206.72816, 200.77316, 151.65955, 150.9417, 206.11191, 166.94481, 171.48895, 161.05424, 170.36479, 185.62315]
2025-05-06 05:15:05,720 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [40.0, 39.0, 29.0, 29.0, 40.0, 32.0, 33.0, 31.0, 33.0, 36.0]
2025-05-06 05:15:05,729 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 68/100 (estimated time remaining: 2 hours, 24 minutes)
2025-05-06 05:19:26,384 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:19:27,630 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 211.94580 ± 86.490
2025-05-06 05:19:27,631 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [172.02675, 150.68718, 136.38794, 171.23308, 196.03712, 166.00008, 319.526, 236.709, 149.88431, 420.96667]
2025-05-06 05:19:27,631 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 29.0, 26.0, 33.0, 38.0, 32.0, 64.0, 46.0, 29.0, 82.0]
2025-05-06 05:19:27,640 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 69/100 (estimated time remaining: 2 hours, 19 minutes, 44 seconds)
2025-05-06 05:23:47,552 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:23:48,582 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 176.45346 ± 39.281
2025-05-06 05:23:48,582 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [195.69864, 151.44243, 160.21222, 135.17429, 202.65411, 275.89847, 151.15562, 151.3259, 153.93907, 187.03384]
2025-05-06 05:23:48,582 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [38.0, 29.0, 31.0, 26.0, 39.0, 54.0, 29.0, 29.0, 30.0, 36.0]
2025-05-06 05:23:48,592 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 70/100 (estimated time remaining: 2 hours, 15 minutes, 13 seconds)
2025-05-06 05:28:10,990 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:28:12,083 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 187.35374 ± 68.391
2025-05-06 05:28:12,084 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [125.36151, 178.91344, 180.11221, 181.15475, 386.2362, 170.39253, 175.6839, 160.69724, 169.65414, 145.33156]
2025-05-06 05:28:12,084 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [24.0, 35.0, 35.0, 35.0, 73.0, 33.0, 34.0, 31.0, 33.0, 28.0]
2025-05-06 05:28:12,093 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 71/100 (estimated time remaining: 2 hours, 11 minutes, 4 seconds)
2025-05-06 05:32:31,087 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:32:32,091 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 171.55191 ± 28.916
2025-05-06 05:32:32,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [212.38219, 151.85048, 196.42857, 135.62253, 161.25673, 169.62149, 215.14433, 192.11047, 145.11989, 135.98248]
2025-05-06 05:32:32,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [41.0, 29.0, 38.0, 26.0, 31.0, 33.0, 42.0, 37.0, 28.0, 26.0]
2025-05-06 05:32:32,101 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 72/100 (estimated time remaining: 2 hours, 6 minutes, 25 seconds)
2025-05-06 05:36:53,705 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:36:54,714 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 173.44409 ± 29.793
2025-05-06 05:36:54,715 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.75526, 188.99823, 136.14392, 251.02306, 181.42218, 179.54028, 162.32835, 156.00809, 161.12627, 151.0953]
2025-05-06 05:36:54,715 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 37.0, 26.0, 49.0, 35.0, 35.0, 31.0, 30.0, 31.0, 29.0]
2025-05-06 05:36:54,724 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 73/100 (estimated time remaining: 2 hours, 2 minutes, 10 seconds)
2025-05-06 05:41:14,756 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:41:15,903 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 195.86859 ± 88.002
2025-05-06 05:41:15,904 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.85548, 195.97934, 172.04428, 452.72748, 161.21558, 159.71472, 156.33626, 182.38174, 119.45565, 190.97539]
2025-05-06 05:41:15,904 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 38.0, 33.0, 90.0, 31.0, 31.0, 30.0, 35.0, 23.0, 37.0]
2025-05-06 05:41:15,913 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 74/100 (estimated time remaining: 1 hour, 57 minutes, 44 seconds)
2025-05-06 05:45:36,063 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:45:37,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 160.43076 ± 25.144
2025-05-06 05:45:37,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [151.42665, 151.28366, 140.36642, 156.61296, 201.95323, 169.9809, 210.86122, 130.20204, 150.30841, 141.31197]
2025-05-06 05:45:37,012 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 29.0, 27.0, 30.0, 39.0, 33.0, 41.0, 25.0, 29.0, 27.0]
2025-05-06 05:45:37,021 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 75/100 (estimated time remaining: 1 hour, 53 minutes, 23 seconds)
2025-05-06 05:49:58,376 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:49:59,329 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 165.55910 ± 22.300
2025-05-06 05:49:59,329 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [151.30147, 180.7135, 175.60544, 171.0059, 145.66243, 208.69092, 171.4522, 167.8296, 163.70294, 119.62661]
2025-05-06 05:49:59,329 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 35.0, 34.0, 33.0, 28.0, 41.0, 33.0, 32.0, 32.0, 23.0]
2025-05-06 05:49:59,339 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 76/100 (estimated time remaining: 1 hour, 48 minutes, 56 seconds)
2025-05-06 05:54:19,383 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:54:20,407 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 175.13760 ± 24.948
2025-05-06 05:54:20,407 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [173.33794, 191.23166, 189.08037, 181.2364, 155.39009, 140.07172, 146.477, 231.08092, 163.47504, 179.99498]
2025-05-06 05:54:20,407 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 37.0, 37.0, 35.0, 30.0, 27.0, 28.0, 45.0, 32.0, 35.0]
2025-05-06 05:54:20,417 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 77/100 (estimated time remaining: 1 hour, 44 minutes, 39 seconds)
2025-05-06 05:58:41,698 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 05:58:42,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 182.86440 ± 17.906
2025-05-06 05:58:42,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.8358, 187.12381, 150.78668, 196.62993, 182.47772, 217.65022, 195.78403, 166.60245, 174.79378, 188.95944]
2025-05-06 05:58:42,779 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 36.0, 29.0, 38.0, 35.0, 43.0, 38.0, 32.0, 34.0, 37.0]
2025-05-06 05:58:42,789 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 78/100 (estimated time remaining: 1 hour, 40 minutes, 17 seconds)
2025-05-06 06:03:03,466 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:03:04,476 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 173.38371 ± 20.340
2025-05-06 06:03:04,476 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [185.3451, 181.13892, 186.35873, 160.18068, 120.03826, 160.67076, 189.60062, 186.13037, 183.51143, 180.86232]
2025-05-06 06:03:04,476 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [36.0, 35.0, 36.0, 31.0, 23.0, 31.0, 37.0, 36.0, 36.0, 35.0]
2025-05-06 06:03:04,487 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 79/100 (estimated time remaining: 1 hour, 35 minutes, 57 seconds)
2025-05-06 06:07:25,287 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:07:26,259 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.42381 ± 27.358
2025-05-06 06:07:26,260 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [198.72151, 166.66862, 228.7723, 146.1256, 150.43651, 145.8401, 135.6767, 186.5545, 154.8229, 170.61943]
2025-05-06 06:07:26,260 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [38.0, 32.0, 44.0, 28.0, 29.0, 28.0, 26.0, 36.0, 30.0, 33.0]
2025-05-06 06:07:26,270 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 80/100 (estimated time remaining: 1 hour, 31 minutes, 38 seconds)
2025-05-06 06:11:47,821 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:11:49,082 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 208.42937 ± 105.394
2025-05-06 06:11:49,082 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [155.71799, 175.1129, 196.08571, 178.35852, 167.32819, 155.9224, 196.67929, 155.66206, 521.57153, 181.85521]
2025-05-06 06:11:49,082 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 34.0, 38.0, 34.0, 32.0, 30.0, 38.0, 30.0, 111.0, 35.0]
2025-05-06 06:11:49,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 81/100 (estimated time remaining: 1 hour, 27 minutes, 19 seconds)
2025-05-06 06:16:10,674 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:16:11,650 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 167.08772 ± 32.522
2025-05-06 06:16:11,650 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [203.52286, 140.82707, 146.22786, 156.2857, 141.03973, 154.86894, 135.35378, 203.08318, 153.98141, 235.6867]
2025-05-06 06:16:11,650 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [39.0, 27.0, 28.0, 30.0, 27.0, 30.0, 26.0, 39.0, 30.0, 46.0]
2025-05-06 06:16:11,661 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 82/100 (estimated time remaining: 1 hour, 23 minutes, 2 seconds)
2025-05-06 06:20:41,161 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:20:42,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 160.72417 ± 40.997
2025-05-06 06:20:42,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [156.75183, 266.98932, 171.1169, 141.12912, 185.09615, 119.46946, 170.65189, 141.02196, 124.4872, 130.52795]
2025-05-06 06:20:42,092 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [30.0, 51.0, 33.0, 27.0, 36.0, 23.0, 33.0, 27.0, 24.0, 25.0]
2025-05-06 06:20:42,103 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 83/100 (estimated time remaining: 1 hour, 19 minutes, 9 seconds)
2025-05-06 06:25:43,946 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:25:45,129 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 198.56403 ± 87.349
2025-05-06 06:25:45,129 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [136.03494, 184.29047, 181.00339, 190.89333, 160.51862, 155.65508, 455.23813, 150.43921, 184.31053, 187.25642]
2025-05-06 06:25:45,129 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [26.0, 36.0, 35.0, 37.0, 31.0, 30.0, 91.0, 29.0, 36.0, 36.0]
2025-05-06 06:25:45,157 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 84/100 (estimated time remaining: 1 hour, 17 minutes, 6 seconds)
2025-05-06 06:30:07,214 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:30:08,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 180.09871 ± 29.751
2025-05-06 06:30:08,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [176.44649, 186.25461, 136.1652, 193.83707, 140.35204, 179.70193, 161.51254, 191.36961, 187.63457, 247.7131]
2025-05-06 06:30:08,272 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [34.0, 36.0, 26.0, 38.0, 27.0, 35.0, 31.0, 37.0, 37.0, 48.0]
2025-05-06 06:30:08,283 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 85/100 (estimated time remaining: 1 hour, 12 minutes, 38 seconds)
2025-05-06 06:34:29,925 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:34:31,016 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 185.13506 ± 24.976
2025-05-06 06:34:31,016 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [181.0082, 140.92422, 202.18964, 166.58263, 206.3853, 203.87291, 156.20027, 184.77196, 229.45618, 179.95917]
2025-05-06 06:34:31,016 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 27.0, 39.0, 32.0, 40.0, 40.0, 30.0, 36.0, 45.0, 35.0]
2025-05-06 06:34:31,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 86/100 (estimated time remaining: 1 hour, 8 minutes, 5 seconds)
2025-05-06 06:38:54,446 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:38:55,456 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 173.18991 ± 18.577
2025-05-06 06:38:55,456 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [181.08109, 156.39243, 197.31795, 176.4086, 150.70175, 195.14778, 177.58784, 140.19937, 165.08728, 191.97481]
2025-05-06 06:38:55,456 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 30.0, 38.0, 34.0, 29.0, 38.0, 34.0, 27.0, 32.0, 37.0]
2025-05-06 06:38:55,467 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 87/100 (estimated time remaining: 1 hour, 3 minutes, 38 seconds)
2025-05-06 06:43:17,418 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:43:18,399 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.56673 ± 30.998
2025-05-06 06:43:18,400 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [141.29951, 151.8567, 175.25784, 180.96828, 176.84703, 145.3243, 145.6589, 156.25671, 159.81303, 252.38503]
2025-05-06 06:43:18,400 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [27.0, 29.0, 34.0, 35.0, 34.0, 28.0, 28.0, 30.0, 31.0, 49.0]
2025-05-06 06:43:18,411 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 88/100 (estimated time remaining: 58 minutes, 46 seconds)
2025-05-06 06:47:40,641 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:47:41,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 168.62866 ± 31.326
2025-05-06 06:47:41,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [182.33795, 146.60988, 160.35469, 189.34949, 249.13155, 160.19768, 136.0236, 157.30983, 138.73047, 166.24142]
2025-05-06 06:47:41,627 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [35.0, 28.0, 31.0, 37.0, 49.0, 31.0, 26.0, 30.0, 27.0, 32.0]
2025-05-06 06:47:41,638 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 89/100 (estimated time remaining: 52 minutes, 39 seconds)
2025-05-06 06:52:03,447 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:52:04,404 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 165.87720 ± 15.162
2025-05-06 06:52:04,404 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [166.84573, 166.37158, 145.25726, 171.43451, 171.49956, 165.6588, 161.90897, 203.97176, 154.76704, 151.05678]
2025-05-06 06:52:04,404 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 32.0, 28.0, 33.0, 33.0, 32.0, 31.0, 39.0, 30.0, 29.0]
2025-05-06 06:52:04,415 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 90/100 (estimated time remaining: 48 minutes, 15 seconds)
2025-05-06 06:56:26,946 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 06:56:28,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 184.23080 ± 12.737
2025-05-06 06:56:28,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [173.5904, 186.57574, 166.60501, 185.1662, 205.59995, 180.41237, 203.7112, 186.6269, 187.76266, 166.2575]
2025-05-06 06:56:28,027 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 36.0, 32.0, 36.0, 41.0, 35.0, 40.0, 36.0, 37.0, 32.0]
2025-05-06 06:56:28,039 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 91/100 (estimated time remaining: 43 minutes, 54 seconds)
2025-05-06 07:00:49,795 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:00:50,923 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 191.96103 ± 73.480
2025-05-06 07:00:50,924 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [171.57948, 171.07025, 401.79898, 218.93854, 190.87851, 166.41084, 160.87294, 140.92976, 134.58984, 162.54124]
2025-05-06 07:00:50,924 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [33.0, 33.0, 79.0, 43.0, 37.0, 32.0, 31.0, 27.0, 26.0, 31.0]
2025-05-06 07:00:50,942 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 92/100 (estimated time remaining: 39 minutes, 27 seconds)
2025-05-06 07:05:12,448 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:05:13,526 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 184.66156 ± 81.041
2025-05-06 07:05:13,526 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.83246, 166.06885, 165.58327, 144.62012, 135.6238, 155.42397, 161.16083, 424.91483, 179.75446, 145.6332]
2025-05-06 07:05:13,526 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 32.0, 32.0, 28.0, 26.0, 30.0, 31.0, 80.0, 35.0, 28.0]
2025-05-06 07:05:13,538 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 93/100 (estimated time remaining: 35 minutes, 4 seconds)
2025-05-06 07:09:36,304 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:09:37,384 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 181.19353 ± 86.410
2025-05-06 07:09:37,384 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [167.05629, 161.44455, 165.38307, 150.68501, 140.66545, 119.56228, 135.8652, 146.33936, 190.56639, 434.36768]
2025-05-06 07:09:37,384 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [32.0, 31.0, 32.0, 29.0, 27.0, 23.0, 26.0, 28.0, 37.0, 85.0]
2025-05-06 07:09:37,396 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 94/100 (estimated time remaining: 30 minutes, 42 seconds)
2025-05-06 07:13:59,831 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:14:01,113 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 216.27397 ± 93.886
2025-05-06 07:14:01,114 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [177.62642, 194.9727, 199.99075, 181.79329, 209.51468, 276.69675, 160.91107, 162.30843, 124.49288, 474.4328]
2025-05-06 07:14:01,114 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [34.0, 38.0, 39.0, 35.0, 41.0, 54.0, 31.0, 31.0, 24.0, 89.0]
2025-05-06 07:14:01,126 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 95/100 (estimated time remaining: 26 minutes, 20 seconds)
2025-05-06 07:18:23,375 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:18:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 155.70706 ± 19.111
2025-05-06 07:18:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [137.0668, 170.28561, 144.51888, 156.22708, 180.41101, 119.422905, 161.21419, 182.38934, 164.61237, 140.92253]
2025-05-06 07:18:24,278 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [26.0, 33.0, 28.0, 30.0, 35.0, 23.0, 31.0, 35.0, 32.0, 27.0]
2025-05-06 07:18:24,290 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 96/100 (estimated time remaining: 21 minutes, 56 seconds)
2025-05-06 07:22:48,190 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:22:49,225 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 176.46283 ± 75.945
2025-05-06 07:22:49,225 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [152.05263, 180.72412, 146.2191, 171.02675, 124.72385, 175.2542, 125.150536, 151.04219, 397.5805, 140.85442]
2025-05-06 07:22:49,226 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [29.0, 35.0, 28.0, 33.0, 24.0, 34.0, 24.0, 29.0, 78.0, 27.0]
2025-05-06 07:22:49,238 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 97/100 (estimated time remaining: 17 minutes, 34 seconds)
2025-05-06 07:27:11,904 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:27:13,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 195.66568 ± 87.106
2025-05-06 07:27:13,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [192.53244, 135.68117, 449.67456, 180.52972, 185.81499, 134.59665, 186.38237, 145.60924, 178.57455, 167.2611]
2025-05-06 07:27:13,062 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [37.0, 26.0, 86.0, 35.0, 36.0, 26.0, 36.0, 28.0, 35.0, 32.0]
2025-05-06 07:27:13,074 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 98/100 (estimated time remaining: 13 minutes, 11 seconds)
2025-05-06 07:31:28,229 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:31:29,349 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 191.10187 ± 31.186
2025-05-06 07:31:29,349 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [274.3054, 161.75166, 180.15294, 184.96964, 184.79803, 198.93121, 201.00896, 151.88019, 188.39291, 184.82767]
2025-05-06 07:31:29,349 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [54.0, 31.0, 35.0, 36.0, 36.0, 39.0, 39.0, 29.0, 37.0, 36.0]
2025-05-06 07:31:29,362 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 99/100 (estimated time remaining: 8 minutes, 44 seconds)
2025-05-06 07:35:44,493 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:35:45,668 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 201.90605 ± 105.923
2025-05-06 07:35:45,668 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [187.23416, 165.63762, 160.98454, 140.32527, 189.79243, 154.59004, 191.56784, 513.7058, 129.64357, 185.57935]
2025-05-06 07:35:45,668 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [36.0, 32.0, 31.0, 27.0, 37.0, 30.0, 37.0, 96.0, 25.0, 36.0]
2025-05-06 07:35:45,680 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1097 [INFO]: Iteration 100/100 (estimated time remaining: 4 minutes, 20 seconds)
2025-05-06 07:40:00,022 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1112 [DEBUG]: Evaluating for latency SparseU15...
2025-05-06 07:40:01,022 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1119 [DEBUG]: Total Reward: 171.92017 ± 26.059
2025-05-06 07:40:01,022 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1120 [DEBUG]: All rewards: [186.8308, 242.82408, 151.11386, 151.05939, 169.57545, 170.03215, 160.33151, 166.4663, 149.57056, 171.39754]
2025-05-06 07:40:01,022 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1121 [DEBUG]: All trajectory lengths: [36.0, 47.0, 29.0, 29.0, 33.0, 33.0, 31.0, 32.0, 29.0, 33.0]
2025-05-06 07:40:01,035 latency_env.delayed_mdp:training_loop(baseline-sac-noisy-humanoid):1149 [DEBUG]: Training session finished
