Logging to experiments/gym_fwalker2d/Wo01/Mon-07-Nov-2022-10-30-38-AM-CST_gym_fwalker2d_trpo_iteration_20_seed3214
Print configuration .....
{'env_name': 'gym_fwalker2d', 'random_seeds': [3214, 2431, 2531, 2231], 'save_variables': False, 'model_save_dir': '/tmp/gym_fwalker2d_models/', 'restore_variables': False, 'start_onpol_iter': 0, 'onpol_iters': 33, 'num_path_random': 6, 'num_path_onpol': 6, 'env_horizon': 1000, 'max_train_data': 200000, 'max_val_data': 100000, 'discard_ratio': 0.0, 'dynamics': {'pre_training': {'mode': 'intrinsic_reward', 'itr': 0, 'policy_itr': 20}, 'model': 'nn', 'ensemble': False, 'ensemble_model_count': 5, 'enable_particle_ensemble': True, 'particles': 5, 'obs_var': 1.0, 'intrinsic_reward_coeff': 1.0, 'ita': 1.0, 'mode': 'random', 'val': True, 'n_layers': 4, 'hidden_size': 1000, 'activation': 'relu', 'batch_size': 1000, 'learning_rate': 0.001, 'reg_coeff': 0.0, 'epochs': 200, 'kfac_params': {'learning_rate': 0.1, 'damping': 0.001, 'momentum': 0.9, 'kl_clip': 0.0001, 'cov_ema_decay': 0.99}}, 'policy': {'network_shape': [64, 64], 'init_logstd': 0.0, 'activation': 'tanh', 'reinitialize_every_itr': False}, 'trpo': {'horizon': 1000, 'gamma': 0.99, 'step_size': 0.01, 'iterations': 20, 'batch_size': 50000, 'gae': 0.95, 'visualization': False, 'visualize_iterations': [0]}, 'algo': 'trpo'}
Generating random rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 15.
Path 2 | total_timesteps 40.
Path 3 | total_timesteps 55.
Path 4 | total_timesteps 65.
Path 5 | total_timesteps 77.
Path 6 | total_timesteps 91.
Path 7 | total_timesteps 110.
Path 8 | total_timesteps 124.
Path 9 | total_timesteps 150.
Path 10 | total_timesteps 167.
Path 11 | total_timesteps 177.
Path 12 | total_timesteps 198.
Path 13 | total_timesteps 219.
Path 14 | total_timesteps 258.
Path 15 | total_timesteps 271.
Path 16 | total_timesteps 293.
Path 17 | total_timesteps 321.
Path 18 | total_timesteps 341.
Path 19 | total_timesteps 356.
Path 20 | total_timesteps 383.
Path 21 | total_timesteps 402.
Path 22 | total_timesteps 426.
Path 23 | total_timesteps 456.
Path 24 | total_timesteps 465.
Path 25 | total_timesteps 500.
Path 26 | total_timesteps 515.
Path 27 | total_timesteps 543.
Path 28 | total_timesteps 580.
Path 29 | total_timesteps 603.
Path 30 | total_timesteps 623.
Path 31 | total_timesteps 645.
Path 32 | total_timesteps 659.
Path 33 | total_timesteps 670.
Path 34 | total_timesteps 688.
Path 35 | total_timesteps 716.
Path 36 | total_timesteps 734.
Path 37 | total_timesteps 768.
Path 38 | total_timesteps 788.
Path 39 | total_timesteps 800.
Path 40 | total_timesteps 811.
Path 41 | total_timesteps 829.
Path 42 | total_timesteps 843.
Path 43 | total_timesteps 859.
Path 44 | total_timesteps 869.
Path 45 | total_timesteps 894.
Path 46 | total_timesteps 906.
Path 47 | total_timesteps 923.
Path 48 | total_timesteps 940.
Path 49 | total_timesteps 958.
Path 50 | total_timesteps 977.
Path 51 | total_timesteps 992.
Path 52 | total_timesteps 1016.
Path 53 | total_timesteps 1030.
Path 54 | total_timesteps 1054.
Path 55 | total_timesteps 1073.
Path 56 | total_timesteps 1089.
Path 57 | total_timesteps 1110.
Path 58 | total_timesteps 1131.
Path 59 | total_timesteps 1145.
Path 60 | total_timesteps 1163.
Path 61 | total_timesteps 1188.
Path 62 | total_timesteps 1200.
Path 63 | total_timesteps 1215.
Path 64 | total_timesteps 1254.
Path 65 | total_timesteps 1268.
Path 66 | total_timesteps 1291.
Path 67 | total_timesteps 1314.
Path 68 | total_timesteps 1325.
Path 69 | total_timesteps 1341.
Path 70 | total_timesteps 1354.
Path 71 | total_timesteps 1374.
Path 72 | total_timesteps 1384.
Path 73 | total_timesteps 1394.
Path 74 | total_timesteps 1417.
Path 75 | total_timesteps 1427.
Path 76 | total_timesteps 1459.
Path 77 | total_timesteps 1490.
Path 78 | total_timesteps 1505.
Path 79 | total_timesteps 1522.
Path 80 | total_timesteps 1537.
Path 81 | total_timesteps 1545.
Path 82 | total_timesteps 1570.
Path 83 | total_timesteps 1595.
Path 84 | total_timesteps 1610.
Path 85 | total_timesteps 1668.
Path 86 | total_timesteps 1689.
Path 87 | total_timesteps 1708.
Path 88 | total_timesteps 1728.
Path 89 | total_timesteps 1756.
Path 90 | total_timesteps 1778.
Path 91 | total_timesteps 1793.
Path 92 | total_timesteps 1830.
Path 93 | total_timesteps 1848.
Path 94 | total_timesteps 1876.
Path 95 | total_timesteps 1895.
Path 96 | total_timesteps 1915.
Path 97 | total_timesteps 1940.
Path 98 | total_timesteps 1963.
Path 99 | total_timesteps 1984.
Path 100 | total_timesteps 1998.
Path 101 | total_timesteps 2019.
Path 102 | total_timesteps 2030.
Path 103 | total_timesteps 2044.
Path 104 | total_timesteps 2060.
Path 105 | total_timesteps 2071.
Path 106 | total_timesteps 2093.
Path 107 | total_timesteps 2105.
Path 108 | total_timesteps 2119.
Path 109 | total_timesteps 2141.
Path 110 | total_timesteps 2172.
Path 111 | total_timesteps 2195.
Path 112 | total_timesteps 2210.
Path 113 | total_timesteps 2240.
Path 114 | total_timesteps 2256.
Path 115 | total_timesteps 2266.
Path 116 | total_timesteps 2285.
Path 117 | total_timesteps 2312.
Path 118 | total_timesteps 2330.
Path 119 | total_timesteps 2343.
Path 120 | total_timesteps 2380.
Path 121 | total_timesteps 2400.
Path 122 | total_timesteps 2413.
Path 123 | total_timesteps 2444.
Path 124 | total_timesteps 2461.
Path 125 | total_timesteps 2475.
Path 126 | total_timesteps 2489.
Path 127 | total_timesteps 2509.
Path 128 | total_timesteps 2540.
Path 129 | total_timesteps 2556.
Path 130 | total_timesteps 2572.
Path 131 | total_timesteps 2587.
Path 132 | total_timesteps 2607.
Path 133 | total_timesteps 2624.
Path 134 | total_timesteps 2638.
Path 135 | total_timesteps 2655.
Path 136 | total_timesteps 2678.
Path 137 | total_timesteps 2702.
Path 138 | total_timesteps 2724.
Path 139 | total_timesteps 2766.
Path 140 | total_timesteps 2775.
Path 141 | total_timesteps 2794.
Path 142 | total_timesteps 2814.
Path 143 | total_timesteps 2824.
Path 144 | total_timesteps 2847.
Path 145 | total_timesteps 2869.
Path 146 | total_timesteps 2893.
Path 147 | total_timesteps 2922.
Path 148 | total_timesteps 2935.
Path 149 | total_timesteps 2951.
Path 150 | total_timesteps 2981.
Path 151 | total_timesteps 3015.
Path 152 | total_timesteps 3032.
Path 153 | total_timesteps 3055.
Path 154 | total_timesteps 3072.
Path 155 | total_timesteps 3082.
Path 156 | total_timesteps 3092.
Path 157 | total_timesteps 3111.
Path 158 | total_timesteps 3135.
Path 159 | total_timesteps 3150.
Path 160 | total_timesteps 3164.
Path 161 | total_timesteps 3196.
Path 162 | total_timesteps 3213.
Path 163 | total_timesteps 3233.
Path 164 | total_timesteps 3247.
Path 165 | total_timesteps 3265.
Path 166 | total_timesteps 3276.
Path 167 | total_timesteps 3297.
Path 168 | total_timesteps 3320.
Path 169 | total_timesteps 3334.
Path 170 | total_timesteps 3345.
Path 171 | total_timesteps 3373.
Path 172 | total_timesteps 3387.
Path 173 | total_timesteps 3404.
Path 174 | total_timesteps 3417.
Path 175 | total_timesteps 3433.
Path 176 | total_timesteps 3451.
Path 177 | total_timesteps 3464.
Path 178 | total_timesteps 3498.
Path 179 | total_timesteps 3523.
Path 180 | total_timesteps 3546.
Path 181 | total_timesteps 3566.
Path 182 | total_timesteps 3586.
Path 183 | total_timesteps 3608.
Path 184 | total_timesteps 3630.
Path 185 | total_timesteps 3643.
Path 186 | total_timesteps 3654.
Path 187 | total_timesteps 3676.
Path 188 | total_timesteps 3702.
Path 189 | total_timesteps 3727.
Path 190 | total_timesteps 3743.
Path 191 | total_timesteps 3762.
Path 192 | total_timesteps 3775.
Path 193 | total_timesteps 3797.
Path 194 | total_timesteps 3818.
Path 195 | total_timesteps 3840.
Path 196 | total_timesteps 3865.
Path 197 | total_timesteps 3882.
Path 198 | total_timesteps 3895.
Path 199 | total_timesteps 3907.
Path 200 | total_timesteps 3934.
Path 201 | total_timesteps 3952.
Path 202 | total_timesteps 3963.
Path 203 | total_timesteps 3977.
Path 204 | total_timesteps 3995.
Path 205 | total_timesteps 4020.
Path 206 | total_timesteps 4042.
Path 207 | total_timesteps 4059.
Path 208 | total_timesteps 4077.
Path 209 | total_timesteps 4096.
Path 210 | total_timesteps 4123.
Path 211 | total_timesteps 4149.
Path 212 | total_timesteps 4161.
Path 213 | total_timesteps 4187.
Path 214 | total_timesteps 4194.
Path 215 | total_timesteps 4217.
Path 216 | total_timesteps 4235.
Path 217 | total_timesteps 4256.
Path 218 | total_timesteps 4274.
Path 219 | total_timesteps 4298.
Path 220 | total_timesteps 4314.
Path 221 | total_timesteps 4330.
Path 222 | total_timesteps 4370.
Path 223 | total_timesteps 4397.
Path 224 | total_timesteps 4421.
Path 225 | total_timesteps 4430.
Path 226 | total_timesteps 4451.
Path 227 | total_timesteps 4465.
Path 228 | total_timesteps 4501.
Path 229 | total_timesteps 4531.
Path 230 | total_timesteps 4562.
Path 231 | total_timesteps 4582.
Path 232 | total_timesteps 4604.
Path 233 | total_timesteps 4622.
Path 234 | total_timesteps 4664.
Path 235 | total_timesteps 4682.
Path 236 | total_timesteps 4694.
Path 237 | total_timesteps 4707.
Path 238 | total_timesteps 4749.
Path 239 | total_timesteps 4774.
Path 240 | total_timesteps 4797.
Path 241 | total_timesteps 4829.
Path 242 | total_timesteps 4840.
Path 243 | total_timesteps 4853.
Path 244 | total_timesteps 4873.
Path 245 | total_timesteps 4898.
Path 246 | total_timesteps 4924.
Path 247 | total_timesteps 4943.
Path 248 | total_timesteps 4959.
Path 249 | total_timesteps 4994.
Path 250 | total_timesteps 5015.
Path 251 | total_timesteps 5028.
Path 252 | total_timesteps 5044.
Path 253 | total_timesteps 5055.
Path 254 | total_timesteps 5069.
Path 255 | total_timesteps 5082.
Path 256 | total_timesteps 5105.
Path 257 | total_timesteps 5115.
Path 258 | total_timesteps 5138.
Path 259 | total_timesteps 5156.
Path 260 | total_timesteps 5168.
Path 261 | total_timesteps 5195.
Path 262 | total_timesteps 5221.
Path 263 | total_timesteps 5233.
Path 264 | total_timesteps 5260.
Path 265 | total_timesteps 5286.
Path 266 | total_timesteps 5306.
Path 267 | total_timesteps 5319.
Path 268 | total_timesteps 5349.
Path 269 | total_timesteps 5369.
Path 270 | total_timesteps 5392.
Path 271 | total_timesteps 5427.
Path 272 | total_timesteps 5440.
Path 273 | total_timesteps 5458.
Path 274 | total_timesteps 5477.
Path 275 | total_timesteps 5493.
Path 276 | total_timesteps 5510.
Path 277 | total_timesteps 5532.
Path 278 | total_timesteps 5561.
Path 279 | total_timesteps 5584.
Path 280 | total_timesteps 5604.
Path 281 | total_timesteps 5632.
Path 282 | total_timesteps 5653.
Path 283 | total_timesteps 5671.
Path 284 | total_timesteps 5710.
Path 285 | total_timesteps 5729.
Path 286 | total_timesteps 5749.
Path 287 | total_timesteps 5771.
Path 288 | total_timesteps 5796.
Path 289 | total_timesteps 5821.
Path 290 | total_timesteps 5847.
Path 291 | total_timesteps 5875.
Path 292 | total_timesteps 5890.
Path 293 | total_timesteps 5916.
Path 294 | total_timesteps 5928.
Path 295 | total_timesteps 5944.
Path 296 | total_timesteps 5961.
Path 297 | total_timesteps 5978.
Path 298 | total_timesteps 5994.
Done generating random rollouts.
Creating normalization for training data.
Done creating normalization for training data.
Train dynamics model with intrinsic reward only? False
Pre-training enabled. Using only intrinsic reward.
Pre-training dynamics model for 0 iterations...
Done pre-training dynamics model.
Using external reward only.
itr #0 | 
Fitting dynamics.
Validation loss = 0.7574825882911682
Validation loss = 0.3907787799835205
Validation loss = 0.357915997505188
Validation loss = 0.3394075036048889
Validation loss = 0.3279867172241211
Validation loss = 0.32492589950561523
Validation loss = 0.3228757977485657
Validation loss = 0.327279269695282
Validation loss = 0.35026973485946655
Validation loss = 0.3368108570575714
Validation loss = 0.3368890881538391
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 18.
Path 2 | total_timesteps 87.
Path 3 | total_timesteps 135.
Path 4 | total_timesteps 176.
Path 5 | total_timesteps 208.
Path 6 | total_timesteps 229.
Path 7 | total_timesteps 275.
Path 8 | total_timesteps 325.
Path 9 | total_timesteps 393.
Path 10 | total_timesteps 430.
Path 11 | total_timesteps 451.
Path 12 | total_timesteps 530.
Path 13 | total_timesteps 599.
Path 14 | total_timesteps 643.
Path 15 | total_timesteps 679.
Path 16 | total_timesteps 742.
Path 17 | total_timesteps 789.
Path 18 | total_timesteps 812.
Path 19 | total_timesteps 852.
Path 20 | total_timesteps 943.
Path 21 | total_timesteps 976.
Path 22 | total_timesteps 1019.
Path 23 | total_timesteps 1083.
Path 24 | total_timesteps 1111.
Path 25 | total_timesteps 1142.
Path 26 | total_timesteps 1215.
Path 27 | total_timesteps 1247.
Path 28 | total_timesteps 1298.
Path 29 | total_timesteps 1324.
Path 30 | total_timesteps 1346.
Path 31 | total_timesteps 1372.
Path 32 | total_timesteps 1418.
Path 33 | total_timesteps 1452.
Path 34 | total_timesteps 1549.
Path 35 | total_timesteps 1589.
Path 36 | total_timesteps 1637.
Path 37 | total_timesteps 1662.
Path 38 | total_timesteps 1703.
Path 39 | total_timesteps 1770.
Path 40 | total_timesteps 1801.
Path 41 | total_timesteps 1829.
Path 42 | total_timesteps 1947.
Path 43 | total_timesteps 2031.
Path 44 | total_timesteps 2072.
Path 45 | total_timesteps 2132.
Path 46 | total_timesteps 2173.
Path 47 | total_timesteps 2222.
Path 48 | total_timesteps 2258.
Path 49 | total_timesteps 2299.
Path 50 | total_timesteps 2318.
Path 51 | total_timesteps 2353.
Path 52 | total_timesteps 2378.
Path 53 | total_timesteps 2468.
Path 54 | total_timesteps 2524.
Path 55 | total_timesteps 2549.
Path 56 | total_timesteps 2571.
Path 57 | total_timesteps 2653.
Path 58 | total_timesteps 2689.
Path 59 | total_timesteps 2776.
Path 60 | total_timesteps 2855.
Path 61 | total_timesteps 2929.
Path 62 | total_timesteps 2961.
Path 63 | total_timesteps 3000.
Path 64 | total_timesteps 3050.
Path 65 | total_timesteps 3089.
Path 66 | total_timesteps 3111.
Path 67 | total_timesteps 3171.
Path 68 | total_timesteps 3209.
Path 69 | total_timesteps 3251.
Path 70 | total_timesteps 3271.
Path 71 | total_timesteps 3301.
Path 72 | total_timesteps 3384.
Path 73 | total_timesteps 3393.
Path 74 | total_timesteps 3419.
Path 75 | total_timesteps 3481.
Path 76 | total_timesteps 3553.
Path 77 | total_timesteps 3604.
Path 78 | total_timesteps 3648.
Path 79 | total_timesteps 3683.
Path 80 | total_timesteps 3700.
Path 81 | total_timesteps 3746.
Path 82 | total_timesteps 3780.
Path 83 | total_timesteps 3803.
Path 84 | total_timesteps 3832.
Path 85 | total_timesteps 3871.
Path 86 | total_timesteps 3916.
Path 87 | total_timesteps 3954.
Path 88 | total_timesteps 3985.
Path 89 | total_timesteps 4085.
Path 90 | total_timesteps 4137.
Path 91 | total_timesteps 4167.
Path 92 | total_timesteps 4203.
Path 93 | total_timesteps 4253.
Path 94 | total_timesteps 4303.
Path 95 | total_timesteps 4357.
Path 96 | total_timesteps 4405.
Path 97 | total_timesteps 4460.
Path 98 | total_timesteps 4484.
Path 99 | total_timesteps 4521.
Path 100 | total_timesteps 4559.
Path 101 | total_timesteps 4596.
Path 102 | total_timesteps 4610.
Path 103 | total_timesteps 4653.
Path 104 | total_timesteps 4685.
Path 105 | total_timesteps 4727.
Path 106 | total_timesteps 4794.
Path 107 | total_timesteps 4908.
Path 108 | total_timesteps 4994.
Path 109 | total_timesteps 5042.
Path 110 | total_timesteps 5054.
Path 111 | total_timesteps 5101.
Path 112 | total_timesteps 5177.
Path 113 | total_timesteps 5222.
Path 114 | total_timesteps 5254.
Path 115 | total_timesteps 5309.
Path 116 | total_timesteps 5358.
Path 117 | total_timesteps 5443.
Path 118 | total_timesteps 5506.
Path 119 | total_timesteps 5518.
Path 120 | total_timesteps 5552.
Path 121 | total_timesteps 5573.
Path 122 | total_timesteps 5593.
Path 123 | total_timesteps 5636.
Path 124 | total_timesteps 5659.
Path 125 | total_timesteps 5696.
Path 126 | total_timesteps 5752.
Path 127 | total_timesteps 5828.
Path 128 | total_timesteps 5874.
Path 129 | total_timesteps 5916.
Path 130 | total_timesteps 5952.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -7.94    |
| Iteration     | 0        |
| MaximumReturn | 53.4     |
| MinimumReturn | -41.8    |
| TotalSamples  | 8021     |
----------------------------
itr #1 | 
Fitting dynamics.
Validation loss = 0.3847937285900116
Validation loss = 0.35276907682418823
Validation loss = 0.36318036913871765
Validation loss = 0.3542090952396393
Validation loss = 0.378950297832489
Validation loss = 0.3874548375606537
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 32.
Path 2 | total_timesteps 101.
Path 3 | total_timesteps 130.
Path 4 | total_timesteps 164.
Path 5 | total_timesteps 201.
Path 6 | total_timesteps 218.
Path 7 | total_timesteps 259.
Path 8 | total_timesteps 288.
Path 9 | total_timesteps 336.
Path 10 | total_timesteps 388.
Path 11 | total_timesteps 419.
Path 12 | total_timesteps 448.
Path 13 | total_timesteps 490.
Path 14 | total_timesteps 550.
Path 15 | total_timesteps 597.
Path 16 | total_timesteps 622.
Path 17 | total_timesteps 665.
Path 18 | total_timesteps 710.
Path 19 | total_timesteps 731.
Path 20 | total_timesteps 767.
Path 21 | total_timesteps 807.
Path 22 | total_timesteps 857.
Path 23 | total_timesteps 875.
Path 24 | total_timesteps 887.
Path 25 | total_timesteps 952.
Path 26 | total_timesteps 988.
Path 27 | total_timesteps 1010.
Path 28 | total_timesteps 1054.
Path 29 | total_timesteps 1094.
Path 30 | total_timesteps 1121.
Path 31 | total_timesteps 1155.
Path 32 | total_timesteps 1205.
Path 33 | total_timesteps 1235.
Path 34 | total_timesteps 1260.
Path 35 | total_timesteps 1304.
Path 36 | total_timesteps 1338.
Path 37 | total_timesteps 1361.
Path 38 | total_timesteps 1384.
Path 39 | total_timesteps 1418.
Path 40 | total_timesteps 1469.
Path 41 | total_timesteps 1499.
Path 42 | total_timesteps 1553.
Path 43 | total_timesteps 1612.
Path 44 | total_timesteps 1634.
Path 45 | total_timesteps 1680.
Path 46 | total_timesteps 1734.
Path 47 | total_timesteps 1803.
Path 48 | total_timesteps 1826.
Path 49 | total_timesteps 1842.
Path 50 | total_timesteps 1873.
Path 51 | total_timesteps 1893.
Path 52 | total_timesteps 1907.
Path 53 | total_timesteps 1963.
Path 54 | total_timesteps 1987.
Path 55 | total_timesteps 2012.
Path 56 | total_timesteps 2063.
Path 57 | total_timesteps 2100.
Path 58 | total_timesteps 2123.
Path 59 | total_timesteps 2175.
Path 60 | total_timesteps 2193.
Path 61 | total_timesteps 2256.
Path 62 | total_timesteps 2291.
Path 63 | total_timesteps 2329.
Path 64 | total_timesteps 2366.
Path 65 | total_timesteps 2401.
Path 66 | total_timesteps 2449.
Path 67 | total_timesteps 2476.
Path 68 | total_timesteps 2519.
Path 69 | total_timesteps 2552.
Path 70 | total_timesteps 2569.
Path 71 | total_timesteps 2587.
Path 72 | total_timesteps 2599.
Path 73 | total_timesteps 2628.
Path 74 | total_timesteps 2661.
Path 75 | total_timesteps 2706.
Path 76 | total_timesteps 2736.
Path 77 | total_timesteps 2766.
Path 78 | total_timesteps 2811.
Path 79 | total_timesteps 2843.
Path 80 | total_timesteps 2865.
Path 81 | total_timesteps 2889.
Path 82 | total_timesteps 2923.
Path 83 | total_timesteps 2981.
Path 84 | total_timesteps 2997.
Path 85 | total_timesteps 3024.
Path 86 | total_timesteps 3082.
Path 87 | total_timesteps 3112.
Path 88 | total_timesteps 3132.
Path 89 | total_timesteps 3167.
Path 90 | total_timesteps 3181.
Path 91 | total_timesteps 3201.
Path 92 | total_timesteps 3235.
Path 93 | total_timesteps 3259.
Path 94 | total_timesteps 3305.
Path 95 | total_timesteps 3338.
Path 96 | total_timesteps 3372.
Path 97 | total_timesteps 3406.
Path 98 | total_timesteps 3421.
Path 99 | total_timesteps 3466.
Path 100 | total_timesteps 3519.
Path 101 | total_timesteps 3538.
Path 102 | total_timesteps 3565.
Path 103 | total_timesteps 3602.
Path 104 | total_timesteps 3650.
Path 105 | total_timesteps 3667.
Path 106 | total_timesteps 3684.
Path 107 | total_timesteps 3725.
Path 108 | total_timesteps 3766.
Path 109 | total_timesteps 3816.
Path 110 | total_timesteps 3864.
Path 111 | total_timesteps 3884.
Path 112 | total_timesteps 3929.
Path 113 | total_timesteps 3942.
Path 114 | total_timesteps 3966.
Path 115 | total_timesteps 4027.
Path 116 | total_timesteps 4082.
Path 117 | total_timesteps 4143.
Path 118 | total_timesteps 4215.
Path 119 | total_timesteps 4238.
Path 120 | total_timesteps 4263.
Path 121 | total_timesteps 4290.
Path 122 | total_timesteps 4344.
Path 123 | total_timesteps 4368.
Path 124 | total_timesteps 4409.
Path 125 | total_timesteps 4446.
Path 126 | total_timesteps 4485.
Path 127 | total_timesteps 4521.
Path 128 | total_timesteps 4538.
Path 129 | total_timesteps 4573.
Path 130 | total_timesteps 4593.
Path 131 | total_timesteps 4615.
Path 132 | total_timesteps 4678.
Path 133 | total_timesteps 4697.
Path 134 | total_timesteps 4724.
Path 135 | total_timesteps 4748.
Path 136 | total_timesteps 4794.
Path 137 | total_timesteps 4834.
Path 138 | total_timesteps 4853.
Path 139 | total_timesteps 4908.
Path 140 | total_timesteps 4919.
Path 141 | total_timesteps 4969.
Path 142 | total_timesteps 4997.
Path 143 | total_timesteps 5046.
Path 144 | total_timesteps 5079.
Path 145 | total_timesteps 5114.
Path 146 | total_timesteps 5150.
Path 147 | total_timesteps 5179.
Path 148 | total_timesteps 5230.
Path 149 | total_timesteps 5267.
Path 150 | total_timesteps 5326.
Path 151 | total_timesteps 5356.
Path 152 | total_timesteps 5387.
Path 153 | total_timesteps 5436.
Path 154 | total_timesteps 5466.
Path 155 | total_timesteps 5506.
Path 156 | total_timesteps 5523.
Path 157 | total_timesteps 5555.
Path 158 | total_timesteps 5605.
Path 159 | total_timesteps 5626.
Path 160 | total_timesteps 5678.
Path 161 | total_timesteps 5705.
Path 162 | total_timesteps 5741.
Path 163 | total_timesteps 5766.
Path 164 | total_timesteps 5796.
Path 165 | total_timesteps 5826.
Path 166 | total_timesteps 5922.
Path 167 | total_timesteps 5967.
Path 168 | total_timesteps 5980.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -8.67    |
| Iteration     | 1        |
| MaximumReturn | 20.6     |
| MinimumReturn | -40.3    |
| TotalSamples  | 12029    |
----------------------------
itr #2 | 
Fitting dynamics.
Validation loss = 0.3647642433643341
Validation loss = 0.35560712218284607
Validation loss = 0.36261066794395447
Validation loss = 0.36837396025657654
Validation loss = 0.36854204535484314
Validation loss = 0.3859119117259979
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 27.
Path 2 | total_timesteps 79.
Path 3 | total_timesteps 120.
Path 4 | total_timesteps 142.
Path 5 | total_timesteps 166.
Path 6 | total_timesteps 189.
Path 7 | total_timesteps 242.
Path 8 | total_timesteps 264.
Path 9 | total_timesteps 295.
Path 10 | total_timesteps 334.
Path 11 | total_timesteps 346.
Path 12 | total_timesteps 368.
Path 13 | total_timesteps 389.
Path 14 | total_timesteps 415.
Path 15 | total_timesteps 477.
Path 16 | total_timesteps 533.
Path 17 | total_timesteps 575.
Path 18 | total_timesteps 616.
Path 19 | total_timesteps 628.
Path 20 | total_timesteps 675.
Path 21 | total_timesteps 700.
Path 22 | total_timesteps 724.
Path 23 | total_timesteps 763.
Path 24 | total_timesteps 778.
Path 25 | total_timesteps 818.
Path 26 | total_timesteps 841.
Path 27 | total_timesteps 869.
Path 28 | total_timesteps 950.
Path 29 | total_timesteps 987.
Path 30 | total_timesteps 1004.
Path 31 | total_timesteps 1047.
Path 32 | total_timesteps 1095.
Path 33 | total_timesteps 1139.
Path 34 | total_timesteps 1167.
Path 35 | total_timesteps 1194.
Path 36 | total_timesteps 1217.
Path 37 | total_timesteps 1287.
Path 38 | total_timesteps 1330.
Path 39 | total_timesteps 1358.
Path 40 | total_timesteps 1393.
Path 41 | total_timesteps 1414.
Path 42 | total_timesteps 1450.
Path 43 | total_timesteps 1461.
Path 44 | total_timesteps 1471.
Path 45 | total_timesteps 1517.
Path 46 | total_timesteps 1549.
Path 47 | total_timesteps 1567.
Path 48 | total_timesteps 1632.
Path 49 | total_timesteps 1667.
Path 50 | total_timesteps 1690.
Path 51 | total_timesteps 1727.
Path 52 | total_timesteps 1762.
Path 53 | total_timesteps 1790.
Path 54 | total_timesteps 1804.
Path 55 | total_timesteps 1821.
Path 56 | total_timesteps 1850.
Path 57 | total_timesteps 1910.
Path 58 | total_timesteps 1944.
Path 59 | total_timesteps 1965.
Path 60 | total_timesteps 2004.
Path 61 | total_timesteps 2050.
Path 62 | total_timesteps 2133.
Path 63 | total_timesteps 2165.
Path 64 | total_timesteps 2226.
Path 65 | total_timesteps 2261.
Path 66 | total_timesteps 2279.
Path 67 | total_timesteps 2293.
Path 68 | total_timesteps 2316.
Path 69 | total_timesteps 2374.
Path 70 | total_timesteps 2429.
Path 71 | total_timesteps 2445.
Path 72 | total_timesteps 2478.
Path 73 | total_timesteps 2510.
Path 74 | total_timesteps 2528.
Path 75 | total_timesteps 2570.
Path 76 | total_timesteps 2622.
Path 77 | total_timesteps 2654.
Path 78 | total_timesteps 2688.
Path 79 | total_timesteps 2779.
Path 80 | total_timesteps 2827.
Path 81 | total_timesteps 2864.
Path 82 | total_timesteps 2906.
Path 83 | total_timesteps 2925.
Path 84 | total_timesteps 2944.
Path 85 | total_timesteps 2978.
Path 86 | total_timesteps 3026.
Path 87 | total_timesteps 3066.
Path 88 | total_timesteps 3089.
Path 89 | total_timesteps 3126.
Path 90 | total_timesteps 3138.
Path 91 | total_timesteps 3162.
Path 92 | total_timesteps 3186.
Path 93 | total_timesteps 3213.
Path 94 | total_timesteps 3255.
Path 95 | total_timesteps 3296.
Path 96 | total_timesteps 3337.
Path 97 | total_timesteps 3386.
Path 98 | total_timesteps 3419.
Path 99 | total_timesteps 3464.
Path 100 | total_timesteps 3493.
Path 101 | total_timesteps 3520.
Path 102 | total_timesteps 3572.
Path 103 | total_timesteps 3596.
Path 104 | total_timesteps 3623.
Path 105 | total_timesteps 3641.
Path 106 | total_timesteps 3694.
Path 107 | total_timesteps 3709.
Path 108 | total_timesteps 3743.
Path 109 | total_timesteps 3751.
Path 110 | total_timesteps 3766.
Path 111 | total_timesteps 3835.
Path 112 | total_timesteps 3862.
Path 113 | total_timesteps 3888.
Path 114 | total_timesteps 3927.
Path 115 | total_timesteps 3941.
Path 116 | total_timesteps 3966.
Path 117 | total_timesteps 3984.
Path 118 | total_timesteps 4011.
Path 119 | total_timesteps 4026.
Path 120 | total_timesteps 4078.
Path 121 | total_timesteps 4131.
Path 122 | total_timesteps 4165.
Path 123 | total_timesteps 4220.
Path 124 | total_timesteps 4269.
Path 125 | total_timesteps 4310.
Path 126 | total_timesteps 4345.
Path 127 | total_timesteps 4389.
Path 128 | total_timesteps 4475.
Path 129 | total_timesteps 4521.
Path 130 | total_timesteps 4549.
Path 131 | total_timesteps 4629.
Path 132 | total_timesteps 4648.
Path 133 | total_timesteps 4689.
Path 134 | total_timesteps 4714.
Path 135 | total_timesteps 4769.
Path 136 | total_timesteps 4807.
Path 137 | total_timesteps 4835.
Path 138 | total_timesteps 4882.
Path 139 | total_timesteps 4924.
Path 140 | total_timesteps 4940.
Path 141 | total_timesteps 4975.
Path 142 | total_timesteps 5021.
Path 143 | total_timesteps 5043.
Path 144 | total_timesteps 5066.
Path 145 | total_timesteps 5099.
Path 146 | total_timesteps 5120.
Path 147 | total_timesteps 5146.
Path 148 | total_timesteps 5203.
Path 149 | total_timesteps 5226.
Path 150 | total_timesteps 5272.
Path 151 | total_timesteps 5310.
Path 152 | total_timesteps 5348.
Path 153 | total_timesteps 5381.
Path 154 | total_timesteps 5408.
Path 155 | total_timesteps 5454.
Path 156 | total_timesteps 5507.
Path 157 | total_timesteps 5533.
Path 158 | total_timesteps 5558.
Path 159 | total_timesteps 5600.
Path 160 | total_timesteps 5624.
Path 161 | total_timesteps 5669.
Path 162 | total_timesteps 5696.
Path 163 | total_timesteps 5720.
Path 164 | total_timesteps 5744.
Path 165 | total_timesteps 5794.
Path 166 | total_timesteps 5841.
Path 167 | total_timesteps 5854.
Path 168 | total_timesteps 5877.
Path 169 | total_timesteps 5915.
Path 170 | total_timesteps 5962.
Path 171 | total_timesteps 5991.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -9.05    |
| Iteration     | 2        |
| MaximumReturn | 25.5     |
| MinimumReturn | -45.7    |
| TotalSamples  | 16060    |
----------------------------
itr #3 | 
Fitting dynamics.
Validation loss = 0.3582417666912079
Validation loss = 0.36296355724334717
Validation loss = 0.3743324875831604
Validation loss = 0.38754400610923767
Validation loss = 0.37944549322128296
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 45.
Path 2 | total_timesteps 73.
Path 3 | total_timesteps 120.
Path 4 | total_timesteps 160.
Path 5 | total_timesteps 236.
Path 6 | total_timesteps 261.
Path 7 | total_timesteps 297.
Path 8 | total_timesteps 318.
Path 9 | total_timesteps 326.
Path 10 | total_timesteps 365.
Path 11 | total_timesteps 408.
Path 12 | total_timesteps 430.
Path 13 | total_timesteps 464.
Path 14 | total_timesteps 504.
Path 15 | total_timesteps 546.
Path 16 | total_timesteps 586.
Path 17 | total_timesteps 625.
Path 18 | total_timesteps 660.
Path 19 | total_timesteps 696.
Path 20 | total_timesteps 713.
Path 21 | total_timesteps 744.
Path 22 | total_timesteps 767.
Path 23 | total_timesteps 786.
Path 24 | total_timesteps 819.
Path 25 | total_timesteps 856.
Path 26 | total_timesteps 888.
Path 27 | total_timesteps 924.
Path 28 | total_timesteps 945.
Path 29 | total_timesteps 967.
Path 30 | total_timesteps 1007.
Path 31 | total_timesteps 1039.
Path 32 | total_timesteps 1077.
Path 33 | total_timesteps 1106.
Path 34 | total_timesteps 1148.
Path 35 | total_timesteps 1167.
Path 36 | total_timesteps 1220.
Path 37 | total_timesteps 1247.
Path 38 | total_timesteps 1278.
Path 39 | total_timesteps 1311.
Path 40 | total_timesteps 1336.
Path 41 | total_timesteps 1400.
Path 42 | total_timesteps 1433.
Path 43 | total_timesteps 1477.
Path 44 | total_timesteps 1509.
Path 45 | total_timesteps 1567.
Path 46 | total_timesteps 1581.
Path 47 | total_timesteps 1626.
Path 48 | total_timesteps 1659.
Path 49 | total_timesteps 1704.
Path 50 | total_timesteps 1744.
Path 51 | total_timesteps 1770.
Path 52 | total_timesteps 1810.
Path 53 | total_timesteps 1828.
Path 54 | total_timesteps 1850.
Path 55 | total_timesteps 1875.
Path 56 | total_timesteps 1917.
Path 57 | total_timesteps 1968.
Path 58 | total_timesteps 2006.
Path 59 | total_timesteps 2064.
Path 60 | total_timesteps 2102.
Path 61 | total_timesteps 2115.
Path 62 | total_timesteps 2139.
Path 63 | total_timesteps 2193.
Path 64 | total_timesteps 2231.
Path 65 | total_timesteps 2273.
Path 66 | total_timesteps 2303.
Path 67 | total_timesteps 2319.
Path 68 | total_timesteps 2352.
Path 69 | total_timesteps 2381.
Path 70 | total_timesteps 2480.
Path 71 | total_timesteps 2531.
Path 72 | total_timesteps 2555.
Path 73 | total_timesteps 2595.
Path 74 | total_timesteps 2680.
Path 75 | total_timesteps 2704.
Path 76 | total_timesteps 2722.
Path 77 | total_timesteps 2774.
Path 78 | total_timesteps 2821.
Path 79 | total_timesteps 2860.
Path 80 | total_timesteps 2896.
Path 81 | total_timesteps 2919.
Path 82 | total_timesteps 2991.
Path 83 | total_timesteps 3012.
Path 84 | total_timesteps 3056.
Path 85 | total_timesteps 3103.
Path 86 | total_timesteps 3136.
Path 87 | total_timesteps 3177.
Path 88 | total_timesteps 3212.
Path 89 | total_timesteps 3269.
Path 90 | total_timesteps 3292.
Path 91 | total_timesteps 3311.
Path 92 | total_timesteps 3354.
Path 93 | total_timesteps 3380.
Path 94 | total_timesteps 3439.
Path 95 | total_timesteps 3453.
Path 96 | total_timesteps 3471.
Path 97 | total_timesteps 3505.
Path 98 | total_timesteps 3559.
Path 99 | total_timesteps 3591.
Path 100 | total_timesteps 3619.
Path 101 | total_timesteps 3672.
Path 102 | total_timesteps 3711.
Path 103 | total_timesteps 3728.
Path 104 | total_timesteps 3754.
Path 105 | total_timesteps 3782.
Path 106 | total_timesteps 3822.
Path 107 | total_timesteps 3872.
Path 108 | total_timesteps 3907.
Path 109 | total_timesteps 3942.
Path 110 | total_timesteps 3977.
Path 111 | total_timesteps 4019.
Path 112 | total_timesteps 4046.
Path 113 | total_timesteps 4075.
Path 114 | total_timesteps 4102.
Path 115 | total_timesteps 4143.
Path 116 | total_timesteps 4177.
Path 117 | total_timesteps 4250.
Path 118 | total_timesteps 4324.
Path 119 | total_timesteps 4385.
Path 120 | total_timesteps 4405.
Path 121 | total_timesteps 4419.
Path 122 | total_timesteps 4457.
Path 123 | total_timesteps 4468.
Path 124 | total_timesteps 4495.
Path 125 | total_timesteps 4550.
Path 126 | total_timesteps 4591.
Path 127 | total_timesteps 4627.
Path 128 | total_timesteps 4660.
Path 129 | total_timesteps 4681.
Path 130 | total_timesteps 4701.
Path 131 | total_timesteps 4720.
Path 132 | total_timesteps 4737.
Path 133 | total_timesteps 4759.
Path 134 | total_timesteps 4854.
Path 135 | total_timesteps 4883.
Path 136 | total_timesteps 4904.
Path 137 | total_timesteps 4985.
Path 138 | total_timesteps 5030.
Path 139 | total_timesteps 5064.
Path 140 | total_timesteps 5075.
Path 141 | total_timesteps 5115.
Path 142 | total_timesteps 5142.
Path 143 | total_timesteps 5165.
Path 144 | total_timesteps 5221.
Path 145 | total_timesteps 5298.
Path 146 | total_timesteps 5321.
Path 147 | total_timesteps 5351.
Path 148 | total_timesteps 5371.
Path 149 | total_timesteps 5404.
Path 150 | total_timesteps 5427.
Path 151 | total_timesteps 5452.
Path 152 | total_timesteps 5482.
Path 153 | total_timesteps 5519.
Path 154 | total_timesteps 5549.
Path 155 | total_timesteps 5595.
Path 156 | total_timesteps 5628.
Path 157 | total_timesteps 5659.
Path 158 | total_timesteps 5725.
Path 159 | total_timesteps 5742.
Path 160 | total_timesteps 5757.
Path 161 | total_timesteps 5793.
Path 162 | total_timesteps 5819.
Path 163 | total_timesteps 5860.
Path 164 | total_timesteps 5890.
Path 165 | total_timesteps 5939.
Path 166 | total_timesteps 5972.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -8.9     |
| Iteration     | 3        |
| MaximumReturn | 55.3     |
| MinimumReturn | -44.1    |
| TotalSamples  | 20061    |
----------------------------
itr #4 | 
Fitting dynamics.
Validation loss = 0.3647835850715637
Validation loss = 0.3761187791824341
Validation loss = 0.3751690983772278
Validation loss = 0.3760283291339874
Validation loss = 0.3997272849082947
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 27.
Path 2 | total_timesteps 65.
Path 3 | total_timesteps 84.
Path 4 | total_timesteps 96.
Path 5 | total_timesteps 106.
Path 6 | total_timesteps 119.
Path 7 | total_timesteps 135.
Path 8 | total_timesteps 156.
Path 9 | total_timesteps 181.
Path 10 | total_timesteps 197.
Path 11 | total_timesteps 214.
Path 12 | total_timesteps 241.
Path 13 | total_timesteps 255.
Path 14 | total_timesteps 271.
Path 15 | total_timesteps 303.
Path 16 | total_timesteps 327.
Path 17 | total_timesteps 336.
Path 18 | total_timesteps 365.
Path 19 | total_timesteps 392.
Path 20 | total_timesteps 410.
Path 21 | total_timesteps 425.
Path 22 | total_timesteps 437.
Path 23 | total_timesteps 454.
Path 24 | total_timesteps 463.
Path 25 | total_timesteps 482.
Path 26 | total_timesteps 509.
Path 27 | total_timesteps 521.
Path 28 | total_timesteps 540.
Path 29 | total_timesteps 548.
Path 30 | total_timesteps 572.
Path 31 | total_timesteps 592.
Path 32 | total_timesteps 606.
Path 33 | total_timesteps 626.
Path 34 | total_timesteps 640.
Path 35 | total_timesteps 653.
Path 36 | total_timesteps 668.
Path 37 | total_timesteps 688.
Path 38 | total_timesteps 711.
Path 39 | total_timesteps 735.
Path 40 | total_timesteps 792.
Path 41 | total_timesteps 812.
Path 42 | total_timesteps 845.
Path 43 | total_timesteps 874.
Path 44 | total_timesteps 891.
Path 45 | total_timesteps 928.
Path 46 | total_timesteps 951.
Path 47 | total_timesteps 969.
Path 48 | total_timesteps 1005.
Path 49 | total_timesteps 1015.
Path 50 | total_timesteps 1036.
Path 51 | total_timesteps 1057.
Path 52 | total_timesteps 1083.
Path 53 | total_timesteps 1115.
Path 54 | total_timesteps 1132.
Path 55 | total_timesteps 1157.
Path 56 | total_timesteps 1164.
Path 57 | total_timesteps 1184.
Path 58 | total_timesteps 1194.
Path 59 | total_timesteps 1231.
Path 60 | total_timesteps 1255.
Path 61 | total_timesteps 1282.
Path 62 | total_timesteps 1296.
Path 63 | total_timesteps 1318.
Path 64 | total_timesteps 1342.
Path 65 | total_timesteps 1360.
Path 66 | total_timesteps 1378.
Path 67 | total_timesteps 1394.
Path 68 | total_timesteps 1424.
Path 69 | total_timesteps 1453.
Path 70 | total_timesteps 1470.
Path 71 | total_timesteps 1482.
Path 72 | total_timesteps 1514.
Path 73 | total_timesteps 1548.
Path 74 | total_timesteps 1559.
Path 75 | total_timesteps 1572.
Path 76 | total_timesteps 1590.
Path 77 | total_timesteps 1606.
Path 78 | total_timesteps 1625.
Path 79 | total_timesteps 1670.
Path 80 | total_timesteps 1696.
Path 81 | total_timesteps 1709.
Path 82 | total_timesteps 1725.
Path 83 | total_timesteps 1744.
Path 84 | total_timesteps 1766.
Path 85 | total_timesteps 1805.
Path 86 | total_timesteps 1820.
Path 87 | total_timesteps 1829.
Path 88 | total_timesteps 1853.
Path 89 | total_timesteps 1877.
Path 90 | total_timesteps 1890.
Path 91 | total_timesteps 1911.
Path 92 | total_timesteps 1919.
Path 93 | total_timesteps 1928.
Path 94 | total_timesteps 1952.
Path 95 | total_timesteps 1969.
Path 96 | total_timesteps 1983.
Path 97 | total_timesteps 2014.
Path 98 | total_timesteps 2029.
Path 99 | total_timesteps 2042.
Path 100 | total_timesteps 2053.
Path 101 | total_timesteps 2067.
Path 102 | total_timesteps 2090.
Path 103 | total_timesteps 2125.
Path 104 | total_timesteps 2154.
Path 105 | total_timesteps 2168.
Path 106 | total_timesteps 2190.
Path 107 | total_timesteps 2216.
Path 108 | total_timesteps 2231.
Path 109 | total_timesteps 2258.
Path 110 | total_timesteps 2316.
Path 111 | total_timesteps 2326.
Path 112 | total_timesteps 2340.
Path 113 | total_timesteps 2353.
Path 114 | total_timesteps 2388.
Path 115 | total_timesteps 2405.
Path 116 | total_timesteps 2417.
Path 117 | total_timesteps 2437.
Path 118 | total_timesteps 2475.
Path 119 | total_timesteps 2487.
Path 120 | total_timesteps 2497.
Path 121 | total_timesteps 2534.
Path 122 | total_timesteps 2559.
Path 123 | total_timesteps 2573.
Path 124 | total_timesteps 2593.
Path 125 | total_timesteps 2630.
Path 126 | total_timesteps 2641.
Path 127 | total_timesteps 2674.
Path 128 | total_timesteps 2691.
Path 129 | total_timesteps 2704.
Path 130 | total_timesteps 2727.
Path 131 | total_timesteps 2740.
Path 132 | total_timesteps 2765.
Path 133 | total_timesteps 2796.
Path 134 | total_timesteps 2821.
Path 135 | total_timesteps 2840.
Path 136 | total_timesteps 2856.
Path 137 | total_timesteps 2867.
Path 138 | total_timesteps 2885.
Path 139 | total_timesteps 2894.
Path 140 | total_timesteps 2904.
Path 141 | total_timesteps 2929.
Path 142 | total_timesteps 2940.
Path 143 | total_timesteps 2949.
Path 144 | total_timesteps 2957.
Path 145 | total_timesteps 2992.
Path 146 | total_timesteps 3005.
Path 147 | total_timesteps 3060.
Path 148 | total_timesteps 3080.
Path 149 | total_timesteps 3092.
Path 150 | total_timesteps 3108.
Path 151 | total_timesteps 3120.
Path 152 | total_timesteps 3152.
Path 153 | total_timesteps 3181.
Path 154 | total_timesteps 3204.
Path 155 | total_timesteps 3223.
Path 156 | total_timesteps 3237.
Path 157 | total_timesteps 3254.
Path 158 | total_timesteps 3278.
Path 159 | total_timesteps 3303.
Path 160 | total_timesteps 3314.
Path 161 | total_timesteps 3347.
Path 162 | total_timesteps 3372.
Path 163 | total_timesteps 3395.
Path 164 | total_timesteps 3426.
Path 165 | total_timesteps 3452.
Path 166 | total_timesteps 3471.
Path 167 | total_timesteps 3504.
Path 168 | total_timesteps 3520.
Path 169 | total_timesteps 3544.
Path 170 | total_timesteps 3554.
Path 171 | total_timesteps 3573.
Path 172 | total_timesteps 3585.
Path 173 | total_timesteps 3608.
Path 174 | total_timesteps 3636.
Path 175 | total_timesteps 3659.
Path 176 | total_timesteps 3679.
Path 177 | total_timesteps 3697.
Path 178 | total_timesteps 3718.
Path 179 | total_timesteps 3733.
Path 180 | total_timesteps 3751.
Path 181 | total_timesteps 3781.
Path 182 | total_timesteps 3797.
Path 183 | total_timesteps 3816.
Path 184 | total_timesteps 3826.
Path 185 | total_timesteps 3861.
Path 186 | total_timesteps 3875.
Path 187 | total_timesteps 3892.
Path 188 | total_timesteps 3911.
Path 189 | total_timesteps 3920.
Path 190 | total_timesteps 3959.
Path 191 | total_timesteps 3968.
Path 192 | total_timesteps 3980.
Path 193 | total_timesteps 3988.
Path 194 | total_timesteps 4012.
Path 195 | total_timesteps 4036.
Path 196 | total_timesteps 4060.
Path 197 | total_timesteps 4078.
Path 198 | total_timesteps 4103.
Path 199 | total_timesteps 4117.
Path 200 | total_timesteps 4157.
Path 201 | total_timesteps 4174.
Path 202 | total_timesteps 4190.
Path 203 | total_timesteps 4204.
Path 204 | total_timesteps 4244.
Path 205 | total_timesteps 4258.
Path 206 | total_timesteps 4271.
Path 207 | total_timesteps 4288.
Path 208 | total_timesteps 4322.
Path 209 | total_timesteps 4338.
Path 210 | total_timesteps 4352.
Path 211 | total_timesteps 4384.
Path 212 | total_timesteps 4410.
Path 213 | total_timesteps 4424.
Path 214 | total_timesteps 4439.
Path 215 | total_timesteps 4456.
Path 216 | total_timesteps 4471.
Path 217 | total_timesteps 4495.
Path 218 | total_timesteps 4524.
Path 219 | total_timesteps 4552.
Path 220 | total_timesteps 4585.
Path 221 | total_timesteps 4594.
Path 222 | total_timesteps 4604.
Path 223 | total_timesteps 4643.
Path 224 | total_timesteps 4655.
Path 225 | total_timesteps 4678.
Path 226 | total_timesteps 4716.
Path 227 | total_timesteps 4729.
Path 228 | total_timesteps 4743.
Path 229 | total_timesteps 4753.
Path 230 | total_timesteps 4762.
Path 231 | total_timesteps 4790.
Path 232 | total_timesteps 4811.
Path 233 | total_timesteps 4841.
Path 234 | total_timesteps 4869.
Path 235 | total_timesteps 4897.
Path 236 | total_timesteps 4932.
Path 237 | total_timesteps 4958.
Path 238 | total_timesteps 4996.
Path 239 | total_timesteps 5023.
Path 240 | total_timesteps 5046.
Path 241 | total_timesteps 5065.
Path 242 | total_timesteps 5077.
Path 243 | total_timesteps 5095.
Path 244 | total_timesteps 5108.
Path 245 | total_timesteps 5135.
Path 246 | total_timesteps 5147.
Path 247 | total_timesteps 5183.
Path 248 | total_timesteps 5223.
Path 249 | total_timesteps 5242.
Path 250 | total_timesteps 5257.
Path 251 | total_timesteps 5284.
Path 252 | total_timesteps 5299.
Path 253 | total_timesteps 5328.
Path 254 | total_timesteps 5366.
Path 255 | total_timesteps 5385.
Path 256 | total_timesteps 5408.
Path 257 | total_timesteps 5424.
Path 258 | total_timesteps 5436.
Path 259 | total_timesteps 5449.
Path 260 | total_timesteps 5462.
Path 261 | total_timesteps 5474.
Path 262 | total_timesteps 5488.
Path 263 | total_timesteps 5510.
Path 264 | total_timesteps 5528.
Path 265 | total_timesteps 5541.
Path 266 | total_timesteps 5549.
Path 267 | total_timesteps 5568.
Path 268 | total_timesteps 5578.
Path 269 | total_timesteps 5585.
Path 270 | total_timesteps 5598.
Path 271 | total_timesteps 5609.
Path 272 | total_timesteps 5642.
Path 273 | total_timesteps 5658.
Path 274 | total_timesteps 5668.
Path 275 | total_timesteps 5702.
Path 276 | total_timesteps 5756.
Path 277 | total_timesteps 5770.
Path 278 | total_timesteps 5793.
Path 279 | total_timesteps 5828.
Path 280 | total_timesteps 5849.
Path 281 | total_timesteps 5871.
Path 282 | total_timesteps 5886.
Path 283 | total_timesteps 5908.
Path 284 | total_timesteps 5922.
Path 285 | total_timesteps 5948.
Path 286 | total_timesteps 5974.
Path 287 | total_timesteps 5993.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -9.28    |
| Iteration     | 4        |
| MaximumReturn | 6.47     |
| MinimumReturn | -32.9    |
| TotalSamples  | 24075    |
----------------------------
itr #5 | 
Fitting dynamics.
Validation loss = 0.36708784103393555
Validation loss = 0.3724711239337921
Validation loss = 0.3842187225818634
Validation loss = 0.3771287202835083
Validation loss = 0.3913845717906952
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 24.
Path 2 | total_timesteps 45.
Path 3 | total_timesteps 83.
Path 4 | total_timesteps 102.
Path 5 | total_timesteps 138.
Path 6 | total_timesteps 189.
Path 7 | total_timesteps 217.
Path 8 | total_timesteps 244.
Path 9 | total_timesteps 262.
Path 10 | total_timesteps 313.
Path 11 | total_timesteps 344.
Path 12 | total_timesteps 397.
Path 13 | total_timesteps 433.
Path 14 | total_timesteps 462.
Path 15 | total_timesteps 490.
Path 16 | total_timesteps 535.
Path 17 | total_timesteps 580.
Path 18 | total_timesteps 629.
Path 19 | total_timesteps 660.
Path 20 | total_timesteps 681.
Path 21 | total_timesteps 716.
Path 22 | total_timesteps 750.
Path 23 | total_timesteps 783.
Path 24 | total_timesteps 802.
Path 25 | total_timesteps 843.
Path 26 | total_timesteps 878.
Path 27 | total_timesteps 897.
Path 28 | total_timesteps 920.
Path 29 | total_timesteps 942.
Path 30 | total_timesteps 967.
Path 31 | total_timesteps 985.
Path 32 | total_timesteps 1036.
Path 33 | total_timesteps 1074.
Path 34 | total_timesteps 1115.
Path 35 | total_timesteps 1130.
Path 36 | total_timesteps 1176.
Path 37 | total_timesteps 1237.
Path 38 | total_timesteps 1260.
Path 39 | total_timesteps 1275.
Path 40 | total_timesteps 1288.
Path 41 | total_timesteps 1306.
Path 42 | total_timesteps 1339.
Path 43 | total_timesteps 1377.
Path 44 | total_timesteps 1399.
Path 45 | total_timesteps 1423.
Path 46 | total_timesteps 1450.
Path 47 | total_timesteps 1492.
Path 48 | total_timesteps 1524.
Path 49 | total_timesteps 1546.
Path 50 | total_timesteps 1586.
Path 51 | total_timesteps 1621.
Path 52 | total_timesteps 1664.
Path 53 | total_timesteps 1689.
Path 54 | total_timesteps 1728.
Path 55 | total_timesteps 1772.
Path 56 | total_timesteps 1811.
Path 57 | total_timesteps 1831.
Path 58 | total_timesteps 1871.
Path 59 | total_timesteps 1884.
Path 60 | total_timesteps 1907.
Path 61 | total_timesteps 1932.
Path 62 | total_timesteps 1950.
Path 63 | total_timesteps 1987.
Path 64 | total_timesteps 2018.
Path 65 | total_timesteps 2060.
Path 66 | total_timesteps 2084.
Path 67 | total_timesteps 2139.
Path 68 | total_timesteps 2157.
Path 69 | total_timesteps 2176.
Path 70 | total_timesteps 2195.
Path 71 | total_timesteps 2242.
Path 72 | total_timesteps 2305.
Path 73 | total_timesteps 2325.
Path 74 | total_timesteps 2339.
Path 75 | total_timesteps 2386.
Path 76 | total_timesteps 2472.
Path 77 | total_timesteps 2513.
Path 78 | total_timesteps 2525.
Path 79 | total_timesteps 2547.
Path 80 | total_timesteps 2578.
Path 81 | total_timesteps 2617.
Path 82 | total_timesteps 2655.
Path 83 | total_timesteps 2682.
Path 84 | total_timesteps 2716.
Path 85 | total_timesteps 2753.
Path 86 | total_timesteps 2814.
Path 87 | total_timesteps 2834.
Path 88 | total_timesteps 2864.
Path 89 | total_timesteps 2887.
Path 90 | total_timesteps 2910.
Path 91 | total_timesteps 2930.
Path 92 | total_timesteps 2965.
Path 93 | total_timesteps 2977.
Path 94 | total_timesteps 3023.
Path 95 | total_timesteps 3053.
Path 96 | total_timesteps 3082.
Path 97 | total_timesteps 3127.
Path 98 | total_timesteps 3162.
Path 99 | total_timesteps 3184.
Path 100 | total_timesteps 3239.
Path 101 | total_timesteps 3263.
Path 102 | total_timesteps 3286.
Path 103 | total_timesteps 3326.
Path 104 | total_timesteps 3347.
Path 105 | total_timesteps 3400.
Path 106 | total_timesteps 3427.
Path 107 | total_timesteps 3470.
Path 108 | total_timesteps 3498.
Path 109 | total_timesteps 3512.
Path 110 | total_timesteps 3566.
Path 111 | total_timesteps 3587.
Path 112 | total_timesteps 3605.
Path 113 | total_timesteps 3633.
Path 114 | total_timesteps 3686.
Path 115 | total_timesteps 3732.
Path 116 | total_timesteps 3759.
Path 117 | total_timesteps 3788.
Path 118 | total_timesteps 3801.
Path 119 | total_timesteps 3853.
Path 120 | total_timesteps 3900.
Path 121 | total_timesteps 3953.
Path 122 | total_timesteps 3994.
Path 123 | total_timesteps 4037.
Path 124 | total_timesteps 4048.
Path 125 | total_timesteps 4083.
Path 126 | total_timesteps 4096.
Path 127 | total_timesteps 4128.
Path 128 | total_timesteps 4147.
Path 129 | total_timesteps 4177.
Path 130 | total_timesteps 4201.
Path 131 | total_timesteps 4231.
Path 132 | total_timesteps 4258.
Path 133 | total_timesteps 4289.
Path 134 | total_timesteps 4339.
Path 135 | total_timesteps 4375.
Path 136 | total_timesteps 4428.
Path 137 | total_timesteps 4439.
Path 138 | total_timesteps 4467.
Path 139 | total_timesteps 4493.
Path 140 | total_timesteps 4527.
Path 141 | total_timesteps 4561.
Path 142 | total_timesteps 4619.
Path 143 | total_timesteps 4663.
Path 144 | total_timesteps 4675.
Path 145 | total_timesteps 4703.
Path 146 | total_timesteps 4728.
Path 147 | total_timesteps 4748.
Path 148 | total_timesteps 4776.
Path 149 | total_timesteps 4790.
Path 150 | total_timesteps 4814.
Path 151 | total_timesteps 4844.
Path 152 | total_timesteps 4871.
Path 153 | total_timesteps 4898.
Path 154 | total_timesteps 4930.
Path 155 | total_timesteps 4995.
Path 156 | total_timesteps 5039.
Path 157 | total_timesteps 5069.
Path 158 | total_timesteps 5152.
Path 159 | total_timesteps 5198.
Path 160 | total_timesteps 5224.
Path 161 | total_timesteps 5248.
Path 162 | total_timesteps 5271.
Path 163 | total_timesteps 5363.
Path 164 | total_timesteps 5388.
Path 165 | total_timesteps 5425.
Path 166 | total_timesteps 5462.
Path 167 | total_timesteps 5501.
Path 168 | total_timesteps 5556.
Path 169 | total_timesteps 5600.
Path 170 | total_timesteps 5645.
Path 171 | total_timesteps 5684.
Path 172 | total_timesteps 5736.
Path 173 | total_timesteps 5765.
Path 174 | total_timesteps 5788.
Path 175 | total_timesteps 5817.
Path 176 | total_timesteps 5839.
Path 177 | total_timesteps 5875.
Path 178 | total_timesteps 5907.
Path 179 | total_timesteps 5924.
Path 180 | total_timesteps 5968.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -12.8    |
| Iteration     | 5        |
| MaximumReturn | 3.65     |
| MinimumReturn | -64.1    |
| TotalSamples  | 28115    |
----------------------------
itr #6 | 
Fitting dynamics.
Validation loss = 0.3791378438472748
Validation loss = 0.38334396481513977
Validation loss = 0.384615957736969
Validation loss = 0.38973182439804077
Validation loss = 0.3971923291683197
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 35.
Path 2 | total_timesteps 73.
Path 3 | total_timesteps 107.
Path 4 | total_timesteps 132.
Path 5 | total_timesteps 165.
Path 6 | total_timesteps 213.
Path 7 | total_timesteps 245.
Path 8 | total_timesteps 281.
Path 9 | total_timesteps 346.
Path 10 | total_timesteps 374.
Path 11 | total_timesteps 468.
Path 12 | total_timesteps 511.
Path 13 | total_timesteps 529.
Path 14 | total_timesteps 551.
Path 15 | total_timesteps 570.
Path 16 | total_timesteps 607.
Path 17 | total_timesteps 621.
Path 18 | total_timesteps 663.
Path 19 | total_timesteps 712.
Path 20 | total_timesteps 723.
Path 21 | total_timesteps 738.
Path 22 | total_timesteps 756.
Path 23 | total_timesteps 816.
Path 24 | total_timesteps 829.
Path 25 | total_timesteps 850.
Path 26 | total_timesteps 886.
Path 27 | total_timesteps 922.
Path 28 | total_timesteps 956.
Path 29 | total_timesteps 999.
Path 30 | total_timesteps 1037.
Path 31 | total_timesteps 1084.
Path 32 | total_timesteps 1142.
Path 33 | total_timesteps 1160.
Path 34 | total_timesteps 1184.
Path 35 | total_timesteps 1206.
Path 36 | total_timesteps 1222.
Path 37 | total_timesteps 1253.
Path 38 | total_timesteps 1287.
Path 39 | total_timesteps 1329.
Path 40 | total_timesteps 1360.
Path 41 | total_timesteps 1427.
Path 42 | total_timesteps 1502.
Path 43 | total_timesteps 1540.
Path 44 | total_timesteps 1562.
Path 45 | total_timesteps 1576.
Path 46 | total_timesteps 1600.
Path 47 | total_timesteps 1649.
Path 48 | total_timesteps 1680.
Path 49 | total_timesteps 1703.
Path 50 | total_timesteps 1749.
Path 51 | total_timesteps 1794.
Path 52 | total_timesteps 1825.
Path 53 | total_timesteps 1878.
Path 54 | total_timesteps 1954.
Path 55 | total_timesteps 1968.
Path 56 | total_timesteps 1989.
Path 57 | total_timesteps 2004.
Path 58 | total_timesteps 2014.
Path 59 | total_timesteps 2045.
Path 60 | total_timesteps 2087.
Path 61 | total_timesteps 2104.
Path 62 | total_timesteps 2136.
Path 63 | total_timesteps 2174.
Path 64 | total_timesteps 2253.
Path 65 | total_timesteps 2284.
Path 66 | total_timesteps 2299.
Path 67 | total_timesteps 2345.
Path 68 | total_timesteps 2363.
Path 69 | total_timesteps 2389.
Path 70 | total_timesteps 2430.
Path 71 | total_timesteps 2448.
Path 72 | total_timesteps 2537.
Path 73 | total_timesteps 2570.
Path 74 | total_timesteps 2595.
Path 75 | total_timesteps 2610.
Path 76 | total_timesteps 2648.
Path 77 | total_timesteps 2670.
Path 78 | total_timesteps 2714.
Path 79 | total_timesteps 2740.
Path 80 | total_timesteps 2772.
Path 81 | total_timesteps 2804.
Path 82 | total_timesteps 2815.
Path 83 | total_timesteps 2832.
Path 84 | total_timesteps 2866.
Path 85 | total_timesteps 2880.
Path 86 | total_timesteps 2928.
Path 87 | total_timesteps 2945.
Path 88 | total_timesteps 2980.
Path 89 | total_timesteps 3007.
Path 90 | total_timesteps 3020.
Path 91 | total_timesteps 3047.
Path 92 | total_timesteps 3074.
Path 93 | total_timesteps 3115.
Path 94 | total_timesteps 3178.
Path 95 | total_timesteps 3222.
Path 96 | total_timesteps 3241.
Path 97 | total_timesteps 3262.
Path 98 | total_timesteps 3323.
Path 99 | total_timesteps 3351.
Path 100 | total_timesteps 3396.
Path 101 | total_timesteps 3415.
Path 102 | total_timesteps 3431.
Path 103 | total_timesteps 3457.
Path 104 | total_timesteps 3495.
Path 105 | total_timesteps 3523.
Path 106 | total_timesteps 3543.
Path 107 | total_timesteps 3572.
Path 108 | total_timesteps 3601.
Path 109 | total_timesteps 3650.
Path 110 | total_timesteps 3667.
Path 111 | total_timesteps 3700.
Path 112 | total_timesteps 3716.
Path 113 | total_timesteps 3759.
Path 114 | total_timesteps 3802.
Path 115 | total_timesteps 3843.
Path 116 | total_timesteps 3916.
Path 117 | total_timesteps 3957.
Path 118 | total_timesteps 4001.
Path 119 | total_timesteps 4036.
Path 120 | total_timesteps 4151.
Path 121 | total_timesteps 4199.
Path 122 | total_timesteps 4245.
Path 123 | total_timesteps 4313.
Path 124 | total_timesteps 4371.
Path 125 | total_timesteps 4381.
Path 126 | total_timesteps 4405.
Path 127 | total_timesteps 4499.
Path 128 | total_timesteps 4549.
Path 129 | total_timesteps 4582.
Path 130 | total_timesteps 4620.
Path 131 | total_timesteps 4639.
Path 132 | total_timesteps 4694.
Path 133 | total_timesteps 4713.
Path 134 | total_timesteps 4748.
Path 135 | total_timesteps 4766.
Path 136 | total_timesteps 4805.
Path 137 | total_timesteps 4823.
Path 138 | total_timesteps 4875.
Path 139 | total_timesteps 4921.
Path 140 | total_timesteps 4971.
Path 141 | total_timesteps 5001.
Path 142 | total_timesteps 5046.
Path 143 | total_timesteps 5095.
Path 144 | total_timesteps 5136.
Path 145 | total_timesteps 5205.
Path 146 | total_timesteps 5248.
Path 147 | total_timesteps 5278.
Path 148 | total_timesteps 5291.
Path 149 | total_timesteps 5313.
Path 150 | total_timesteps 5375.
Path 151 | total_timesteps 5421.
Path 152 | total_timesteps 5435.
Path 153 | total_timesteps 5471.
Path 154 | total_timesteps 5507.
Path 155 | total_timesteps 5546.
Path 156 | total_timesteps 5612.
Path 157 | total_timesteps 5686.
Path 158 | total_timesteps 5732.
Path 159 | total_timesteps 5768.
Path 160 | total_timesteps 5794.
Path 161 | total_timesteps 5814.
Path 162 | total_timesteps 5849.
Path 163 | total_timesteps 5907.
Path 164 | total_timesteps 5938.
Path 165 | total_timesteps 5972.
Path 166 | total_timesteps 5987.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -13.3    |
| Iteration     | 6        |
| MaximumReturn | 16.3     |
| MinimumReturn | -51.9    |
| TotalSamples  | 32127    |
----------------------------
itr #7 | 
Fitting dynamics.
Validation loss = 0.38210922479629517
Validation loss = 0.3928535580635071
Validation loss = 0.39379051327705383
Validation loss = 0.40420764684677124
Validation loss = 0.40592968463897705
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 42.
Path 2 | total_timesteps 59.
Path 3 | total_timesteps 74.
Path 4 | total_timesteps 99.
Path 5 | total_timesteps 124.
Path 6 | total_timesteps 153.
Path 7 | total_timesteps 194.
Path 8 | total_timesteps 230.
Path 9 | total_timesteps 276.
Path 10 | total_timesteps 308.
Path 11 | total_timesteps 338.
Path 12 | total_timesteps 367.
Path 13 | total_timesteps 417.
Path 14 | total_timesteps 443.
Path 15 | total_timesteps 465.
Path 16 | total_timesteps 478.
Path 17 | total_timesteps 500.
Path 18 | total_timesteps 530.
Path 19 | total_timesteps 559.
Path 20 | total_timesteps 587.
Path 21 | total_timesteps 606.
Path 22 | total_timesteps 653.
Path 23 | total_timesteps 681.
Path 24 | total_timesteps 706.
Path 25 | total_timesteps 747.
Path 26 | total_timesteps 786.
Path 27 | total_timesteps 813.
Path 28 | total_timesteps 838.
Path 29 | total_timesteps 863.
Path 30 | total_timesteps 900.
Path 31 | total_timesteps 922.
Path 32 | total_timesteps 964.
Path 33 | total_timesteps 1021.
Path 34 | total_timesteps 1056.
Path 35 | total_timesteps 1123.
Path 36 | total_timesteps 1156.
Path 37 | total_timesteps 1219.
Path 38 | total_timesteps 1244.
Path 39 | total_timesteps 1271.
Path 40 | total_timesteps 1303.
Path 41 | total_timesteps 1355.
Path 42 | total_timesteps 1378.
Path 43 | total_timesteps 1413.
Path 44 | total_timesteps 1440.
Path 45 | total_timesteps 1475.
Path 46 | total_timesteps 1510.
Path 47 | total_timesteps 1538.
Path 48 | total_timesteps 1568.
Path 49 | total_timesteps 1603.
Path 50 | total_timesteps 1638.
Path 51 | total_timesteps 1659.
Path 52 | total_timesteps 1680.
Path 53 | total_timesteps 1711.
Path 54 | total_timesteps 1740.
Path 55 | total_timesteps 1774.
Path 56 | total_timesteps 1827.
Path 57 | total_timesteps 1856.
Path 58 | total_timesteps 1872.
Path 59 | total_timesteps 1909.
Path 60 | total_timesteps 1956.
Path 61 | total_timesteps 1986.
Path 62 | total_timesteps 2016.
Path 63 | total_timesteps 2030.
Path 64 | total_timesteps 2049.
Path 65 | total_timesteps 2078.
Path 66 | total_timesteps 2110.
Path 67 | total_timesteps 2147.
Path 68 | total_timesteps 2171.
Path 69 | total_timesteps 2198.
Path 70 | total_timesteps 2222.
Path 71 | total_timesteps 2281.
Path 72 | total_timesteps 2332.
Path 73 | total_timesteps 2368.
Path 74 | total_timesteps 2401.
Path 75 | total_timesteps 2464.
Path 76 | total_timesteps 2486.
Path 77 | total_timesteps 2535.
Path 78 | total_timesteps 2567.
Path 79 | total_timesteps 2578.
Path 80 | total_timesteps 2632.
Path 81 | total_timesteps 2664.
Path 82 | total_timesteps 2684.
Path 83 | total_timesteps 2713.
Path 84 | total_timesteps 2764.
Path 85 | total_timesteps 2817.
Path 86 | total_timesteps 2846.
Path 87 | total_timesteps 2910.
Path 88 | total_timesteps 2966.
Path 89 | total_timesteps 2994.
Path 90 | total_timesteps 3004.
Path 91 | total_timesteps 3029.
Path 92 | total_timesteps 3055.
Path 93 | total_timesteps 3081.
Path 94 | total_timesteps 3097.
Path 95 | total_timesteps 3132.
Path 96 | total_timesteps 3159.
Path 97 | total_timesteps 3187.
Path 98 | total_timesteps 3230.
Path 99 | total_timesteps 3269.
Path 100 | total_timesteps 3312.
Path 101 | total_timesteps 3346.
Path 102 | total_timesteps 3406.
Path 103 | total_timesteps 3447.
Path 104 | total_timesteps 3482.
Path 105 | total_timesteps 3526.
Path 106 | total_timesteps 3540.
Path 107 | total_timesteps 3604.
Path 108 | total_timesteps 3633.
Path 109 | total_timesteps 3645.
Path 110 | total_timesteps 3683.
Path 111 | total_timesteps 3737.
Path 112 | total_timesteps 3776.
Path 113 | total_timesteps 3791.
Path 114 | total_timesteps 3801.
Path 115 | total_timesteps 3854.
Path 116 | total_timesteps 3886.
Path 117 | total_timesteps 3926.
Path 118 | total_timesteps 3964.
Path 119 | total_timesteps 3993.
Path 120 | total_timesteps 4030.
Path 121 | total_timesteps 4071.
Path 122 | total_timesteps 4098.
Path 123 | total_timesteps 4129.
Path 124 | total_timesteps 4151.
Path 125 | total_timesteps 4191.
Path 126 | total_timesteps 4218.
Path 127 | total_timesteps 4271.
Path 128 | total_timesteps 4291.
Path 129 | total_timesteps 4315.
Path 130 | total_timesteps 4340.
Path 131 | total_timesteps 4374.
Path 132 | total_timesteps 4453.
Path 133 | total_timesteps 4471.
Path 134 | total_timesteps 4501.
Path 135 | total_timesteps 4530.
Path 136 | total_timesteps 4563.
Path 137 | total_timesteps 4618.
Path 138 | total_timesteps 4632.
Path 139 | total_timesteps 4652.
Path 140 | total_timesteps 4701.
Path 141 | total_timesteps 4760.
Path 142 | total_timesteps 4806.
Path 143 | total_timesteps 4851.
Path 144 | total_timesteps 4879.
Path 145 | total_timesteps 4906.
Path 146 | total_timesteps 4934.
Path 147 | total_timesteps 4957.
Path 148 | total_timesteps 4990.
Path 149 | total_timesteps 5016.
Path 150 | total_timesteps 5035.
Path 151 | total_timesteps 5057.
Path 152 | total_timesteps 5097.
Path 153 | total_timesteps 5144.
Path 154 | total_timesteps 5177.
Path 155 | total_timesteps 5215.
Path 156 | total_timesteps 5256.
Path 157 | total_timesteps 5291.
Path 158 | total_timesteps 5343.
Path 159 | total_timesteps 5407.
Path 160 | total_timesteps 5422.
Path 161 | total_timesteps 5439.
Path 162 | total_timesteps 5460.
Path 163 | total_timesteps 5481.
Path 164 | total_timesteps 5514.
Path 165 | total_timesteps 5547.
Path 166 | total_timesteps 5582.
Path 167 | total_timesteps 5598.
Path 168 | total_timesteps 5619.
Path 169 | total_timesteps 5685.
Path 170 | total_timesteps 5712.
Path 171 | total_timesteps 5731.
Path 172 | total_timesteps 5754.
Path 173 | total_timesteps 5787.
Path 174 | total_timesteps 5831.
Path 175 | total_timesteps 5876.
Path 176 | total_timesteps 5904.
Path 177 | total_timesteps 5957.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -14.6    |
| Iteration     | 7        |
| MaximumReturn | 2.08     |
| MinimumReturn | -37.5    |
| TotalSamples  | 36133    |
----------------------------
itr #8 | 
Fitting dynamics.
Validation loss = 0.3915039300918579
Validation loss = 0.3935306966304779
Validation loss = 0.4040870666503906
Validation loss = 0.4102563261985779
Validation loss = 0.4173557758331299
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 47.
Path 2 | total_timesteps 95.
Path 3 | total_timesteps 109.
Path 4 | total_timesteps 128.
Path 5 | total_timesteps 160.
Path 6 | total_timesteps 192.
Path 7 | total_timesteps 217.
Path 8 | total_timesteps 239.
Path 9 | total_timesteps 278.
Path 10 | total_timesteps 296.
Path 11 | total_timesteps 370.
Path 12 | total_timesteps 401.
Path 13 | total_timesteps 449.
Path 14 | total_timesteps 491.
Path 15 | total_timesteps 510.
Path 16 | total_timesteps 538.
Path 17 | total_timesteps 549.
Path 18 | total_timesteps 611.
Path 19 | total_timesteps 635.
Path 20 | total_timesteps 673.
Path 21 | total_timesteps 711.
Path 22 | total_timesteps 750.
Path 23 | total_timesteps 765.
Path 24 | total_timesteps 784.
Path 25 | total_timesteps 815.
Path 26 | total_timesteps 839.
Path 27 | total_timesteps 868.
Path 28 | total_timesteps 889.
Path 29 | total_timesteps 910.
Path 30 | total_timesteps 943.
Path 31 | total_timesteps 974.
Path 32 | total_timesteps 1004.
Path 33 | total_timesteps 1031.
Path 34 | total_timesteps 1053.
Path 35 | total_timesteps 1085.
Path 36 | total_timesteps 1117.
Path 37 | total_timesteps 1152.
Path 38 | total_timesteps 1174.
Path 39 | total_timesteps 1192.
Path 40 | total_timesteps 1215.
Path 41 | total_timesteps 1252.
Path 42 | total_timesteps 1276.
Path 43 | total_timesteps 1309.
Path 44 | total_timesteps 1334.
Path 45 | total_timesteps 1359.
Path 46 | total_timesteps 1409.
Path 47 | total_timesteps 1436.
Path 48 | total_timesteps 1479.
Path 49 | total_timesteps 1521.
Path 50 | total_timesteps 1593.
Path 51 | total_timesteps 1624.
Path 52 | total_timesteps 1658.
Path 53 | total_timesteps 1694.
Path 54 | total_timesteps 1705.
Path 55 | total_timesteps 1731.
Path 56 | total_timesteps 1758.
Path 57 | total_timesteps 1790.
Path 58 | total_timesteps 1822.
Path 59 | total_timesteps 1868.
Path 60 | total_timesteps 1898.
Path 61 | total_timesteps 1934.
Path 62 | total_timesteps 1963.
Path 63 | total_timesteps 1983.
Path 64 | total_timesteps 2015.
Path 65 | total_timesteps 2043.
Path 66 | total_timesteps 2075.
Path 67 | total_timesteps 2105.
Path 68 | total_timesteps 2134.
Path 69 | total_timesteps 2162.
Path 70 | total_timesteps 2191.
Path 71 | total_timesteps 2221.
Path 72 | total_timesteps 2239.
Path 73 | total_timesteps 2265.
Path 74 | total_timesteps 2278.
Path 75 | total_timesteps 2306.
Path 76 | total_timesteps 2325.
Path 77 | total_timesteps 2373.
Path 78 | total_timesteps 2404.
Path 79 | total_timesteps 2452.
Path 80 | total_timesteps 2492.
Path 81 | total_timesteps 2519.
Path 82 | total_timesteps 2547.
Path 83 | total_timesteps 2571.
Path 84 | total_timesteps 2600.
Path 85 | total_timesteps 2626.
Path 86 | total_timesteps 2645.
Path 87 | total_timesteps 2701.
Path 88 | total_timesteps 2739.
Path 89 | total_timesteps 2758.
Path 90 | total_timesteps 2795.
Path 91 | total_timesteps 2846.
Path 92 | total_timesteps 2864.
Path 93 | total_timesteps 2879.
Path 94 | total_timesteps 2906.
Path 95 | total_timesteps 2926.
Path 96 | total_timesteps 2964.
Path 97 | total_timesteps 3006.
Path 98 | total_timesteps 3056.
Path 99 | total_timesteps 3080.
Path 100 | total_timesteps 3110.
Path 101 | total_timesteps 3170.
Path 102 | total_timesteps 3191.
Path 103 | total_timesteps 3268.
Path 104 | total_timesteps 3307.
Path 105 | total_timesteps 3341.
Path 106 | total_timesteps 3360.
Path 107 | total_timesteps 3404.
Path 108 | total_timesteps 3424.
Path 109 | total_timesteps 3439.
Path 110 | total_timesteps 3470.
Path 111 | total_timesteps 3498.
Path 112 | total_timesteps 3523.
Path 113 | total_timesteps 3571.
Path 114 | total_timesteps 3604.
Path 115 | total_timesteps 3628.
Path 116 | total_timesteps 3648.
Path 117 | total_timesteps 3668.
Path 118 | total_timesteps 3698.
Path 119 | total_timesteps 3722.
Path 120 | total_timesteps 3741.
Path 121 | total_timesteps 3769.
Path 122 | total_timesteps 3797.
Path 123 | total_timesteps 3823.
Path 124 | total_timesteps 3843.
Path 125 | total_timesteps 3880.
Path 126 | total_timesteps 3921.
Path 127 | total_timesteps 3945.
Path 128 | total_timesteps 3996.
Path 129 | total_timesteps 4100.
Path 130 | total_timesteps 4133.
Path 131 | total_timesteps 4158.
Path 132 | total_timesteps 4184.
Path 133 | total_timesteps 4202.
Path 134 | total_timesteps 4240.
Path 135 | total_timesteps 4263.
Path 136 | total_timesteps 4284.
Path 137 | total_timesteps 4300.
Path 138 | total_timesteps 4325.
Path 139 | total_timesteps 4351.
Path 140 | total_timesteps 4367.
Path 141 | total_timesteps 4382.
Path 142 | total_timesteps 4420.
Path 143 | total_timesteps 4456.
Path 144 | total_timesteps 4466.
Path 145 | total_timesteps 4489.
Path 146 | total_timesteps 4509.
Path 147 | total_timesteps 4556.
Path 148 | total_timesteps 4580.
Path 149 | total_timesteps 4629.
Path 150 | total_timesteps 4662.
Path 151 | total_timesteps 4681.
Path 152 | total_timesteps 4710.
Path 153 | total_timesteps 4736.
Path 154 | total_timesteps 4765.
Path 155 | total_timesteps 4790.
Path 156 | total_timesteps 4834.
Path 157 | total_timesteps 4891.
Path 158 | total_timesteps 4922.
Path 159 | total_timesteps 4978.
Path 160 | total_timesteps 5019.
Path 161 | total_timesteps 5052.
Path 162 | total_timesteps 5084.
Path 163 | total_timesteps 5118.
Path 164 | total_timesteps 5136.
Path 165 | total_timesteps 5183.
Path 166 | total_timesteps 5235.
Path 167 | total_timesteps 5260.
Path 168 | total_timesteps 5301.
Path 169 | total_timesteps 5331.
Path 170 | total_timesteps 5385.
Path 171 | total_timesteps 5429.
Path 172 | total_timesteps 5456.
Path 173 | total_timesteps 5480.
Path 174 | total_timesteps 5505.
Path 175 | total_timesteps 5537.
Path 176 | total_timesteps 5563.
Path 177 | total_timesteps 5604.
Path 178 | total_timesteps 5629.
Path 179 | total_timesteps 5660.
Path 180 | total_timesteps 5685.
Path 181 | total_timesteps 5718.
Path 182 | total_timesteps 5742.
Path 183 | total_timesteps 5763.
Path 184 | total_timesteps 5783.
Path 185 | total_timesteps 5817.
Path 186 | total_timesteps 5860.
Path 187 | total_timesteps 5890.
Path 188 | total_timesteps 5904.
Path 189 | total_timesteps 5916.
Path 190 | total_timesteps 5940.
Path 191 | total_timesteps 5995.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -14.2    |
| Iteration     | 8        |
| MaximumReturn | 13.7     |
| MinimumReturn | -40      |
| TotalSamples  | 40154    |
----------------------------
itr #9 | 
Fitting dynamics.
Validation loss = 0.3976316750049591
Validation loss = 0.4033186435699463
Validation loss = 0.4111161231994629
Validation loss = 0.4131980836391449
Validation loss = 0.4141000211238861
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 10.
Path 2 | total_timesteps 41.
Path 3 | total_timesteps 52.
Path 4 | total_timesteps 89.
Path 5 | total_timesteps 123.
Path 6 | total_timesteps 159.
Path 7 | total_timesteps 218.
Path 8 | total_timesteps 254.
Path 9 | total_timesteps 277.
Path 10 | total_timesteps 308.
Path 11 | total_timesteps 340.
Path 12 | total_timesteps 383.
Path 13 | total_timesteps 409.
Path 14 | total_timesteps 424.
Path 15 | total_timesteps 439.
Path 16 | total_timesteps 477.
Path 17 | total_timesteps 502.
Path 18 | total_timesteps 564.
Path 19 | total_timesteps 597.
Path 20 | total_timesteps 630.
Path 21 | total_timesteps 652.
Path 22 | total_timesteps 709.
Path 23 | total_timesteps 781.
Path 24 | total_timesteps 830.
Path 25 | total_timesteps 852.
Path 26 | total_timesteps 890.
Path 27 | total_timesteps 921.
Path 28 | total_timesteps 952.
Path 29 | total_timesteps 974.
Path 30 | total_timesteps 1000.
Path 31 | total_timesteps 1025.
Path 32 | total_timesteps 1060.
Path 33 | total_timesteps 1095.
Path 34 | total_timesteps 1137.
Path 35 | total_timesteps 1171.
Path 36 | total_timesteps 1200.
Path 37 | total_timesteps 1225.
Path 38 | total_timesteps 1267.
Path 39 | total_timesteps 1283.
Path 40 | total_timesteps 1312.
Path 41 | total_timesteps 1342.
Path 42 | total_timesteps 1362.
Path 43 | total_timesteps 1391.
Path 44 | total_timesteps 1431.
Path 45 | total_timesteps 1457.
Path 46 | total_timesteps 1495.
Path 47 | total_timesteps 1537.
Path 48 | total_timesteps 1571.
Path 49 | total_timesteps 1586.
Path 50 | total_timesteps 1637.
Path 51 | total_timesteps 1664.
Path 52 | total_timesteps 1715.
Path 53 | total_timesteps 1745.
Path 54 | total_timesteps 1793.
Path 55 | total_timesteps 1830.
Path 56 | total_timesteps 1860.
Path 57 | total_timesteps 1878.
Path 58 | total_timesteps 1900.
Path 59 | total_timesteps 1920.
Path 60 | total_timesteps 1944.
Path 61 | total_timesteps 1989.
Path 62 | total_timesteps 2009.
Path 63 | total_timesteps 2040.
Path 64 | total_timesteps 2074.
Path 65 | total_timesteps 2097.
Path 66 | total_timesteps 2139.
Path 67 | total_timesteps 2175.
Path 68 | total_timesteps 2212.
Path 69 | total_timesteps 2246.
Path 70 | total_timesteps 2267.
Path 71 | total_timesteps 2306.
Path 72 | total_timesteps 2342.
Path 73 | total_timesteps 2378.
Path 74 | total_timesteps 2426.
Path 75 | total_timesteps 2460.
Path 76 | total_timesteps 2478.
Path 77 | total_timesteps 2508.
Path 78 | total_timesteps 2545.
Path 79 | total_timesteps 2591.
Path 80 | total_timesteps 2651.
Path 81 | total_timesteps 2686.
Path 82 | total_timesteps 2722.
Path 83 | total_timesteps 2769.
Path 84 | total_timesteps 2792.
Path 85 | total_timesteps 2838.
Path 86 | total_timesteps 2865.
Path 87 | total_timesteps 2904.
Path 88 | total_timesteps 2939.
Path 89 | total_timesteps 2998.
Path 90 | total_timesteps 3035.
Path 91 | total_timesteps 3063.
Path 92 | total_timesteps 3091.
Path 93 | total_timesteps 3141.
Path 94 | total_timesteps 3191.
Path 95 | total_timesteps 3207.
Path 96 | total_timesteps 3233.
Path 97 | total_timesteps 3260.
Path 98 | total_timesteps 3302.
Path 99 | total_timesteps 3338.
Path 100 | total_timesteps 3375.
Path 101 | total_timesteps 3407.
Path 102 | total_timesteps 3423.
Path 103 | total_timesteps 3468.
Path 104 | total_timesteps 3510.
Path 105 | total_timesteps 3548.
Path 106 | total_timesteps 3570.
Path 107 | total_timesteps 3612.
Path 108 | total_timesteps 3662.
Path 109 | total_timesteps 3689.
Path 110 | total_timesteps 3724.
Path 111 | total_timesteps 3763.
Path 112 | total_timesteps 3803.
Path 113 | total_timesteps 3895.
Path 114 | total_timesteps 3922.
Path 115 | total_timesteps 3967.
Path 116 | total_timesteps 4044.
Path 117 | total_timesteps 4091.
Path 118 | total_timesteps 4120.
Path 119 | total_timesteps 4144.
Path 120 | total_timesteps 4177.
Path 121 | total_timesteps 4216.
Path 122 | total_timesteps 4239.
Path 123 | total_timesteps 4271.
Path 124 | total_timesteps 4284.
Path 125 | total_timesteps 4324.
Path 126 | total_timesteps 4359.
Path 127 | total_timesteps 4386.
Path 128 | total_timesteps 4407.
Path 129 | total_timesteps 4430.
Path 130 | total_timesteps 4472.
Path 131 | total_timesteps 4508.
Path 132 | total_timesteps 4545.
Path 133 | total_timesteps 4576.
Path 134 | total_timesteps 4637.
Path 135 | total_timesteps 4668.
Path 136 | total_timesteps 4726.
Path 137 | total_timesteps 4768.
Path 138 | total_timesteps 4779.
Path 139 | total_timesteps 4797.
Path 140 | total_timesteps 4852.
Path 141 | total_timesteps 4889.
Path 142 | total_timesteps 4921.
Path 143 | total_timesteps 4931.
Path 144 | total_timesteps 4940.
Path 145 | total_timesteps 4966.
Path 146 | total_timesteps 5013.
Path 147 | total_timesteps 5027.
Path 148 | total_timesteps 5048.
Path 149 | total_timesteps 5069.
Path 150 | total_timesteps 5090.
Path 151 | total_timesteps 5145.
Path 152 | total_timesteps 5162.
Path 153 | total_timesteps 5179.
Path 154 | total_timesteps 5207.
Path 155 | total_timesteps 5224.
Path 156 | total_timesteps 5256.
Path 157 | total_timesteps 5275.
Path 158 | total_timesteps 5307.
Path 159 | total_timesteps 5341.
Path 160 | total_timesteps 5360.
Path 161 | total_timesteps 5415.
Path 162 | total_timesteps 5436.
Path 163 | total_timesteps 5494.
Path 164 | total_timesteps 5539.
Path 165 | total_timesteps 5555.
Path 166 | total_timesteps 5580.
Path 167 | total_timesteps 5618.
Path 168 | total_timesteps 5654.
Path 169 | total_timesteps 5714.
Path 170 | total_timesteps 5758.
Path 171 | total_timesteps 5795.
Path 172 | total_timesteps 5830.
Path 173 | total_timesteps 5875.
Path 174 | total_timesteps 5917.
Path 175 | total_timesteps 5928.
Path 176 | total_timesteps 5972.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -14.8    |
| Iteration     | 9        |
| MaximumReturn | 0.453    |
| MinimumReturn | -46.3    |
| TotalSamples  | 44174    |
----------------------------
itr #10 | 
Fitting dynamics.
Validation loss = 0.3969491720199585
Validation loss = 0.4053225517272949
Validation loss = 0.4131060540676117
Validation loss = 0.4140610992908478
Validation loss = 0.42194491624832153
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 56.
Path 2 | total_timesteps 68.
Path 3 | total_timesteps 116.
Path 4 | total_timesteps 160.
Path 5 | total_timesteps 206.
Path 6 | total_timesteps 249.
Path 7 | total_timesteps 287.
Path 8 | total_timesteps 313.
Path 9 | total_timesteps 330.
Path 10 | total_timesteps 351.
Path 11 | total_timesteps 386.
Path 12 | total_timesteps 409.
Path 13 | total_timesteps 459.
Path 14 | total_timesteps 522.
Path 15 | total_timesteps 577.
Path 16 | total_timesteps 600.
Path 17 | total_timesteps 613.
Path 18 | total_timesteps 643.
Path 19 | total_timesteps 716.
Path 20 | total_timesteps 741.
Path 21 | total_timesteps 777.
Path 22 | total_timesteps 805.
Path 23 | total_timesteps 842.
Path 24 | total_timesteps 898.
Path 25 | total_timesteps 935.
Path 26 | total_timesteps 978.
Path 27 | total_timesteps 1011.
Path 28 | total_timesteps 1042.
Path 29 | total_timesteps 1067.
Path 30 | total_timesteps 1100.
Path 31 | total_timesteps 1122.
Path 32 | total_timesteps 1150.
Path 33 | total_timesteps 1214.
Path 34 | total_timesteps 1269.
Path 35 | total_timesteps 1295.
Path 36 | total_timesteps 1317.
Path 37 | total_timesteps 1367.
Path 38 | total_timesteps 1396.
Path 39 | total_timesteps 1434.
Path 40 | total_timesteps 1470.
Path 41 | total_timesteps 1519.
Path 42 | total_timesteps 1548.
Path 43 | total_timesteps 1578.
Path 44 | total_timesteps 1589.
Path 45 | total_timesteps 1609.
Path 46 | total_timesteps 1642.
Path 47 | total_timesteps 1674.
Path 48 | total_timesteps 1729.
Path 49 | total_timesteps 1761.
Path 50 | total_timesteps 1785.
Path 51 | total_timesteps 1835.
Path 52 | total_timesteps 1850.
Path 53 | total_timesteps 1888.
Path 54 | total_timesteps 1941.
Path 55 | total_timesteps 1977.
Path 56 | total_timesteps 1997.
Path 57 | total_timesteps 2016.
Path 58 | total_timesteps 2044.
Path 59 | total_timesteps 2086.
Path 60 | total_timesteps 2127.
Path 61 | total_timesteps 2154.
Path 62 | total_timesteps 2193.
Path 63 | total_timesteps 2209.
Path 64 | total_timesteps 2257.
Path 65 | total_timesteps 2304.
Path 66 | total_timesteps 2335.
Path 67 | total_timesteps 2369.
Path 68 | total_timesteps 2404.
Path 69 | total_timesteps 2454.
Path 70 | total_timesteps 2473.
Path 71 | total_timesteps 2533.
Path 72 | total_timesteps 2566.
Path 73 | total_timesteps 2603.
Path 74 | total_timesteps 2633.
Path 75 | total_timesteps 2677.
Path 76 | total_timesteps 2693.
Path 77 | total_timesteps 2733.
Path 78 | total_timesteps 2765.
Path 79 | total_timesteps 2784.
Path 80 | total_timesteps 2808.
Path 81 | total_timesteps 2871.
Path 82 | total_timesteps 2887.
Path 83 | total_timesteps 2919.
Path 84 | total_timesteps 2962.
Path 85 | total_timesteps 2972.
Path 86 | total_timesteps 3006.
Path 87 | total_timesteps 3041.
Path 88 | total_timesteps 3069.
Path 89 | total_timesteps 3113.
Path 90 | total_timesteps 3152.
Path 91 | total_timesteps 3211.
Path 92 | total_timesteps 3226.
Path 93 | total_timesteps 3278.
Path 94 | total_timesteps 3316.
Path 95 | total_timesteps 3347.
Path 96 | total_timesteps 3366.
Path 97 | total_timesteps 3417.
Path 98 | total_timesteps 3447.
Path 99 | total_timesteps 3497.
Path 100 | total_timesteps 3518.
Path 101 | total_timesteps 3535.
Path 102 | total_timesteps 3565.
Path 103 | total_timesteps 3582.
Path 104 | total_timesteps 3628.
Path 105 | total_timesteps 3650.
Path 106 | total_timesteps 3679.
Path 107 | total_timesteps 3726.
Path 108 | total_timesteps 3746.
Path 109 | total_timesteps 3789.
Path 110 | total_timesteps 3806.
Path 111 | total_timesteps 3847.
Path 112 | total_timesteps 3856.
Path 113 | total_timesteps 3893.
Path 114 | total_timesteps 3946.
Path 115 | total_timesteps 3991.
Path 116 | total_timesteps 4027.
Path 117 | total_timesteps 4051.
Path 118 | total_timesteps 4064.
Path 119 | total_timesteps 4099.
Path 120 | total_timesteps 4134.
Path 121 | total_timesteps 4155.
Path 122 | total_timesteps 4179.
Path 123 | total_timesteps 4215.
Path 124 | total_timesteps 4235.
Path 125 | total_timesteps 4271.
Path 126 | total_timesteps 4314.
Path 127 | total_timesteps 4352.
Path 128 | total_timesteps 4373.
Path 129 | total_timesteps 4419.
Path 130 | total_timesteps 4441.
Path 131 | total_timesteps 4479.
Path 132 | total_timesteps 4511.
Path 133 | total_timesteps 4556.
Path 134 | total_timesteps 4589.
Path 135 | total_timesteps 4596.
Path 136 | total_timesteps 4631.
Path 137 | total_timesteps 4663.
Path 138 | total_timesteps 4700.
Path 139 | total_timesteps 4746.
Path 140 | total_timesteps 4768.
Path 141 | total_timesteps 4793.
Path 142 | total_timesteps 4819.
Path 143 | total_timesteps 4843.
Path 144 | total_timesteps 4881.
Path 145 | total_timesteps 4904.
Path 146 | total_timesteps 4943.
Path 147 | total_timesteps 4978.
Path 148 | total_timesteps 4999.
Path 149 | total_timesteps 5039.
Path 150 | total_timesteps 5067.
Path 151 | total_timesteps 5112.
Path 152 | total_timesteps 5162.
Path 153 | total_timesteps 5197.
Path 154 | total_timesteps 5227.
Path 155 | total_timesteps 5257.
Path 156 | total_timesteps 5281.
Path 157 | total_timesteps 5329.
Path 158 | total_timesteps 5356.
Path 159 | total_timesteps 5371.
Path 160 | total_timesteps 5412.
Path 161 | total_timesteps 5462.
Path 162 | total_timesteps 5487.
Path 163 | total_timesteps 5512.
Path 164 | total_timesteps 5532.
Path 165 | total_timesteps 5544.
Path 166 | total_timesteps 5565.
Path 167 | total_timesteps 5590.
Path 168 | total_timesteps 5620.
Path 169 | total_timesteps 5647.
Path 170 | total_timesteps 5691.
Path 171 | total_timesteps 5714.
Path 172 | total_timesteps 5741.
Path 173 | total_timesteps 5770.
Path 174 | total_timesteps 5795.
Path 175 | total_timesteps 5859.
Path 176 | total_timesteps 5895.
Path 177 | total_timesteps 5926.
Path 178 | total_timesteps 5975.
Path 179 | total_timesteps 5996.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.2    |
| Iteration     | 10       |
| MaximumReturn | 14.2     |
| MinimumReturn | -42.3    |
| TotalSamples  | 48192    |
----------------------------
itr #11 | 
Fitting dynamics.
Validation loss = 0.3981383740901947
Validation loss = 0.4102881848812103
Validation loss = 0.4152165353298187
Validation loss = 0.4146801233291626
Validation loss = 0.41884538531303406
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 29.
Path 2 | total_timesteps 65.
Path 3 | total_timesteps 97.
Path 4 | total_timesteps 115.
Path 5 | total_timesteps 150.
Path 6 | total_timesteps 176.
Path 7 | total_timesteps 217.
Path 8 | total_timesteps 252.
Path 9 | total_timesteps 273.
Path 10 | total_timesteps 305.
Path 11 | total_timesteps 340.
Path 12 | total_timesteps 376.
Path 13 | total_timesteps 396.
Path 14 | total_timesteps 428.
Path 15 | total_timesteps 464.
Path 16 | total_timesteps 486.
Path 17 | total_timesteps 495.
Path 18 | total_timesteps 545.
Path 19 | total_timesteps 559.
Path 20 | total_timesteps 580.
Path 21 | total_timesteps 609.
Path 22 | total_timesteps 655.
Path 23 | total_timesteps 696.
Path 24 | total_timesteps 737.
Path 25 | total_timesteps 758.
Path 26 | total_timesteps 796.
Path 27 | total_timesteps 848.
Path 28 | total_timesteps 902.
Path 29 | total_timesteps 938.
Path 30 | total_timesteps 994.
Path 31 | total_timesteps 1017.
Path 32 | total_timesteps 1067.
Path 33 | total_timesteps 1087.
Path 34 | total_timesteps 1123.
Path 35 | total_timesteps 1143.
Path 36 | total_timesteps 1177.
Path 37 | total_timesteps 1207.
Path 38 | total_timesteps 1260.
Path 39 | total_timesteps 1289.
Path 40 | total_timesteps 1349.
Path 41 | total_timesteps 1392.
Path 42 | total_timesteps 1440.
Path 43 | total_timesteps 1467.
Path 44 | total_timesteps 1512.
Path 45 | total_timesteps 1537.
Path 46 | total_timesteps 1561.
Path 47 | total_timesteps 1602.
Path 48 | total_timesteps 1665.
Path 49 | total_timesteps 1701.
Path 50 | total_timesteps 1750.
Path 51 | total_timesteps 1788.
Path 52 | total_timesteps 1818.
Path 53 | total_timesteps 1852.
Path 54 | total_timesteps 1870.
Path 55 | total_timesteps 1923.
Path 56 | total_timesteps 1962.
Path 57 | total_timesteps 1993.
Path 58 | total_timesteps 2015.
Path 59 | total_timesteps 2062.
Path 60 | total_timesteps 2086.
Path 61 | total_timesteps 2127.
Path 62 | total_timesteps 2183.
Path 63 | total_timesteps 2226.
Path 64 | total_timesteps 2250.
Path 65 | total_timesteps 2299.
Path 66 | total_timesteps 2327.
Path 67 | total_timesteps 2356.
Path 68 | total_timesteps 2378.
Path 69 | total_timesteps 2404.
Path 70 | total_timesteps 2432.
Path 71 | total_timesteps 2462.
Path 72 | total_timesteps 2504.
Path 73 | total_timesteps 2572.
Path 74 | total_timesteps 2595.
Path 75 | total_timesteps 2618.
Path 76 | total_timesteps 2642.
Path 77 | total_timesteps 2666.
Path 78 | total_timesteps 2694.
Path 79 | total_timesteps 2727.
Path 80 | total_timesteps 2748.
Path 81 | total_timesteps 2765.
Path 82 | total_timesteps 2805.
Path 83 | total_timesteps 2821.
Path 84 | total_timesteps 2846.
Path 85 | total_timesteps 2880.
Path 86 | total_timesteps 2918.
Path 87 | total_timesteps 2947.
Path 88 | total_timesteps 2976.
Path 89 | total_timesteps 3006.
Path 90 | total_timesteps 3041.
Path 91 | total_timesteps 3064.
Path 92 | total_timesteps 3118.
Path 93 | total_timesteps 3149.
Path 94 | total_timesteps 3172.
Path 95 | total_timesteps 3209.
Path 96 | total_timesteps 3280.
Path 97 | total_timesteps 3317.
Path 98 | total_timesteps 3348.
Path 99 | total_timesteps 3365.
Path 100 | total_timesteps 3392.
Path 101 | total_timesteps 3427.
Path 102 | total_timesteps 3482.
Path 103 | total_timesteps 3512.
Path 104 | total_timesteps 3538.
Path 105 | total_timesteps 3560.
Path 106 | total_timesteps 3601.
Path 107 | total_timesteps 3651.
Path 108 | total_timesteps 3678.
Path 109 | total_timesteps 3704.
Path 110 | total_timesteps 3749.
Path 111 | total_timesteps 3780.
Path 112 | total_timesteps 3825.
Path 113 | total_timesteps 3857.
Path 114 | total_timesteps 3884.
Path 115 | total_timesteps 3910.
Path 116 | total_timesteps 3940.
Path 117 | total_timesteps 3970.
Path 118 | total_timesteps 3987.
Path 119 | total_timesteps 4021.
Path 120 | total_timesteps 4050.
Path 121 | total_timesteps 4072.
Path 122 | total_timesteps 4092.
Path 123 | total_timesteps 4121.
Path 124 | total_timesteps 4150.
Path 125 | total_timesteps 4177.
Path 126 | total_timesteps 4230.
Path 127 | total_timesteps 4255.
Path 128 | total_timesteps 4271.
Path 129 | total_timesteps 4305.
Path 130 | total_timesteps 4338.
Path 131 | total_timesteps 4364.
Path 132 | total_timesteps 4408.
Path 133 | total_timesteps 4458.
Path 134 | total_timesteps 4505.
Path 135 | total_timesteps 4536.
Path 136 | total_timesteps 4566.
Path 137 | total_timesteps 4586.
Path 138 | total_timesteps 4621.
Path 139 | total_timesteps 4666.
Path 140 | total_timesteps 4695.
Path 141 | total_timesteps 4761.
Path 142 | total_timesteps 4787.
Path 143 | total_timesteps 4817.
Path 144 | total_timesteps 4836.
Path 145 | total_timesteps 4858.
Path 146 | total_timesteps 4898.
Path 147 | total_timesteps 4925.
Path 148 | total_timesteps 4967.
Path 149 | total_timesteps 4993.
Path 150 | total_timesteps 5028.
Path 151 | total_timesteps 5053.
Path 152 | total_timesteps 5078.
Path 153 | total_timesteps 5099.
Path 154 | total_timesteps 5142.
Path 155 | total_timesteps 5177.
Path 156 | total_timesteps 5199.
Path 157 | total_timesteps 5233.
Path 158 | total_timesteps 5262.
Path 159 | total_timesteps 5286.
Path 160 | total_timesteps 5313.
Path 161 | total_timesteps 5341.
Path 162 | total_timesteps 5361.
Path 163 | total_timesteps 5403.
Path 164 | total_timesteps 5425.
Path 165 | total_timesteps 5444.
Path 166 | total_timesteps 5492.
Path 167 | total_timesteps 5514.
Path 168 | total_timesteps 5569.
Path 169 | total_timesteps 5601.
Path 170 | total_timesteps 5629.
Path 171 | total_timesteps 5659.
Path 172 | total_timesteps 5703.
Path 173 | total_timesteps 5710.
Path 174 | total_timesteps 5740.
Path 175 | total_timesteps 5767.
Path 176 | total_timesteps 5790.
Path 177 | total_timesteps 5806.
Path 178 | total_timesteps 5827.
Path 179 | total_timesteps 5864.
Path 180 | total_timesteps 5906.
Path 181 | total_timesteps 5915.
Path 182 | total_timesteps 5962.
Path 183 | total_timesteps 5984.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.7    |
| Iteration     | 11       |
| MaximumReturn | 4.14     |
| MinimumReturn | -44.6    |
| TotalSamples  | 52214    |
----------------------------
itr #12 | 
Fitting dynamics.
Validation loss = 0.4056216776371002
Validation loss = 0.41059714555740356
Validation loss = 0.4107806384563446
Validation loss = 0.417175829410553
Validation loss = 0.4198298454284668
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 33.
Path 2 | total_timesteps 73.
Path 3 | total_timesteps 100.
Path 4 | total_timesteps 153.
Path 5 | total_timesteps 184.
Path 6 | total_timesteps 221.
Path 7 | total_timesteps 261.
Path 8 | total_timesteps 306.
Path 9 | total_timesteps 322.
Path 10 | total_timesteps 345.
Path 11 | total_timesteps 378.
Path 12 | total_timesteps 411.
Path 13 | total_timesteps 466.
Path 14 | total_timesteps 494.
Path 15 | total_timesteps 529.
Path 16 | total_timesteps 582.
Path 17 | total_timesteps 603.
Path 18 | total_timesteps 640.
Path 19 | total_timesteps 668.
Path 20 | total_timesteps 701.
Path 21 | total_timesteps 748.
Path 22 | total_timesteps 763.
Path 23 | total_timesteps 802.
Path 24 | total_timesteps 866.
Path 25 | total_timesteps 890.
Path 26 | total_timesteps 935.
Path 27 | total_timesteps 983.
Path 28 | total_timesteps 1040.
Path 29 | total_timesteps 1075.
Path 30 | total_timesteps 1087.
Path 31 | total_timesteps 1126.
Path 32 | total_timesteps 1144.
Path 33 | total_timesteps 1177.
Path 34 | total_timesteps 1185.
Path 35 | total_timesteps 1207.
Path 36 | total_timesteps 1236.
Path 37 | total_timesteps 1277.
Path 38 | total_timesteps 1338.
Path 39 | total_timesteps 1365.
Path 40 | total_timesteps 1401.
Path 41 | total_timesteps 1465.
Path 42 | total_timesteps 1489.
Path 43 | total_timesteps 1520.
Path 44 | total_timesteps 1561.
Path 45 | total_timesteps 1579.
Path 46 | total_timesteps 1644.
Path 47 | total_timesteps 1689.
Path 48 | total_timesteps 1714.
Path 49 | total_timesteps 1728.
Path 50 | total_timesteps 1755.
Path 51 | total_timesteps 1788.
Path 52 | total_timesteps 1815.
Path 53 | total_timesteps 1873.
Path 54 | total_timesteps 1900.
Path 55 | total_timesteps 1948.
Path 56 | total_timesteps 1994.
Path 57 | total_timesteps 2037.
Path 58 | total_timesteps 2090.
Path 59 | total_timesteps 2130.
Path 60 | total_timesteps 2173.
Path 61 | total_timesteps 2193.
Path 62 | total_timesteps 2239.
Path 63 | total_timesteps 2280.
Path 64 | total_timesteps 2317.
Path 65 | total_timesteps 2342.
Path 66 | total_timesteps 2367.
Path 67 | total_timesteps 2426.
Path 68 | total_timesteps 2475.
Path 69 | total_timesteps 2516.
Path 70 | total_timesteps 2565.
Path 71 | total_timesteps 2590.
Path 72 | total_timesteps 2610.
Path 73 | total_timesteps 2648.
Path 74 | total_timesteps 2681.
Path 75 | total_timesteps 2700.
Path 76 | total_timesteps 2740.
Path 77 | total_timesteps 2785.
Path 78 | total_timesteps 2837.
Path 79 | total_timesteps 2877.
Path 80 | total_timesteps 2903.
Path 81 | total_timesteps 2939.
Path 82 | total_timesteps 2987.
Path 83 | total_timesteps 3007.
Path 84 | total_timesteps 3033.
Path 85 | total_timesteps 3101.
Path 86 | total_timesteps 3148.
Path 87 | total_timesteps 3184.
Path 88 | total_timesteps 3201.
Path 89 | total_timesteps 3236.
Path 90 | total_timesteps 3258.
Path 91 | total_timesteps 3291.
Path 92 | total_timesteps 3331.
Path 93 | total_timesteps 3373.
Path 94 | total_timesteps 3436.
Path 95 | total_timesteps 3473.
Path 96 | total_timesteps 3495.
Path 97 | total_timesteps 3550.
Path 98 | total_timesteps 3571.
Path 99 | total_timesteps 3584.
Path 100 | total_timesteps 3617.
Path 101 | total_timesteps 3674.
Path 102 | total_timesteps 3741.
Path 103 | total_timesteps 3771.
Path 104 | total_timesteps 3803.
Path 105 | total_timesteps 3831.
Path 106 | total_timesteps 3858.
Path 107 | total_timesteps 3882.
Path 108 | total_timesteps 3920.
Path 109 | total_timesteps 3972.
Path 110 | total_timesteps 4005.
Path 111 | total_timesteps 4052.
Path 112 | total_timesteps 4066.
Path 113 | total_timesteps 4095.
Path 114 | total_timesteps 4148.
Path 115 | total_timesteps 4176.
Path 116 | total_timesteps 4210.
Path 117 | total_timesteps 4241.
Path 118 | total_timesteps 4295.
Path 119 | total_timesteps 4331.
Path 120 | total_timesteps 4346.
Path 121 | total_timesteps 4374.
Path 122 | total_timesteps 4417.
Path 123 | total_timesteps 4455.
Path 124 | total_timesteps 4490.
Path 125 | total_timesteps 4523.
Path 126 | total_timesteps 4568.
Path 127 | total_timesteps 4639.
Path 128 | total_timesteps 4664.
Path 129 | total_timesteps 4705.
Path 130 | total_timesteps 4726.
Path 131 | total_timesteps 4774.
Path 132 | total_timesteps 4807.
Path 133 | total_timesteps 4826.
Path 134 | total_timesteps 4869.
Path 135 | total_timesteps 4915.
Path 136 | total_timesteps 4956.
Path 137 | total_timesteps 5019.
Path 138 | total_timesteps 5049.
Path 139 | total_timesteps 5082.
Path 140 | total_timesteps 5101.
Path 141 | total_timesteps 5129.
Path 142 | total_timesteps 5164.
Path 143 | total_timesteps 5211.
Path 144 | total_timesteps 5238.
Path 145 | total_timesteps 5278.
Path 146 | total_timesteps 5303.
Path 147 | total_timesteps 5322.
Path 148 | total_timesteps 5373.
Path 149 | total_timesteps 5409.
Path 150 | total_timesteps 5431.
Path 151 | total_timesteps 5462.
Path 152 | total_timesteps 5500.
Path 153 | total_timesteps 5530.
Path 154 | total_timesteps 5572.
Path 155 | total_timesteps 5632.
Path 156 | total_timesteps 5665.
Path 157 | total_timesteps 5680.
Path 158 | total_timesteps 5715.
Path 159 | total_timesteps 5753.
Path 160 | total_timesteps 5773.
Path 161 | total_timesteps 5828.
Path 162 | total_timesteps 5865.
Path 163 | total_timesteps 5905.
Path 164 | total_timesteps 5939.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.7    |
| Iteration     | 12       |
| MaximumReturn | 6.06     |
| MinimumReturn | -34.7    |
| TotalSamples  | 56216    |
----------------------------
itr #13 | 
Fitting dynamics.
Validation loss = 0.40281733870506287
Validation loss = 0.41014885902404785
Validation loss = 0.4117373526096344
Validation loss = 0.4143480360507965
Validation loss = 0.415761798620224
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 32.
Path 2 | total_timesteps 58.
Path 3 | total_timesteps 73.
Path 4 | total_timesteps 114.
Path 5 | total_timesteps 160.
Path 6 | total_timesteps 194.
Path 7 | total_timesteps 226.
Path 8 | total_timesteps 263.
Path 9 | total_timesteps 287.
Path 10 | total_timesteps 324.
Path 11 | total_timesteps 349.
Path 12 | total_timesteps 391.
Path 13 | total_timesteps 414.
Path 14 | total_timesteps 456.
Path 15 | total_timesteps 480.
Path 16 | total_timesteps 521.
Path 17 | total_timesteps 565.
Path 18 | total_timesteps 595.
Path 19 | total_timesteps 628.
Path 20 | total_timesteps 648.
Path 21 | total_timesteps 668.
Path 22 | total_timesteps 704.
Path 23 | total_timesteps 729.
Path 24 | total_timesteps 794.
Path 25 | total_timesteps 827.
Path 26 | total_timesteps 851.
Path 27 | total_timesteps 898.
Path 28 | total_timesteps 936.
Path 29 | total_timesteps 981.
Path 30 | total_timesteps 1020.
Path 31 | total_timesteps 1053.
Path 32 | total_timesteps 1096.
Path 33 | total_timesteps 1119.
Path 34 | total_timesteps 1154.
Path 35 | total_timesteps 1175.
Path 36 | total_timesteps 1246.
Path 37 | total_timesteps 1296.
Path 38 | total_timesteps 1328.
Path 39 | total_timesteps 1346.
Path 40 | total_timesteps 1357.
Path 41 | total_timesteps 1391.
Path 42 | total_timesteps 1420.
Path 43 | total_timesteps 1478.
Path 44 | total_timesteps 1525.
Path 45 | total_timesteps 1555.
Path 46 | total_timesteps 1586.
Path 47 | total_timesteps 1643.
Path 48 | total_timesteps 1673.
Path 49 | total_timesteps 1727.
Path 50 | total_timesteps 1753.
Path 51 | total_timesteps 1781.
Path 52 | total_timesteps 1800.
Path 53 | total_timesteps 1829.
Path 54 | total_timesteps 1880.
Path 55 | total_timesteps 1902.
Path 56 | total_timesteps 1930.
Path 57 | total_timesteps 1973.
Path 58 | total_timesteps 1985.
Path 59 | total_timesteps 2035.
Path 60 | total_timesteps 2054.
Path 61 | total_timesteps 2085.
Path 62 | total_timesteps 2122.
Path 63 | total_timesteps 2139.
Path 64 | total_timesteps 2185.
Path 65 | total_timesteps 2213.
Path 66 | total_timesteps 2258.
Path 67 | total_timesteps 2306.
Path 68 | total_timesteps 2351.
Path 69 | total_timesteps 2383.
Path 70 | total_timesteps 2414.
Path 71 | total_timesteps 2466.
Path 72 | total_timesteps 2519.
Path 73 | total_timesteps 2544.
Path 74 | total_timesteps 2574.
Path 75 | total_timesteps 2614.
Path 76 | total_timesteps 2641.
Path 77 | total_timesteps 2671.
Path 78 | total_timesteps 2705.
Path 79 | total_timesteps 2755.
Path 80 | total_timesteps 2816.
Path 81 | total_timesteps 2843.
Path 82 | total_timesteps 2874.
Path 83 | total_timesteps 2911.
Path 84 | total_timesteps 2951.
Path 85 | total_timesteps 3001.
Path 86 | total_timesteps 3030.
Path 87 | total_timesteps 3058.
Path 88 | total_timesteps 3085.
Path 89 | total_timesteps 3152.
Path 90 | total_timesteps 3184.
Path 91 | total_timesteps 3207.
Path 92 | total_timesteps 3236.
Path 93 | total_timesteps 3270.
Path 94 | total_timesteps 3288.
Path 95 | total_timesteps 3328.
Path 96 | total_timesteps 3363.
Path 97 | total_timesteps 3435.
Path 98 | total_timesteps 3500.
Path 99 | total_timesteps 3527.
Path 100 | total_timesteps 3546.
Path 101 | total_timesteps 3578.
Path 102 | total_timesteps 3623.
Path 103 | total_timesteps 3655.
Path 104 | total_timesteps 3679.
Path 105 | total_timesteps 3738.
Path 106 | total_timesteps 3772.
Path 107 | total_timesteps 3820.
Path 108 | total_timesteps 3855.
Path 109 | total_timesteps 3887.
Path 110 | total_timesteps 3930.
Path 111 | total_timesteps 3967.
Path 112 | total_timesteps 3985.
Path 113 | total_timesteps 4013.
Path 114 | total_timesteps 4030.
Path 115 | total_timesteps 4051.
Path 116 | total_timesteps 4066.
Path 117 | total_timesteps 4098.
Path 118 | total_timesteps 4131.
Path 119 | total_timesteps 4170.
Path 120 | total_timesteps 4180.
Path 121 | total_timesteps 4206.
Path 122 | total_timesteps 4244.
Path 123 | total_timesteps 4265.
Path 124 | total_timesteps 4327.
Path 125 | total_timesteps 4368.
Path 126 | total_timesteps 4401.
Path 127 | total_timesteps 4425.
Path 128 | total_timesteps 4448.
Path 129 | total_timesteps 4540.
Path 130 | total_timesteps 4568.
Path 131 | total_timesteps 4608.
Path 132 | total_timesteps 4623.
Path 133 | total_timesteps 4653.
Path 134 | total_timesteps 4679.
Path 135 | total_timesteps 4725.
Path 136 | total_timesteps 4764.
Path 137 | total_timesteps 4790.
Path 138 | total_timesteps 4835.
Path 139 | total_timesteps 4865.
Path 140 | total_timesteps 4906.
Path 141 | total_timesteps 4920.
Path 142 | total_timesteps 4939.
Path 143 | total_timesteps 4973.
Path 144 | total_timesteps 4981.
Path 145 | total_timesteps 5007.
Path 146 | total_timesteps 5039.
Path 147 | total_timesteps 5057.
Path 148 | total_timesteps 5085.
Path 149 | total_timesteps 5161.
Path 150 | total_timesteps 5184.
Path 151 | total_timesteps 5215.
Path 152 | total_timesteps 5239.
Path 153 | total_timesteps 5290.
Path 154 | total_timesteps 5321.
Path 155 | total_timesteps 5355.
Path 156 | total_timesteps 5382.
Path 157 | total_timesteps 5424.
Path 158 | total_timesteps 5463.
Path 159 | total_timesteps 5489.
Path 160 | total_timesteps 5503.
Path 161 | total_timesteps 5533.
Path 162 | total_timesteps 5568.
Path 163 | total_timesteps 5618.
Path 164 | total_timesteps 5630.
Path 165 | total_timesteps 5676.
Path 166 | total_timesteps 5695.
Path 167 | total_timesteps 5722.
Path 168 | total_timesteps 5746.
Path 169 | total_timesteps 5778.
Path 170 | total_timesteps 5817.
Path 171 | total_timesteps 5856.
Path 172 | total_timesteps 5880.
Path 173 | total_timesteps 5919.
Path 174 | total_timesteps 5950.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.3    |
| Iteration     | 13       |
| MaximumReturn | 1.49     |
| MinimumReturn | -35.5    |
| TotalSamples  | 60247    |
----------------------------
itr #14 | 
Fitting dynamics.
Validation loss = 0.4057880938053131
Validation loss = 0.4102496802806854
Validation loss = 0.4126063585281372
Validation loss = 0.4133698344230652
Validation loss = 0.41446393728256226
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 13.
Path 2 | total_timesteps 43.
Path 3 | total_timesteps 66.
Path 4 | total_timesteps 88.
Path 5 | total_timesteps 139.
Path 6 | total_timesteps 168.
Path 7 | total_timesteps 214.
Path 8 | total_timesteps 261.
Path 9 | total_timesteps 292.
Path 10 | total_timesteps 325.
Path 11 | total_timesteps 349.
Path 12 | total_timesteps 400.
Path 13 | total_timesteps 425.
Path 14 | total_timesteps 465.
Path 15 | total_timesteps 517.
Path 16 | total_timesteps 535.
Path 17 | total_timesteps 566.
Path 18 | total_timesteps 599.
Path 19 | total_timesteps 630.
Path 20 | total_timesteps 650.
Path 21 | total_timesteps 688.
Path 22 | total_timesteps 704.
Path 23 | total_timesteps 735.
Path 24 | total_timesteps 762.
Path 25 | total_timesteps 817.
Path 26 | total_timesteps 850.
Path 27 | total_timesteps 878.
Path 28 | total_timesteps 893.
Path 29 | total_timesteps 930.
Path 30 | total_timesteps 1020.
Path 31 | total_timesteps 1055.
Path 32 | total_timesteps 1087.
Path 33 | total_timesteps 1144.
Path 34 | total_timesteps 1169.
Path 35 | total_timesteps 1202.
Path 36 | total_timesteps 1242.
Path 37 | total_timesteps 1262.
Path 38 | total_timesteps 1293.
Path 39 | total_timesteps 1327.
Path 40 | total_timesteps 1368.
Path 41 | total_timesteps 1411.
Path 42 | total_timesteps 1443.
Path 43 | total_timesteps 1471.
Path 44 | total_timesteps 1516.
Path 45 | total_timesteps 1566.
Path 46 | total_timesteps 1603.
Path 47 | total_timesteps 1635.
Path 48 | total_timesteps 1691.
Path 49 | total_timesteps 1713.
Path 50 | total_timesteps 1743.
Path 51 | total_timesteps 1782.
Path 52 | total_timesteps 1822.
Path 53 | total_timesteps 1862.
Path 54 | total_timesteps 1888.
Path 55 | total_timesteps 1942.
Path 56 | total_timesteps 2005.
Path 57 | total_timesteps 2045.
Path 58 | total_timesteps 2106.
Path 59 | total_timesteps 2128.
Path 60 | total_timesteps 2154.
Path 61 | total_timesteps 2174.
Path 62 | total_timesteps 2216.
Path 63 | total_timesteps 2244.
Path 64 | total_timesteps 2271.
Path 65 | total_timesteps 2306.
Path 66 | total_timesteps 2329.
Path 67 | total_timesteps 2371.
Path 68 | total_timesteps 2408.
Path 69 | total_timesteps 2448.
Path 70 | total_timesteps 2509.
Path 71 | total_timesteps 2546.
Path 72 | total_timesteps 2573.
Path 73 | total_timesteps 2591.
Path 74 | total_timesteps 2610.
Path 75 | total_timesteps 2646.
Path 76 | total_timesteps 2684.
Path 77 | total_timesteps 2729.
Path 78 | total_timesteps 2774.
Path 79 | total_timesteps 2794.
Path 80 | total_timesteps 2826.
Path 81 | total_timesteps 2856.
Path 82 | total_timesteps 2893.
Path 83 | total_timesteps 2917.
Path 84 | total_timesteps 2976.
Path 85 | total_timesteps 3009.
Path 86 | total_timesteps 3054.
Path 87 | total_timesteps 3076.
Path 88 | total_timesteps 3096.
Path 89 | total_timesteps 3156.
Path 90 | total_timesteps 3191.
Path 91 | total_timesteps 3224.
Path 92 | total_timesteps 3260.
Path 93 | total_timesteps 3304.
Path 94 | total_timesteps 3334.
Path 95 | total_timesteps 3345.
Path 96 | total_timesteps 3385.
Path 97 | total_timesteps 3411.
Path 98 | total_timesteps 3472.
Path 99 | total_timesteps 3493.
Path 100 | total_timesteps 3533.
Path 101 | total_timesteps 3566.
Path 102 | total_timesteps 3585.
Path 103 | total_timesteps 3638.
Path 104 | total_timesteps 3663.
Path 105 | total_timesteps 3685.
Path 106 | total_timesteps 3741.
Path 107 | total_timesteps 3773.
Path 108 | total_timesteps 3798.
Path 109 | total_timesteps 3845.
Path 110 | total_timesteps 3875.
Path 111 | total_timesteps 3900.
Path 112 | total_timesteps 3964.
Path 113 | total_timesteps 4012.
Path 114 | total_timesteps 4050.
Path 115 | total_timesteps 4115.
Path 116 | total_timesteps 4144.
Path 117 | total_timesteps 4184.
Path 118 | total_timesteps 4238.
Path 119 | total_timesteps 4260.
Path 120 | total_timesteps 4275.
Path 121 | total_timesteps 4326.
Path 122 | total_timesteps 4375.
Path 123 | total_timesteps 4401.
Path 124 | total_timesteps 4436.
Path 125 | total_timesteps 4476.
Path 126 | total_timesteps 4497.
Path 127 | total_timesteps 4557.
Path 128 | total_timesteps 4603.
Path 129 | total_timesteps 4631.
Path 130 | total_timesteps 4666.
Path 131 | total_timesteps 4688.
Path 132 | total_timesteps 4708.
Path 133 | total_timesteps 4745.
Path 134 | total_timesteps 4776.
Path 135 | total_timesteps 4809.
Path 136 | total_timesteps 4851.
Path 137 | total_timesteps 4878.
Path 138 | total_timesteps 4897.
Path 139 | total_timesteps 4938.
Path 140 | total_timesteps 4958.
Path 141 | total_timesteps 4981.
Path 142 | total_timesteps 5004.
Path 143 | total_timesteps 5042.
Path 144 | total_timesteps 5061.
Path 145 | total_timesteps 5098.
Path 146 | total_timesteps 5132.
Path 147 | total_timesteps 5155.
Path 148 | total_timesteps 5178.
Path 149 | total_timesteps 5217.
Path 150 | total_timesteps 5241.
Path 151 | total_timesteps 5258.
Path 152 | total_timesteps 5285.
Path 153 | total_timesteps 5311.
Path 154 | total_timesteps 5330.
Path 155 | total_timesteps 5363.
Path 156 | total_timesteps 5396.
Path 157 | total_timesteps 5447.
Path 158 | total_timesteps 5476.
Path 159 | total_timesteps 5501.
Path 160 | total_timesteps 5534.
Path 161 | total_timesteps 5563.
Path 162 | total_timesteps 5592.
Path 163 | total_timesteps 5619.
Path 164 | total_timesteps 5661.
Path 165 | total_timesteps 5708.
Path 166 | total_timesteps 5748.
Path 167 | total_timesteps 5764.
Path 168 | total_timesteps 5793.
Path 169 | total_timesteps 5863.
Path 170 | total_timesteps 5883.
Path 171 | total_timesteps 5923.
Path 172 | total_timesteps 5958.
Path 173 | total_timesteps 5995.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.1    |
| Iteration     | 14       |
| MaximumReturn | 24.5     |
| MinimumReturn | -38.8    |
| TotalSamples  | 64269    |
----------------------------
itr #15 | 
Fitting dynamics.
Validation loss = 0.4037841558456421
Validation loss = 0.4126923084259033
Validation loss = 0.4156036078929901
Validation loss = 0.4188973009586334
Validation loss = 0.41801267862319946
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 31.
Path 2 | total_timesteps 59.
Path 3 | total_timesteps 84.
Path 4 | total_timesteps 105.
Path 5 | total_timesteps 167.
Path 6 | total_timesteps 211.
Path 7 | total_timesteps 248.
Path 8 | total_timesteps 276.
Path 9 | total_timesteps 302.
Path 10 | total_timesteps 347.
Path 11 | total_timesteps 366.
Path 12 | total_timesteps 413.
Path 13 | total_timesteps 432.
Path 14 | total_timesteps 457.
Path 15 | total_timesteps 482.
Path 16 | total_timesteps 500.
Path 17 | total_timesteps 536.
Path 18 | total_timesteps 569.
Path 19 | total_timesteps 615.
Path 20 | total_timesteps 652.
Path 21 | total_timesteps 693.
Path 22 | total_timesteps 705.
Path 23 | total_timesteps 760.
Path 24 | total_timesteps 782.
Path 25 | total_timesteps 813.
Path 26 | total_timesteps 854.
Path 27 | total_timesteps 878.
Path 28 | total_timesteps 908.
Path 29 | total_timesteps 930.
Path 30 | total_timesteps 985.
Path 31 | total_timesteps 1003.
Path 32 | total_timesteps 1034.
Path 33 | total_timesteps 1063.
Path 34 | total_timesteps 1093.
Path 35 | total_timesteps 1120.
Path 36 | total_timesteps 1134.
Path 37 | total_timesteps 1182.
Path 38 | total_timesteps 1210.
Path 39 | total_timesteps 1247.
Path 40 | total_timesteps 1283.
Path 41 | total_timesteps 1306.
Path 42 | total_timesteps 1340.
Path 43 | total_timesteps 1366.
Path 44 | total_timesteps 1398.
Path 45 | total_timesteps 1436.
Path 46 | total_timesteps 1460.
Path 47 | total_timesteps 1536.
Path 48 | total_timesteps 1569.
Path 49 | total_timesteps 1607.
Path 50 | total_timesteps 1651.
Path 51 | total_timesteps 1678.
Path 52 | total_timesteps 1730.
Path 53 | total_timesteps 1786.
Path 54 | total_timesteps 1794.
Path 55 | total_timesteps 1827.
Path 56 | total_timesteps 1851.
Path 57 | total_timesteps 1885.
Path 58 | total_timesteps 1961.
Path 59 | total_timesteps 1982.
Path 60 | total_timesteps 2007.
Path 61 | total_timesteps 2030.
Path 62 | total_timesteps 2091.
Path 63 | total_timesteps 2120.
Path 64 | total_timesteps 2204.
Path 65 | total_timesteps 2244.
Path 66 | total_timesteps 2282.
Path 67 | total_timesteps 2305.
Path 68 | total_timesteps 2351.
Path 69 | total_timesteps 2378.
Path 70 | total_timesteps 2406.
Path 71 | total_timesteps 2443.
Path 72 | total_timesteps 2465.
Path 73 | total_timesteps 2507.
Path 74 | total_timesteps 2590.
Path 75 | total_timesteps 2604.
Path 76 | total_timesteps 2635.
Path 77 | total_timesteps 2668.
Path 78 | total_timesteps 2714.
Path 79 | total_timesteps 2738.
Path 80 | total_timesteps 2769.
Path 81 | total_timesteps 2813.
Path 82 | total_timesteps 2854.
Path 83 | total_timesteps 2918.
Path 84 | total_timesteps 2954.
Path 85 | total_timesteps 2977.
Path 86 | total_timesteps 3002.
Path 87 | total_timesteps 3013.
Path 88 | total_timesteps 3055.
Path 89 | total_timesteps 3093.
Path 90 | total_timesteps 3139.
Path 91 | total_timesteps 3185.
Path 92 | total_timesteps 3223.
Path 93 | total_timesteps 3261.
Path 94 | total_timesteps 3293.
Path 95 | total_timesteps 3335.
Path 96 | total_timesteps 3351.
Path 97 | total_timesteps 3381.
Path 98 | total_timesteps 3395.
Path 99 | total_timesteps 3439.
Path 100 | total_timesteps 3473.
Path 101 | total_timesteps 3521.
Path 102 | total_timesteps 3563.
Path 103 | total_timesteps 3605.
Path 104 | total_timesteps 3630.
Path 105 | total_timesteps 3657.
Path 106 | total_timesteps 3693.
Path 107 | total_timesteps 3741.
Path 108 | total_timesteps 3775.
Path 109 | total_timesteps 3813.
Path 110 | total_timesteps 3835.
Path 111 | total_timesteps 3866.
Path 112 | total_timesteps 3886.
Path 113 | total_timesteps 3924.
Path 114 | total_timesteps 3934.
Path 115 | total_timesteps 3958.
Path 116 | total_timesteps 3991.
Path 117 | total_timesteps 4014.
Path 118 | total_timesteps 4040.
Path 119 | total_timesteps 4126.
Path 120 | total_timesteps 4159.
Path 121 | total_timesteps 4224.
Path 122 | total_timesteps 4291.
Path 123 | total_timesteps 4318.
Path 124 | total_timesteps 4345.
Path 125 | total_timesteps 4383.
Path 126 | total_timesteps 4420.
Path 127 | total_timesteps 4450.
Path 128 | total_timesteps 4465.
Path 129 | total_timesteps 4495.
Path 130 | total_timesteps 4531.
Path 131 | total_timesteps 4559.
Path 132 | total_timesteps 4614.
Path 133 | total_timesteps 4653.
Path 134 | total_timesteps 4687.
Path 135 | total_timesteps 4698.
Path 136 | total_timesteps 4723.
Path 137 | total_timesteps 4759.
Path 138 | total_timesteps 4775.
Path 139 | total_timesteps 4823.
Path 140 | total_timesteps 4856.
Path 141 | total_timesteps 4895.
Path 142 | total_timesteps 4922.
Path 143 | total_timesteps 4954.
Path 144 | total_timesteps 5000.
Path 145 | total_timesteps 5031.
Path 146 | total_timesteps 5061.
Path 147 | total_timesteps 5122.
Path 148 | total_timesteps 5160.
Path 149 | total_timesteps 5215.
Path 150 | total_timesteps 5255.
Path 151 | total_timesteps 5296.
Path 152 | total_timesteps 5369.
Path 153 | total_timesteps 5403.
Path 154 | total_timesteps 5440.
Path 155 | total_timesteps 5486.
Path 156 | total_timesteps 5516.
Path 157 | total_timesteps 5556.
Path 158 | total_timesteps 5578.
Path 159 | total_timesteps 5606.
Path 160 | total_timesteps 5640.
Path 161 | total_timesteps 5678.
Path 162 | total_timesteps 5701.
Path 163 | total_timesteps 5740.
Path 164 | total_timesteps 5795.
Path 165 | total_timesteps 5835.
Path 166 | total_timesteps 5860.
Path 167 | total_timesteps 5887.
Path 168 | total_timesteps 5930.
Path 169 | total_timesteps 5980.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.8    |
| Iteration     | 15       |
| MaximumReturn | -0.658   |
| MinimumReturn | -66.7    |
| TotalSamples  | 68280    |
----------------------------
itr #16 | 
Fitting dynamics.
Validation loss = 0.4049966037273407
Validation loss = 0.411114364862442
Validation loss = 0.4184621274471283
Validation loss = 0.41317319869995117
Validation loss = 0.411862850189209
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 30.
Path 2 | total_timesteps 64.
Path 3 | total_timesteps 89.
Path 4 | total_timesteps 131.
Path 5 | total_timesteps 169.
Path 6 | total_timesteps 202.
Path 7 | total_timesteps 232.
Path 8 | total_timesteps 288.
Path 9 | total_timesteps 330.
Path 10 | total_timesteps 369.
Path 11 | total_timesteps 405.
Path 12 | total_timesteps 439.
Path 13 | total_timesteps 470.
Path 14 | total_timesteps 503.
Path 15 | total_timesteps 526.
Path 16 | total_timesteps 582.
Path 17 | total_timesteps 613.
Path 18 | total_timesteps 678.
Path 19 | total_timesteps 715.
Path 20 | total_timesteps 761.
Path 21 | total_timesteps 788.
Path 22 | total_timesteps 824.
Path 23 | total_timesteps 860.
Path 24 | total_timesteps 911.
Path 25 | total_timesteps 934.
Path 26 | total_timesteps 961.
Path 27 | total_timesteps 985.
Path 28 | total_timesteps 1011.
Path 29 | total_timesteps 1040.
Path 30 | total_timesteps 1099.
Path 31 | total_timesteps 1146.
Path 32 | total_timesteps 1157.
Path 33 | total_timesteps 1198.
Path 34 | total_timesteps 1214.
Path 35 | total_timesteps 1241.
Path 36 | total_timesteps 1265.
Path 37 | total_timesteps 1298.
Path 38 | total_timesteps 1330.
Path 39 | total_timesteps 1354.
Path 40 | total_timesteps 1389.
Path 41 | total_timesteps 1414.
Path 42 | total_timesteps 1484.
Path 43 | total_timesteps 1515.
Path 44 | total_timesteps 1526.
Path 45 | total_timesteps 1565.
Path 46 | total_timesteps 1617.
Path 47 | total_timesteps 1654.
Path 48 | total_timesteps 1664.
Path 49 | total_timesteps 1709.
Path 50 | total_timesteps 1754.
Path 51 | total_timesteps 1780.
Path 52 | total_timesteps 1813.
Path 53 | total_timesteps 1886.
Path 54 | total_timesteps 1910.
Path 55 | total_timesteps 1941.
Path 56 | total_timesteps 1973.
Path 57 | total_timesteps 2013.
Path 58 | total_timesteps 2090.
Path 59 | total_timesteps 2138.
Path 60 | total_timesteps 2182.
Path 61 | total_timesteps 2214.
Path 62 | total_timesteps 2241.
Path 63 | total_timesteps 2290.
Path 64 | total_timesteps 2320.
Path 65 | total_timesteps 2338.
Path 66 | total_timesteps 2409.
Path 67 | total_timesteps 2435.
Path 68 | total_timesteps 2461.
Path 69 | total_timesteps 2495.
Path 70 | total_timesteps 2508.
Path 71 | total_timesteps 2535.
Path 72 | total_timesteps 2566.
Path 73 | total_timesteps 2606.
Path 74 | total_timesteps 2625.
Path 75 | total_timesteps 2660.
Path 76 | total_timesteps 2696.
Path 77 | total_timesteps 2720.
Path 78 | total_timesteps 2768.
Path 79 | total_timesteps 2781.
Path 80 | total_timesteps 2811.
Path 81 | total_timesteps 2840.
Path 82 | total_timesteps 2896.
Path 83 | total_timesteps 2940.
Path 84 | total_timesteps 2994.
Path 85 | total_timesteps 3047.
Path 86 | total_timesteps 3085.
Path 87 | total_timesteps 3121.
Path 88 | total_timesteps 3138.
Path 89 | total_timesteps 3160.
Path 90 | total_timesteps 3203.
Path 91 | total_timesteps 3233.
Path 92 | total_timesteps 3244.
Path 93 | total_timesteps 3283.
Path 94 | total_timesteps 3317.
Path 95 | total_timesteps 3352.
Path 96 | total_timesteps 3383.
Path 97 | total_timesteps 3415.
Path 98 | total_timesteps 3440.
Path 99 | total_timesteps 3470.
Path 100 | total_timesteps 3512.
Path 101 | total_timesteps 3561.
Path 102 | total_timesteps 3581.
Path 103 | total_timesteps 3601.
Path 104 | total_timesteps 3620.
Path 105 | total_timesteps 3646.
Path 106 | total_timesteps 3676.
Path 107 | total_timesteps 3693.
Path 108 | total_timesteps 3731.
Path 109 | total_timesteps 3775.
Path 110 | total_timesteps 3789.
Path 111 | total_timesteps 3806.
Path 112 | total_timesteps 3840.
Path 113 | total_timesteps 3867.
Path 114 | total_timesteps 3894.
Path 115 | total_timesteps 3926.
Path 116 | total_timesteps 3984.
Path 117 | total_timesteps 4037.
Path 118 | total_timesteps 4068.
Path 119 | total_timesteps 4100.
Path 120 | total_timesteps 4140.
Path 121 | total_timesteps 4168.
Path 122 | total_timesteps 4191.
Path 123 | total_timesteps 4217.
Path 124 | total_timesteps 4249.
Path 125 | total_timesteps 4287.
Path 126 | total_timesteps 4320.
Path 127 | total_timesteps 4334.
Path 128 | total_timesteps 4374.
Path 129 | total_timesteps 4411.
Path 130 | total_timesteps 4445.
Path 131 | total_timesteps 4518.
Path 132 | total_timesteps 4548.
Path 133 | total_timesteps 4582.
Path 134 | total_timesteps 4609.
Path 135 | total_timesteps 4628.
Path 136 | total_timesteps 4687.
Path 137 | total_timesteps 4720.
Path 138 | total_timesteps 4745.
Path 139 | total_timesteps 4762.
Path 140 | total_timesteps 4792.
Path 141 | total_timesteps 4851.
Path 142 | total_timesteps 4897.
Path 143 | total_timesteps 4927.
Path 144 | total_timesteps 4997.
Path 145 | total_timesteps 5067.
Path 146 | total_timesteps 5092.
Path 147 | total_timesteps 5138.
Path 148 | total_timesteps 5167.
Path 149 | total_timesteps 5198.
Path 150 | total_timesteps 5262.
Path 151 | total_timesteps 5314.
Path 152 | total_timesteps 5336.
Path 153 | total_timesteps 5386.
Path 154 | total_timesteps 5416.
Path 155 | total_timesteps 5434.
Path 156 | total_timesteps 5450.
Path 157 | total_timesteps 5498.
Path 158 | total_timesteps 5536.
Path 159 | total_timesteps 5578.
Path 160 | total_timesteps 5651.
Path 161 | total_timesteps 5698.
Path 162 | total_timesteps 5715.
Path 163 | total_timesteps 5757.
Path 164 | total_timesteps 5816.
Path 165 | total_timesteps 5852.
Path 166 | total_timesteps 5891.
Path 167 | total_timesteps 5965.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.5    |
| Iteration     | 16       |
| MaximumReturn | 3.22     |
| MinimumReturn | -40.7    |
| TotalSamples  | 72282    |
----------------------------
itr #17 | 
Fitting dynamics.
Validation loss = 0.4056686460971832
Validation loss = 0.41098299622535706
Validation loss = 0.415624737739563
Validation loss = 0.41478294134140015
Validation loss = 0.42067623138427734
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 33.
Path 2 | total_timesteps 70.
Path 3 | total_timesteps 97.
Path 4 | total_timesteps 133.
Path 5 | total_timesteps 180.
Path 6 | total_timesteps 229.
Path 7 | total_timesteps 260.
Path 8 | total_timesteps 281.
Path 9 | total_timesteps 309.
Path 10 | total_timesteps 356.
Path 11 | total_timesteps 411.
Path 12 | total_timesteps 433.
Path 13 | total_timesteps 449.
Path 14 | total_timesteps 477.
Path 15 | total_timesteps 511.
Path 16 | total_timesteps 571.
Path 17 | total_timesteps 605.
Path 18 | total_timesteps 648.
Path 19 | total_timesteps 666.
Path 20 | total_timesteps 705.
Path 21 | total_timesteps 728.
Path 22 | total_timesteps 769.
Path 23 | total_timesteps 794.
Path 24 | total_timesteps 827.
Path 25 | total_timesteps 866.
Path 26 | total_timesteps 915.
Path 27 | total_timesteps 949.
Path 28 | total_timesteps 982.
Path 29 | total_timesteps 1005.
Path 30 | total_timesteps 1040.
Path 31 | total_timesteps 1082.
Path 32 | total_timesteps 1118.
Path 33 | total_timesteps 1160.
Path 34 | total_timesteps 1213.
Path 35 | total_timesteps 1241.
Path 36 | total_timesteps 1257.
Path 37 | total_timesteps 1323.
Path 38 | total_timesteps 1344.
Path 39 | total_timesteps 1374.
Path 40 | total_timesteps 1393.
Path 41 | total_timesteps 1408.
Path 42 | total_timesteps 1445.
Path 43 | total_timesteps 1470.
Path 44 | total_timesteps 1514.
Path 45 | total_timesteps 1547.
Path 46 | total_timesteps 1570.
Path 47 | total_timesteps 1600.
Path 48 | total_timesteps 1624.
Path 49 | total_timesteps 1700.
Path 50 | total_timesteps 1742.
Path 51 | total_timesteps 1785.
Path 52 | total_timesteps 1806.
Path 53 | total_timesteps 1851.
Path 54 | total_timesteps 1862.
Path 55 | total_timesteps 1885.
Path 56 | total_timesteps 1913.
Path 57 | total_timesteps 1961.
Path 58 | total_timesteps 1998.
Path 59 | total_timesteps 2054.
Path 60 | total_timesteps 2085.
Path 61 | total_timesteps 2106.
Path 62 | total_timesteps 2155.
Path 63 | total_timesteps 2180.
Path 64 | total_timesteps 2218.
Path 65 | total_timesteps 2249.
Path 66 | total_timesteps 2280.
Path 67 | total_timesteps 2310.
Path 68 | total_timesteps 2349.
Path 69 | total_timesteps 2379.
Path 70 | total_timesteps 2420.
Path 71 | total_timesteps 2456.
Path 72 | total_timesteps 2486.
Path 73 | total_timesteps 2521.
Path 74 | total_timesteps 2583.
Path 75 | total_timesteps 2624.
Path 76 | total_timesteps 2668.
Path 77 | total_timesteps 2706.
Path 78 | total_timesteps 2728.
Path 79 | total_timesteps 2760.
Path 80 | total_timesteps 2790.
Path 81 | total_timesteps 2828.
Path 82 | total_timesteps 2898.
Path 83 | total_timesteps 2938.
Path 84 | total_timesteps 2954.
Path 85 | total_timesteps 3010.
Path 86 | total_timesteps 3053.
Path 87 | total_timesteps 3120.
Path 88 | total_timesteps 3156.
Path 89 | total_timesteps 3199.
Path 90 | total_timesteps 3210.
Path 91 | total_timesteps 3260.
Path 92 | total_timesteps 3274.
Path 93 | total_timesteps 3297.
Path 94 | total_timesteps 3326.
Path 95 | total_timesteps 3362.
Path 96 | total_timesteps 3409.
Path 97 | total_timesteps 3450.
Path 98 | total_timesteps 3490.
Path 99 | total_timesteps 3509.
Path 100 | total_timesteps 3539.
Path 101 | total_timesteps 3567.
Path 102 | total_timesteps 3599.
Path 103 | total_timesteps 3647.
Path 104 | total_timesteps 3698.
Path 105 | total_timesteps 3768.
Path 106 | total_timesteps 3802.
Path 107 | total_timesteps 3843.
Path 108 | total_timesteps 3893.
Path 109 | total_timesteps 3930.
Path 110 | total_timesteps 3971.
Path 111 | total_timesteps 3981.
Path 112 | total_timesteps 4017.
Path 113 | total_timesteps 4046.
Path 114 | total_timesteps 4088.
Path 115 | total_timesteps 4132.
Path 116 | total_timesteps 4184.
Path 117 | total_timesteps 4215.
Path 118 | total_timesteps 4257.
Path 119 | total_timesteps 4282.
Path 120 | total_timesteps 4322.
Path 121 | total_timesteps 4351.
Path 122 | total_timesteps 4395.
Path 123 | total_timesteps 4431.
Path 124 | total_timesteps 4464.
Path 125 | total_timesteps 4496.
Path 126 | total_timesteps 4546.
Path 127 | total_timesteps 4566.
Path 128 | total_timesteps 4604.
Path 129 | total_timesteps 4644.
Path 130 | total_timesteps 4684.
Path 131 | total_timesteps 4707.
Path 132 | total_timesteps 4727.
Path 133 | total_timesteps 4759.
Path 134 | total_timesteps 4799.
Path 135 | total_timesteps 4831.
Path 136 | total_timesteps 4891.
Path 137 | total_timesteps 4933.
Path 138 | total_timesteps 4958.
Path 139 | total_timesteps 5033.
Path 140 | total_timesteps 5071.
Path 141 | total_timesteps 5101.
Path 142 | total_timesteps 5140.
Path 143 | total_timesteps 5166.
Path 144 | total_timesteps 5198.
Path 145 | total_timesteps 5229.
Path 146 | total_timesteps 5250.
Path 147 | total_timesteps 5293.
Path 148 | total_timesteps 5330.
Path 149 | total_timesteps 5383.
Path 150 | total_timesteps 5419.
Path 151 | total_timesteps 5469.
Path 152 | total_timesteps 5522.
Path 153 | total_timesteps 5556.
Path 154 | total_timesteps 5592.
Path 155 | total_timesteps 5612.
Path 156 | total_timesteps 5647.
Path 157 | total_timesteps 5709.
Path 158 | total_timesteps 5739.
Path 159 | total_timesteps 5759.
Path 160 | total_timesteps 5791.
Path 161 | total_timesteps 5854.
Path 162 | total_timesteps 5905.
Path 163 | total_timesteps 5941.
Path 164 | total_timesteps 5986.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.2    |
| Iteration     | 17       |
| MaximumReturn | -0.1     |
| MinimumReturn | -42.7    |
| TotalSamples  | 76296    |
----------------------------
itr #18 | 
Fitting dynamics.
Validation loss = 0.40590277314186096
Validation loss = 0.4111784100532532
Validation loss = 0.4159974753856659
Validation loss = 0.4165419638156891
Validation loss = 0.4135226607322693
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 16.
Path 2 | total_timesteps 48.
Path 3 | total_timesteps 64.
Path 4 | total_timesteps 107.
Path 5 | total_timesteps 146.
Path 6 | total_timesteps 167.
Path 7 | total_timesteps 215.
Path 8 | total_timesteps 244.
Path 9 | total_timesteps 304.
Path 10 | total_timesteps 341.
Path 11 | total_timesteps 391.
Path 12 | total_timesteps 423.
Path 13 | total_timesteps 465.
Path 14 | total_timesteps 548.
Path 15 | total_timesteps 599.
Path 16 | total_timesteps 638.
Path 17 | total_timesteps 687.
Path 18 | total_timesteps 757.
Path 19 | total_timesteps 775.
Path 20 | total_timesteps 803.
Path 21 | total_timesteps 835.
Path 22 | total_timesteps 882.
Path 23 | total_timesteps 1018.
Path 24 | total_timesteps 1120.
Path 25 | total_timesteps 1155.
Path 26 | total_timesteps 1211.
Path 27 | total_timesteps 1250.
Path 28 | total_timesteps 1287.
Path 29 | total_timesteps 1328.
Path 30 | total_timesteps 1370.
Path 31 | total_timesteps 1435.
Path 32 | total_timesteps 1458.
Path 33 | total_timesteps 1486.
Path 34 | total_timesteps 1550.
Path 35 | total_timesteps 1588.
Path 36 | total_timesteps 1633.
Path 37 | total_timesteps 1666.
Path 38 | total_timesteps 1709.
Path 39 | total_timesteps 1746.
Path 40 | total_timesteps 1792.
Path 41 | total_timesteps 1823.
Path 42 | total_timesteps 1879.
Path 43 | total_timesteps 1911.
Path 44 | total_timesteps 1935.
Path 45 | total_timesteps 1977.
Path 46 | total_timesteps 2052.
Path 47 | total_timesteps 2119.
Path 48 | total_timesteps 2158.
Path 49 | total_timesteps 2191.
Path 50 | total_timesteps 2263.
Path 51 | total_timesteps 2284.
Path 52 | total_timesteps 2332.
Path 53 | total_timesteps 2355.
Path 54 | total_timesteps 2388.
Path 55 | total_timesteps 2421.
Path 56 | total_timesteps 2463.
Path 57 | total_timesteps 2503.
Path 58 | total_timesteps 2538.
Path 59 | total_timesteps 2568.
Path 60 | total_timesteps 2596.
Path 61 | total_timesteps 2660.
Path 62 | total_timesteps 2697.
Path 63 | total_timesteps 2727.
Path 64 | total_timesteps 2750.
Path 65 | total_timesteps 2769.
Path 66 | total_timesteps 2788.
Path 67 | total_timesteps 2824.
Path 68 | total_timesteps 2880.
Path 69 | total_timesteps 2943.
Path 70 | total_timesteps 2963.
Path 71 | total_timesteps 2999.
Path 72 | total_timesteps 3033.
Path 73 | total_timesteps 3092.
Path 74 | total_timesteps 3126.
Path 75 | total_timesteps 3151.
Path 76 | total_timesteps 3211.
Path 77 | total_timesteps 3239.
Path 78 | total_timesteps 3293.
Path 79 | total_timesteps 3351.
Path 80 | total_timesteps 3390.
Path 81 | total_timesteps 3443.
Path 82 | total_timesteps 3467.
Path 83 | total_timesteps 3522.
Path 84 | total_timesteps 3554.
Path 85 | total_timesteps 3568.
Path 86 | total_timesteps 3605.
Path 87 | total_timesteps 3647.
Path 88 | total_timesteps 3669.
Path 89 | total_timesteps 3696.
Path 90 | total_timesteps 3738.
Path 91 | total_timesteps 3764.
Path 92 | total_timesteps 3792.
Path 93 | total_timesteps 3824.
Path 94 | total_timesteps 3864.
Path 95 | total_timesteps 3916.
Path 96 | total_timesteps 3978.
Path 97 | total_timesteps 4021.
Path 98 | total_timesteps 4046.
Path 99 | total_timesteps 4108.
Path 100 | total_timesteps 4133.
Path 101 | total_timesteps 4166.
Path 102 | total_timesteps 4188.
Path 103 | total_timesteps 4209.
Path 104 | total_timesteps 4271.
Path 105 | total_timesteps 4305.
Path 106 | total_timesteps 4390.
Path 107 | total_timesteps 4420.
Path 108 | total_timesteps 4476.
Path 109 | total_timesteps 4520.
Path 110 | total_timesteps 4551.
Path 111 | total_timesteps 4579.
Path 112 | total_timesteps 4620.
Path 113 | total_timesteps 4666.
Path 114 | total_timesteps 4700.
Path 115 | total_timesteps 4733.
Path 116 | total_timesteps 4770.
Path 117 | total_timesteps 4796.
Path 118 | total_timesteps 4847.
Path 119 | total_timesteps 4872.
Path 120 | total_timesteps 4880.
Path 121 | total_timesteps 4903.
Path 122 | total_timesteps 4920.
Path 123 | total_timesteps 4943.
Path 124 | total_timesteps 4975.
Path 125 | total_timesteps 5055.
Path 126 | total_timesteps 5092.
Path 127 | total_timesteps 5176.
Path 128 | total_timesteps 5245.
Path 129 | total_timesteps 5276.
Path 130 | total_timesteps 5318.
Path 131 | total_timesteps 5377.
Path 132 | total_timesteps 5444.
Path 133 | total_timesteps 5488.
Path 134 | total_timesteps 5544.
Path 135 | total_timesteps 5586.
Path 136 | total_timesteps 5622.
Path 137 | total_timesteps 5673.
Path 138 | total_timesteps 5689.
Path 139 | total_timesteps 5726.
Path 140 | total_timesteps 5787.
Path 141 | total_timesteps 5830.
Path 142 | total_timesteps 5869.
Path 143 | total_timesteps 5910.
Path 144 | total_timesteps 5941.
Path 145 | total_timesteps 5977.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.9    |
| Iteration     | 18       |
| MaximumReturn | 19.1     |
| MinimumReturn | -58.8    |
| TotalSamples  | 80312    |
----------------------------
itr #19 | 
Fitting dynamics.
Validation loss = 0.40536394715309143
Validation loss = 0.40760746598243713
Validation loss = 0.41582226753234863
Validation loss = 0.41173839569091797
Validation loss = 0.4162727892398834
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 37.
Path 2 | total_timesteps 74.
Path 3 | total_timesteps 97.
Path 4 | total_timesteps 148.
Path 5 | total_timesteps 185.
Path 6 | total_timesteps 203.
Path 7 | total_timesteps 214.
Path 8 | total_timesteps 265.
Path 9 | total_timesteps 312.
Path 10 | total_timesteps 387.
Path 11 | total_timesteps 446.
Path 12 | total_timesteps 482.
Path 13 | total_timesteps 516.
Path 14 | total_timesteps 556.
Path 15 | total_timesteps 609.
Path 16 | total_timesteps 636.
Path 17 | total_timesteps 690.
Path 18 | total_timesteps 717.
Path 19 | total_timesteps 750.
Path 20 | total_timesteps 770.
Path 21 | total_timesteps 800.
Path 22 | total_timesteps 827.
Path 23 | total_timesteps 863.
Path 24 | total_timesteps 902.
Path 25 | total_timesteps 968.
Path 26 | total_timesteps 1028.
Path 27 | total_timesteps 1058.
Path 28 | total_timesteps 1109.
Path 29 | total_timesteps 1159.
Path 30 | total_timesteps 1192.
Path 31 | total_timesteps 1241.
Path 32 | total_timesteps 1297.
Path 33 | total_timesteps 1346.
Path 34 | total_timesteps 1378.
Path 35 | total_timesteps 1419.
Path 36 | total_timesteps 1462.
Path 37 | total_timesteps 1506.
Path 38 | total_timesteps 1530.
Path 39 | total_timesteps 1595.
Path 40 | total_timesteps 1618.
Path 41 | total_timesteps 1655.
Path 42 | total_timesteps 1673.
Path 43 | total_timesteps 1719.
Path 44 | total_timesteps 1763.
Path 45 | total_timesteps 1815.
Path 46 | total_timesteps 1864.
Path 47 | total_timesteps 1898.
Path 48 | total_timesteps 1935.
Path 49 | total_timesteps 1962.
Path 50 | total_timesteps 2006.
Path 51 | total_timesteps 2050.
Path 52 | total_timesteps 2077.
Path 53 | total_timesteps 2151.
Path 54 | total_timesteps 2191.
Path 55 | total_timesteps 2256.
Path 56 | total_timesteps 2296.
Path 57 | total_timesteps 2326.
Path 58 | total_timesteps 2369.
Path 59 | total_timesteps 2394.
Path 60 | total_timesteps 2452.
Path 61 | total_timesteps 2495.
Path 62 | total_timesteps 2535.
Path 63 | total_timesteps 2548.
Path 64 | total_timesteps 2575.
Path 65 | total_timesteps 2609.
Path 66 | total_timesteps 2646.
Path 67 | total_timesteps 2680.
Path 68 | total_timesteps 2751.
Path 69 | total_timesteps 2807.
Path 70 | total_timesteps 2848.
Path 71 | total_timesteps 2888.
Path 72 | total_timesteps 2910.
Path 73 | total_timesteps 2947.
Path 74 | total_timesteps 3006.
Path 75 | total_timesteps 3028.
Path 76 | total_timesteps 3070.
Path 77 | total_timesteps 3092.
Path 78 | total_timesteps 3140.
Path 79 | total_timesteps 3188.
Path 80 | total_timesteps 3238.
Path 81 | total_timesteps 3266.
Path 82 | total_timesteps 3276.
Path 83 | total_timesteps 3305.
Path 84 | total_timesteps 3352.
Path 85 | total_timesteps 3405.
Path 86 | total_timesteps 3439.
Path 87 | total_timesteps 3475.
Path 88 | total_timesteps 3504.
Path 89 | total_timesteps 3539.
Path 90 | total_timesteps 3562.
Path 91 | total_timesteps 3616.
Path 92 | total_timesteps 3645.
Path 93 | total_timesteps 3678.
Path 94 | total_timesteps 3705.
Path 95 | total_timesteps 3736.
Path 96 | total_timesteps 3764.
Path 97 | total_timesteps 3855.
Path 98 | total_timesteps 3891.
Path 99 | total_timesteps 3919.
Path 100 | total_timesteps 3939.
Path 101 | total_timesteps 3992.
Path 102 | total_timesteps 4054.
Path 103 | total_timesteps 4098.
Path 104 | total_timesteps 4121.
Path 105 | total_timesteps 4180.
Path 106 | total_timesteps 4214.
Path 107 | total_timesteps 4239.
Path 108 | total_timesteps 4274.
Path 109 | total_timesteps 4310.
Path 110 | total_timesteps 4348.
Path 111 | total_timesteps 4382.
Path 112 | total_timesteps 4464.
Path 113 | total_timesteps 4487.
Path 114 | total_timesteps 4522.
Path 115 | total_timesteps 4571.
Path 116 | total_timesteps 4615.
Path 117 | total_timesteps 4657.
Path 118 | total_timesteps 4709.
Path 119 | total_timesteps 4739.
Path 120 | total_timesteps 4774.
Path 121 | total_timesteps 4816.
Path 122 | total_timesteps 4855.
Path 123 | total_timesteps 4888.
Path 124 | total_timesteps 4930.
Path 125 | total_timesteps 4952.
Path 126 | total_timesteps 4971.
Path 127 | total_timesteps 5023.
Path 128 | total_timesteps 5061.
Path 129 | total_timesteps 5085.
Path 130 | total_timesteps 5102.
Path 131 | total_timesteps 5113.
Path 132 | total_timesteps 5146.
Path 133 | total_timesteps 5202.
Path 134 | total_timesteps 5248.
Path 135 | total_timesteps 5267.
Path 136 | total_timesteps 5334.
Path 137 | total_timesteps 5367.
Path 138 | total_timesteps 5420.
Path 139 | total_timesteps 5448.
Path 140 | total_timesteps 5480.
Path 141 | total_timesteps 5518.
Path 142 | total_timesteps 5573.
Path 143 | total_timesteps 5591.
Path 144 | total_timesteps 5629.
Path 145 | total_timesteps 5668.
Path 146 | total_timesteps 5695.
Path 147 | total_timesteps 5729.
Path 148 | total_timesteps 5789.
Path 149 | total_timesteps 5808.
Path 150 | total_timesteps 5857.
Path 151 | total_timesteps 5908.
Path 152 | total_timesteps 5931.
Path 153 | total_timesteps 5967.
Path 154 | total_timesteps 5992.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -18.4    |
| Iteration     | 19       |
| MaximumReturn | 5.54     |
| MinimumReturn | -42.3    |
| TotalSamples  | 84316    |
----------------------------
itr #20 | 
Fitting dynamics.
Validation loss = 0.40255293250083923
Validation loss = 0.40875425934791565
Validation loss = 0.4123891294002533
Validation loss = 0.41523662209510803
Validation loss = 0.4137870669364929
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 43.
Path 2 | total_timesteps 67.
Path 3 | total_timesteps 115.
Path 4 | total_timesteps 137.
Path 5 | total_timesteps 165.
Path 6 | total_timesteps 194.
Path 7 | total_timesteps 228.
Path 8 | total_timesteps 268.
Path 9 | total_timesteps 304.
Path 10 | total_timesteps 313.
Path 11 | total_timesteps 332.
Path 12 | total_timesteps 364.
Path 13 | total_timesteps 436.
Path 14 | total_timesteps 457.
Path 15 | total_timesteps 497.
Path 16 | total_timesteps 543.
Path 17 | total_timesteps 578.
Path 18 | total_timesteps 617.
Path 19 | total_timesteps 657.
Path 20 | total_timesteps 681.
Path 21 | total_timesteps 698.
Path 22 | total_timesteps 761.
Path 23 | total_timesteps 804.
Path 24 | total_timesteps 859.
Path 25 | total_timesteps 885.
Path 26 | total_timesteps 920.
Path 27 | total_timesteps 946.
Path 28 | total_timesteps 990.
Path 29 | total_timesteps 1019.
Path 30 | total_timesteps 1085.
Path 31 | total_timesteps 1188.
Path 32 | total_timesteps 1203.
Path 33 | total_timesteps 1233.
Path 34 | total_timesteps 1272.
Path 35 | total_timesteps 1292.
Path 36 | total_timesteps 1321.
Path 37 | total_timesteps 1351.
Path 38 | total_timesteps 1376.
Path 39 | total_timesteps 1401.
Path 40 | total_timesteps 1508.
Path 41 | total_timesteps 1550.
Path 42 | total_timesteps 1581.
Path 43 | total_timesteps 1649.
Path 44 | total_timesteps 1690.
Path 45 | total_timesteps 1719.
Path 46 | total_timesteps 1760.
Path 47 | total_timesteps 1804.
Path 48 | total_timesteps 1830.
Path 49 | total_timesteps 1882.
Path 50 | total_timesteps 1915.
Path 51 | total_timesteps 1967.
Path 52 | total_timesteps 2050.
Path 53 | total_timesteps 2115.
Path 54 | total_timesteps 2138.
Path 55 | total_timesteps 2164.
Path 56 | total_timesteps 2186.
Path 57 | total_timesteps 2213.
Path 58 | total_timesteps 2248.
Path 59 | total_timesteps 2280.
Path 60 | total_timesteps 2299.
Path 61 | total_timesteps 2330.
Path 62 | total_timesteps 2383.
Path 63 | total_timesteps 2422.
Path 64 | total_timesteps 2461.
Path 65 | total_timesteps 2485.
Path 66 | total_timesteps 2524.
Path 67 | total_timesteps 2551.
Path 68 | total_timesteps 2583.
Path 69 | total_timesteps 2598.
Path 70 | total_timesteps 2624.
Path 71 | total_timesteps 2661.
Path 72 | total_timesteps 2698.
Path 73 | total_timesteps 2723.
Path 74 | total_timesteps 2858.
Path 75 | total_timesteps 2892.
Path 76 | total_timesteps 2937.
Path 77 | total_timesteps 2951.
Path 78 | total_timesteps 3019.
Path 79 | total_timesteps 3044.
Path 80 | total_timesteps 3064.
Path 81 | total_timesteps 3103.
Path 82 | total_timesteps 3143.
Path 83 | total_timesteps 3214.
Path 84 | total_timesteps 3253.
Path 85 | total_timesteps 3292.
Path 86 | total_timesteps 3340.
Path 87 | total_timesteps 3380.
Path 88 | total_timesteps 3414.
Path 89 | total_timesteps 3438.
Path 90 | total_timesteps 3468.
Path 91 | total_timesteps 3533.
Path 92 | total_timesteps 3594.
Path 93 | total_timesteps 3639.
Path 94 | total_timesteps 3676.
Path 95 | total_timesteps 3729.
Path 96 | total_timesteps 3769.
Path 97 | total_timesteps 3795.
Path 98 | total_timesteps 3840.
Path 99 | total_timesteps 3893.
Path 100 | total_timesteps 3935.
Path 101 | total_timesteps 3968.
Path 102 | total_timesteps 4033.
Path 103 | total_timesteps 4065.
Path 104 | total_timesteps 4091.
Path 105 | total_timesteps 4126.
Path 106 | total_timesteps 4163.
Path 107 | total_timesteps 4210.
Path 108 | total_timesteps 4262.
Path 109 | total_timesteps 4304.
Path 110 | total_timesteps 4337.
Path 111 | total_timesteps 4393.
Path 112 | total_timesteps 4435.
Path 113 | total_timesteps 4483.
Path 114 | total_timesteps 4516.
Path 115 | total_timesteps 4547.
Path 116 | total_timesteps 4606.
Path 117 | total_timesteps 4636.
Path 118 | total_timesteps 4668.
Path 119 | total_timesteps 4696.
Path 120 | total_timesteps 4738.
Path 121 | total_timesteps 4774.
Path 122 | total_timesteps 4800.
Path 123 | total_timesteps 4823.
Path 124 | total_timesteps 4881.
Path 125 | total_timesteps 4912.
Path 126 | total_timesteps 4973.
Path 127 | total_timesteps 4987.
Path 128 | total_timesteps 5033.
Path 129 | total_timesteps 5077.
Path 130 | total_timesteps 5119.
Path 131 | total_timesteps 5151.
Path 132 | total_timesteps 5202.
Path 133 | total_timesteps 5250.
Path 134 | total_timesteps 5298.
Path 135 | total_timesteps 5347.
Path 136 | total_timesteps 5356.
Path 137 | total_timesteps 5390.
Path 138 | total_timesteps 5447.
Path 139 | total_timesteps 5488.
Path 140 | total_timesteps 5501.
Path 141 | total_timesteps 5562.
Path 142 | total_timesteps 5591.
Path 143 | total_timesteps 5621.
Path 144 | total_timesteps 5646.
Path 145 | total_timesteps 5672.
Path 146 | total_timesteps 5724.
Path 147 | total_timesteps 5751.
Path 148 | total_timesteps 5813.
Path 149 | total_timesteps 5838.
Path 150 | total_timesteps 5867.
Path 151 | total_timesteps 5887.
Path 152 | total_timesteps 5941.
Path 153 | total_timesteps 5955.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -17.6    |
| Iteration     | 20       |
| MaximumReturn | 34.8     |
| MinimumReturn | -58.8    |
| TotalSamples  | 88331    |
----------------------------
itr #21 | 
Fitting dynamics.
Validation loss = 0.4034612774848938
Validation loss = 0.4091714024543762
Validation loss = 0.4136687219142914
Validation loss = 0.41471242904663086
Validation loss = 0.41193294525146484
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 31.
Path 2 | total_timesteps 67.
Path 3 | total_timesteps 93.
Path 4 | total_timesteps 123.
Path 5 | total_timesteps 163.
Path 6 | total_timesteps 183.
Path 7 | total_timesteps 200.
Path 8 | total_timesteps 235.
Path 9 | total_timesteps 299.
Path 10 | total_timesteps 322.
Path 11 | total_timesteps 418.
Path 12 | total_timesteps 458.
Path 13 | total_timesteps 512.
Path 14 | total_timesteps 553.
Path 15 | total_timesteps 619.
Path 16 | total_timesteps 650.
Path 17 | total_timesteps 687.
Path 18 | total_timesteps 714.
Path 19 | total_timesteps 771.
Path 20 | total_timesteps 788.
Path 21 | total_timesteps 843.
Path 22 | total_timesteps 886.
Path 23 | total_timesteps 909.
Path 24 | total_timesteps 940.
Path 25 | total_timesteps 968.
Path 26 | total_timesteps 996.
Path 27 | total_timesteps 1044.
Path 28 | total_timesteps 1081.
Path 29 | total_timesteps 1138.
Path 30 | total_timesteps 1258.
Path 31 | total_timesteps 1306.
Path 32 | total_timesteps 1339.
Path 33 | total_timesteps 1367.
Path 34 | total_timesteps 1408.
Path 35 | total_timesteps 1462.
Path 36 | total_timesteps 1506.
Path 37 | total_timesteps 1531.
Path 38 | total_timesteps 1561.
Path 39 | total_timesteps 1592.
Path 40 | total_timesteps 1635.
Path 41 | total_timesteps 1660.
Path 42 | total_timesteps 1700.
Path 43 | total_timesteps 1724.
Path 44 | total_timesteps 1758.
Path 45 | total_timesteps 1782.
Path 46 | total_timesteps 1825.
Path 47 | total_timesteps 1873.
Path 48 | total_timesteps 1894.
Path 49 | total_timesteps 1925.
Path 50 | total_timesteps 1964.
Path 51 | total_timesteps 1991.
Path 52 | total_timesteps 2012.
Path 53 | total_timesteps 2082.
Path 54 | total_timesteps 2123.
Path 55 | total_timesteps 2149.
Path 56 | total_timesteps 2203.
Path 57 | total_timesteps 2244.
Path 58 | total_timesteps 2281.
Path 59 | total_timesteps 2331.
Path 60 | total_timesteps 2379.
Path 61 | total_timesteps 2401.
Path 62 | total_timesteps 2445.
Path 63 | total_timesteps 2504.
Path 64 | total_timesteps 2522.
Path 65 | total_timesteps 2538.
Path 66 | total_timesteps 2590.
Path 67 | total_timesteps 2627.
Path 68 | total_timesteps 2661.
Path 69 | total_timesteps 2691.
Path 70 | total_timesteps 2720.
Path 71 | total_timesteps 2761.
Path 72 | total_timesteps 2817.
Path 73 | total_timesteps 2869.
Path 74 | total_timesteps 2895.
Path 75 | total_timesteps 2907.
Path 76 | total_timesteps 2993.
Path 77 | total_timesteps 3028.
Path 78 | total_timesteps 3079.
Path 79 | total_timesteps 3104.
Path 80 | total_timesteps 3136.
Path 81 | total_timesteps 3169.
Path 82 | total_timesteps 3189.
Path 83 | total_timesteps 3229.
Path 84 | total_timesteps 3292.
Path 85 | total_timesteps 3372.
Path 86 | total_timesteps 3399.
Path 87 | total_timesteps 3414.
Path 88 | total_timesteps 3447.
Path 89 | total_timesteps 3478.
Path 90 | total_timesteps 3523.
Path 91 | total_timesteps 3569.
Path 92 | total_timesteps 3614.
Path 93 | total_timesteps 3652.
Path 94 | total_timesteps 3686.
Path 95 | total_timesteps 3714.
Path 96 | total_timesteps 3732.
Path 97 | total_timesteps 3861.
Path 98 | total_timesteps 3898.
Path 99 | total_timesteps 3929.
Path 100 | total_timesteps 3974.
Path 101 | total_timesteps 4006.
Path 102 | total_timesteps 4034.
Path 103 | total_timesteps 4069.
Path 104 | total_timesteps 4121.
Path 105 | total_timesteps 4145.
Path 106 | total_timesteps 4171.
Path 107 | total_timesteps 4214.
Path 108 | total_timesteps 4251.
Path 109 | total_timesteps 4285.
Path 110 | total_timesteps 4320.
Path 111 | total_timesteps 4343.
Path 112 | total_timesteps 4366.
Path 113 | total_timesteps 4398.
Path 114 | total_timesteps 4427.
Path 115 | total_timesteps 4471.
Path 116 | total_timesteps 4491.
Path 117 | total_timesteps 4553.
Path 118 | total_timesteps 4596.
Path 119 | total_timesteps 4631.
Path 120 | total_timesteps 4664.
Path 121 | total_timesteps 4683.
Path 122 | total_timesteps 4744.
Path 123 | total_timesteps 4772.
Path 124 | total_timesteps 4838.
Path 125 | total_timesteps 4881.
Path 126 | total_timesteps 4909.
Path 127 | total_timesteps 4945.
Path 128 | total_timesteps 4977.
Path 129 | total_timesteps 5032.
Path 130 | total_timesteps 5067.
Path 131 | total_timesteps 5097.
Path 132 | total_timesteps 5138.
Path 133 | total_timesteps 5182.
Path 134 | total_timesteps 5209.
Path 135 | total_timesteps 5242.
Path 136 | total_timesteps 5278.
Path 137 | total_timesteps 5338.
Path 138 | total_timesteps 5352.
Path 139 | total_timesteps 5420.
Path 140 | total_timesteps 5451.
Path 141 | total_timesteps 5464.
Path 142 | total_timesteps 5504.
Path 143 | total_timesteps 5540.
Path 144 | total_timesteps 5596.
Path 145 | total_timesteps 5641.
Path 146 | total_timesteps 5689.
Path 147 | total_timesteps 5730.
Path 148 | total_timesteps 5780.
Path 149 | total_timesteps 5846.
Path 150 | total_timesteps 5904.
Path 151 | total_timesteps 5934.
Path 152 | total_timesteps 5954.
Path 153 | total_timesteps 5994.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -17.7    |
| Iteration     | 21       |
| MaximumReturn | 21.5     |
| MinimumReturn | -54.2    |
| TotalSamples  | 92357    |
----------------------------
itr #22 | 
Fitting dynamics.
Validation loss = 0.4040486216545105
Validation loss = 0.40779247879981995
Validation loss = 0.4101993441581726
Validation loss = 0.41072869300842285
Validation loss = 0.4108525514602661
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 30.
Path 2 | total_timesteps 64.
Path 3 | total_timesteps 89.
Path 4 | total_timesteps 146.
Path 5 | total_timesteps 175.
Path 6 | total_timesteps 205.
Path 7 | total_timesteps 241.
Path 8 | total_timesteps 276.
Path 9 | total_timesteps 315.
Path 10 | total_timesteps 360.
Path 11 | total_timesteps 413.
Path 12 | total_timesteps 505.
Path 13 | total_timesteps 528.
Path 14 | total_timesteps 552.
Path 15 | total_timesteps 586.
Path 16 | total_timesteps 610.
Path 17 | total_timesteps 644.
Path 18 | total_timesteps 658.
Path 19 | total_timesteps 682.
Path 20 | total_timesteps 734.
Path 21 | total_timesteps 762.
Path 22 | total_timesteps 808.
Path 23 | total_timesteps 829.
Path 24 | total_timesteps 854.
Path 25 | total_timesteps 877.
Path 26 | total_timesteps 894.
Path 27 | total_timesteps 933.
Path 28 | total_timesteps 1013.
Path 29 | total_timesteps 1066.
Path 30 | total_timesteps 1099.
Path 31 | total_timesteps 1123.
Path 32 | total_timesteps 1157.
Path 33 | total_timesteps 1214.
Path 34 | total_timesteps 1234.
Path 35 | total_timesteps 1270.
Path 36 | total_timesteps 1312.
Path 37 | total_timesteps 1339.
Path 38 | total_timesteps 1358.
Path 39 | total_timesteps 1414.
Path 40 | total_timesteps 1458.
Path 41 | total_timesteps 1478.
Path 42 | total_timesteps 1511.
Path 43 | total_timesteps 1555.
Path 44 | total_timesteps 1580.
Path 45 | total_timesteps 1641.
Path 46 | total_timesteps 1683.
Path 47 | total_timesteps 1734.
Path 48 | total_timesteps 1779.
Path 49 | total_timesteps 1813.
Path 50 | total_timesteps 1855.
Path 51 | total_timesteps 1885.
Path 52 | total_timesteps 1922.
Path 53 | total_timesteps 1954.
Path 54 | total_timesteps 1988.
Path 55 | total_timesteps 2018.
Path 56 | total_timesteps 2059.
Path 57 | total_timesteps 2101.
Path 58 | total_timesteps 2153.
Path 59 | total_timesteps 2179.
Path 60 | total_timesteps 2218.
Path 61 | total_timesteps 2251.
Path 62 | total_timesteps 2276.
Path 63 | total_timesteps 2319.
Path 64 | total_timesteps 2349.
Path 65 | total_timesteps 2359.
Path 66 | total_timesteps 2393.
Path 67 | total_timesteps 2439.
Path 68 | total_timesteps 2514.
Path 69 | total_timesteps 2555.
Path 70 | total_timesteps 2602.
Path 71 | total_timesteps 2648.
Path 72 | total_timesteps 2707.
Path 73 | total_timesteps 2757.
Path 74 | total_timesteps 2789.
Path 75 | total_timesteps 2803.
Path 76 | total_timesteps 2845.
Path 77 | total_timesteps 2889.
Path 78 | total_timesteps 2922.
Path 79 | total_timesteps 2940.
Path 80 | total_timesteps 2963.
Path 81 | total_timesteps 2993.
Path 82 | total_timesteps 3028.
Path 83 | total_timesteps 3195.
Path 84 | total_timesteps 3233.
Path 85 | total_timesteps 3248.
Path 86 | total_timesteps 3272.
Path 87 | total_timesteps 3303.
Path 88 | total_timesteps 3347.
Path 89 | total_timesteps 3362.
Path 90 | total_timesteps 3395.
Path 91 | total_timesteps 3432.
Path 92 | total_timesteps 3463.
Path 93 | total_timesteps 3473.
Path 94 | total_timesteps 3525.
Path 95 | total_timesteps 3568.
Path 96 | total_timesteps 3620.
Path 97 | total_timesteps 3662.
Path 98 | total_timesteps 3687.
Path 99 | total_timesteps 3729.
Path 100 | total_timesteps 3785.
Path 101 | total_timesteps 3862.
Path 102 | total_timesteps 3885.
Path 103 | total_timesteps 3932.
Path 104 | total_timesteps 3987.
Path 105 | total_timesteps 4022.
Path 106 | total_timesteps 4061.
Path 107 | total_timesteps 4098.
Path 108 | total_timesteps 4126.
Path 109 | total_timesteps 4150.
Path 110 | total_timesteps 4186.
Path 111 | total_timesteps 4201.
Path 112 | total_timesteps 4226.
Path 113 | total_timesteps 4258.
Path 114 | total_timesteps 4301.
Path 115 | total_timesteps 4324.
Path 116 | total_timesteps 4443.
Path 117 | total_timesteps 4477.
Path 118 | total_timesteps 4522.
Path 119 | total_timesteps 4546.
Path 120 | total_timesteps 4567.
Path 121 | total_timesteps 4607.
Path 122 | total_timesteps 4632.
Path 123 | total_timesteps 4660.
Path 124 | total_timesteps 4706.
Path 125 | total_timesteps 4725.
Path 126 | total_timesteps 4760.
Path 127 | total_timesteps 4801.
Path 128 | total_timesteps 4832.
Path 129 | total_timesteps 4870.
Path 130 | total_timesteps 4913.
Path 131 | total_timesteps 4946.
Path 132 | total_timesteps 4967.
Path 133 | total_timesteps 4990.
Path 134 | total_timesteps 5001.
Path 135 | total_timesteps 5038.
Path 136 | total_timesteps 5078.
Path 137 | total_timesteps 5118.
Path 138 | total_timesteps 5157.
Path 139 | total_timesteps 5204.
Path 140 | total_timesteps 5248.
Path 141 | total_timesteps 5286.
Path 142 | total_timesteps 5301.
Path 143 | total_timesteps 5367.
Path 144 | total_timesteps 5396.
Path 145 | total_timesteps 5428.
Path 146 | total_timesteps 5458.
Path 147 | total_timesteps 5516.
Path 148 | total_timesteps 5533.
Path 149 | total_timesteps 5591.
Path 150 | total_timesteps 5639.
Path 151 | total_timesteps 5678.
Path 152 | total_timesteps 5702.
Path 153 | total_timesteps 5755.
Path 154 | total_timesteps 5807.
Path 155 | total_timesteps 5844.
Path 156 | total_timesteps 5863.
Path 157 | total_timesteps 5908.
Path 158 | total_timesteps 5936.
Path 159 | total_timesteps 5970.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -15.5    |
| Iteration     | 22       |
| MaximumReturn | 29.1     |
| MinimumReturn | -89.1    |
| TotalSamples  | 96370    |
----------------------------
itr #23 | 
Fitting dynamics.
Validation loss = 0.4005568027496338
Validation loss = 0.4078940451145172
Validation loss = 0.4093533754348755
Validation loss = 0.41025933623313904
Validation loss = 0.40978729724884033
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 45.
Path 2 | total_timesteps 76.
Path 3 | total_timesteps 139.
Path 4 | total_timesteps 167.
Path 5 | total_timesteps 208.
Path 6 | total_timesteps 259.
Path 7 | total_timesteps 275.
Path 8 | total_timesteps 305.
Path 9 | total_timesteps 335.
Path 10 | total_timesteps 360.
Path 11 | total_timesteps 381.
Path 12 | total_timesteps 453.
Path 13 | total_timesteps 483.
Path 14 | total_timesteps 504.
Path 15 | total_timesteps 550.
Path 16 | total_timesteps 579.
Path 17 | total_timesteps 620.
Path 18 | total_timesteps 659.
Path 19 | total_timesteps 701.
Path 20 | total_timesteps 738.
Path 21 | total_timesteps 790.
Path 22 | total_timesteps 822.
Path 23 | total_timesteps 850.
Path 24 | total_timesteps 895.
Path 25 | total_timesteps 953.
Path 26 | total_timesteps 981.
Path 27 | total_timesteps 1018.
Path 28 | total_timesteps 1075.
Path 29 | total_timesteps 1102.
Path 30 | total_timesteps 1158.
Path 31 | total_timesteps 1191.
Path 32 | total_timesteps 1246.
Path 33 | total_timesteps 1290.
Path 34 | total_timesteps 1335.
Path 35 | total_timesteps 1376.
Path 36 | total_timesteps 1404.
Path 37 | total_timesteps 1434.
Path 38 | total_timesteps 1482.
Path 39 | total_timesteps 1511.
Path 40 | total_timesteps 1533.
Path 41 | total_timesteps 1548.
Path 42 | total_timesteps 1574.
Path 43 | total_timesteps 1636.
Path 44 | total_timesteps 1695.
Path 45 | total_timesteps 1743.
Path 46 | total_timesteps 1796.
Path 47 | total_timesteps 1817.
Path 48 | total_timesteps 1869.
Path 49 | total_timesteps 1909.
Path 50 | total_timesteps 1943.
Path 51 | total_timesteps 1987.
Path 52 | total_timesteps 2017.
Path 53 | total_timesteps 2049.
Path 54 | total_timesteps 2075.
Path 55 | total_timesteps 2086.
Path 56 | total_timesteps 2116.
Path 57 | total_timesteps 2163.
Path 58 | total_timesteps 2182.
Path 59 | total_timesteps 2236.
Path 60 | total_timesteps 2300.
Path 61 | total_timesteps 2359.
Path 62 | total_timesteps 2398.
Path 63 | total_timesteps 2424.
Path 64 | total_timesteps 2452.
Path 65 | total_timesteps 2468.
Path 66 | total_timesteps 2516.
Path 67 | total_timesteps 2582.
Path 68 | total_timesteps 2625.
Path 69 | total_timesteps 2646.
Path 70 | total_timesteps 2689.
Path 71 | total_timesteps 2726.
Path 72 | total_timesteps 2750.
Path 73 | total_timesteps 2777.
Path 74 | total_timesteps 2819.
Path 75 | total_timesteps 2848.
Path 76 | total_timesteps 2876.
Path 77 | total_timesteps 2896.
Path 78 | total_timesteps 2940.
Path 79 | total_timesteps 2963.
Path 80 | total_timesteps 2987.
Path 81 | total_timesteps 3009.
Path 82 | total_timesteps 3058.
Path 83 | total_timesteps 3088.
Path 84 | total_timesteps 3115.
Path 85 | total_timesteps 3146.
Path 86 | total_timesteps 3198.
Path 87 | total_timesteps 3253.
Path 88 | total_timesteps 3280.
Path 89 | total_timesteps 3321.
Path 90 | total_timesteps 3375.
Path 91 | total_timesteps 3417.
Path 92 | total_timesteps 3438.
Path 93 | total_timesteps 3471.
Path 94 | total_timesteps 3494.
Path 95 | total_timesteps 3544.
Path 96 | total_timesteps 3583.
Path 97 | total_timesteps 3613.
Path 98 | total_timesteps 3633.
Path 99 | total_timesteps 3646.
Path 100 | total_timesteps 3661.
Path 101 | total_timesteps 3713.
Path 102 | total_timesteps 3734.
Path 103 | total_timesteps 3766.
Path 104 | total_timesteps 3789.
Path 105 | total_timesteps 3827.
Path 106 | total_timesteps 3865.
Path 107 | total_timesteps 3898.
Path 108 | total_timesteps 3920.
Path 109 | total_timesteps 3957.
Path 110 | total_timesteps 3978.
Path 111 | total_timesteps 4005.
Path 112 | total_timesteps 4056.
Path 113 | total_timesteps 4095.
Path 114 | total_timesteps 4156.
Path 115 | total_timesteps 4218.
Path 116 | total_timesteps 4257.
Path 117 | total_timesteps 4289.
Path 118 | total_timesteps 4316.
Path 119 | total_timesteps 4344.
Path 120 | total_timesteps 4376.
Path 121 | total_timesteps 4413.
Path 122 | total_timesteps 4445.
Path 123 | total_timesteps 4474.
Path 124 | total_timesteps 4506.
Path 125 | total_timesteps 4553.
Path 126 | total_timesteps 4598.
Path 127 | total_timesteps 4657.
Path 128 | total_timesteps 4709.
Path 129 | total_timesteps 4742.
Path 130 | total_timesteps 4787.
Path 131 | total_timesteps 4811.
Path 132 | total_timesteps 4853.
Path 133 | total_timesteps 4910.
Path 134 | total_timesteps 4941.
Path 135 | total_timesteps 4959.
Path 136 | total_timesteps 4994.
Path 137 | total_timesteps 5037.
Path 138 | total_timesteps 5064.
Path 139 | total_timesteps 5094.
Path 140 | total_timesteps 5125.
Path 141 | total_timesteps 5162.
Path 142 | total_timesteps 5243.
Path 143 | total_timesteps 5289.
Path 144 | total_timesteps 5326.
Path 145 | total_timesteps 5365.
Path 146 | total_timesteps 5395.
Path 147 | total_timesteps 5428.
Path 148 | total_timesteps 5454.
Path 149 | total_timesteps 5480.
Path 150 | total_timesteps 5516.
Path 151 | total_timesteps 5544.
Path 152 | total_timesteps 5584.
Path 153 | total_timesteps 5634.
Path 154 | total_timesteps 5688.
Path 155 | total_timesteps 5706.
Path 156 | total_timesteps 5743.
Path 157 | total_timesteps 5796.
Path 158 | total_timesteps 5843.
Path 159 | total_timesteps 5873.
Path 160 | total_timesteps 5915.
Path 161 | total_timesteps 5946.
Path 162 | total_timesteps 5966.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.4    |
| Iteration     | 23       |
| MaximumReturn | 6.89     |
| MinimumReturn | -35.2    |
| TotalSamples  | 100372   |
----------------------------
itr #24 | 
Fitting dynamics.
Validation loss = 0.39905959367752075
Validation loss = 0.4058806300163269
Validation loss = 0.4090697765350342
Validation loss = 0.40711456537246704
Validation loss = 0.40781840682029724
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 34.
Path 2 | total_timesteps 49.
Path 3 | total_timesteps 71.
Path 4 | total_timesteps 93.
Path 5 | total_timesteps 131.
Path 6 | total_timesteps 178.
Path 7 | total_timesteps 214.
Path 8 | total_timesteps 307.
Path 9 | total_timesteps 356.
Path 10 | total_timesteps 393.
Path 11 | total_timesteps 415.
Path 12 | total_timesteps 558.
Path 13 | total_timesteps 586.
Path 14 | total_timesteps 629.
Path 15 | total_timesteps 684.
Path 16 | total_timesteps 766.
Path 17 | total_timesteps 811.
Path 18 | total_timesteps 846.
Path 19 | total_timesteps 877.
Path 20 | total_timesteps 909.
Path 21 | total_timesteps 943.
Path 22 | total_timesteps 983.
Path 23 | total_timesteps 1043.
Path 24 | total_timesteps 1073.
Path 25 | total_timesteps 1099.
Path 26 | total_timesteps 1135.
Path 27 | total_timesteps 1174.
Path 28 | total_timesteps 1223.
Path 29 | total_timesteps 1276.
Path 30 | total_timesteps 1308.
Path 31 | total_timesteps 1354.
Path 32 | total_timesteps 1431.
Path 33 | total_timesteps 1463.
Path 34 | total_timesteps 1499.
Path 35 | total_timesteps 1548.
Path 36 | total_timesteps 1575.
Path 37 | total_timesteps 1592.
Path 38 | total_timesteps 1648.
Path 39 | total_timesteps 1697.
Path 40 | total_timesteps 1738.
Path 41 | total_timesteps 1756.
Path 42 | total_timesteps 1790.
Path 43 | total_timesteps 1830.
Path 44 | total_timesteps 1861.
Path 45 | total_timesteps 1893.
Path 46 | total_timesteps 1963.
Path 47 | total_timesteps 2024.
Path 48 | total_timesteps 2089.
Path 49 | total_timesteps 2150.
Path 50 | total_timesteps 2187.
Path 51 | total_timesteps 2249.
Path 52 | total_timesteps 2277.
Path 53 | total_timesteps 2310.
Path 54 | total_timesteps 2378.
Path 55 | total_timesteps 2416.
Path 56 | total_timesteps 2456.
Path 57 | total_timesteps 2497.
Path 58 | total_timesteps 2517.
Path 59 | total_timesteps 2552.
Path 60 | total_timesteps 2600.
Path 61 | total_timesteps 2649.
Path 62 | total_timesteps 2676.
Path 63 | total_timesteps 2702.
Path 64 | total_timesteps 2727.
Path 65 | total_timesteps 2780.
Path 66 | total_timesteps 2794.
Path 67 | total_timesteps 2893.
Path 68 | total_timesteps 2960.
Path 69 | total_timesteps 3006.
Path 70 | total_timesteps 3036.
Path 71 | total_timesteps 3057.
Path 72 | total_timesteps 3097.
Path 73 | total_timesteps 3152.
Path 74 | total_timesteps 3177.
Path 75 | total_timesteps 3212.
Path 76 | total_timesteps 3246.
Path 77 | total_timesteps 3296.
Path 78 | total_timesteps 3326.
Path 79 | total_timesteps 3384.
Path 80 | total_timesteps 3424.
Path 81 | total_timesteps 3465.
Path 82 | total_timesteps 3517.
Path 83 | total_timesteps 3552.
Path 84 | total_timesteps 3609.
Path 85 | total_timesteps 3628.
Path 86 | total_timesteps 3655.
Path 87 | total_timesteps 3676.
Path 88 | total_timesteps 3713.
Path 89 | total_timesteps 3789.
Path 90 | total_timesteps 3834.
Path 91 | total_timesteps 3894.
Path 92 | total_timesteps 3934.
Path 93 | total_timesteps 3993.
Path 94 | total_timesteps 4040.
Path 95 | total_timesteps 4086.
Path 96 | total_timesteps 4133.
Path 97 | total_timesteps 4164.
Path 98 | total_timesteps 4188.
Path 99 | total_timesteps 4241.
Path 100 | total_timesteps 4288.
Path 101 | total_timesteps 4336.
Path 102 | total_timesteps 4382.
Path 103 | total_timesteps 4425.
Path 104 | total_timesteps 4462.
Path 105 | total_timesteps 4501.
Path 106 | total_timesteps 4574.
Path 107 | total_timesteps 4613.
Path 108 | total_timesteps 4633.
Path 109 | total_timesteps 4676.
Path 110 | total_timesteps 4722.
Path 111 | total_timesteps 4776.
Path 112 | total_timesteps 4855.
Path 113 | total_timesteps 4897.
Path 114 | total_timesteps 4952.
Path 115 | total_timesteps 4993.
Path 116 | total_timesteps 5033.
Path 117 | total_timesteps 5083.
Path 118 | total_timesteps 5139.
Path 119 | total_timesteps 5172.
Path 120 | total_timesteps 5197.
Path 121 | total_timesteps 5271.
Path 122 | total_timesteps 5305.
Path 123 | total_timesteps 5332.
Path 124 | total_timesteps 5365.
Path 125 | total_timesteps 5383.
Path 126 | total_timesteps 5422.
Path 127 | total_timesteps 5445.
Path 128 | total_timesteps 5505.
Path 129 | total_timesteps 5543.
Path 130 | total_timesteps 5588.
Path 131 | total_timesteps 5627.
Path 132 | total_timesteps 5659.
Path 133 | total_timesteps 5705.
Path 134 | total_timesteps 5752.
Path 135 | total_timesteps 5803.
Path 136 | total_timesteps 5840.
Path 137 | total_timesteps 5869.
Path 138 | total_timesteps 5907.
Path 139 | total_timesteps 5941.
Path 140 | total_timesteps 5955.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -18.5    |
| Iteration     | 24       |
| MaximumReturn | 123      |
| MinimumReturn | -54.8    |
| TotalSamples  | 104385   |
----------------------------
itr #25 | 
Fitting dynamics.
Validation loss = 0.40245959162712097
Validation loss = 0.40411967039108276
Validation loss = 0.40614283084869385
Validation loss = 0.4108336269855499
Validation loss = 0.4084851145744324
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 66.
Path 2 | total_timesteps 96.
Path 3 | total_timesteps 154.
Path 4 | total_timesteps 224.
Path 5 | total_timesteps 257.
Path 6 | total_timesteps 301.
Path 7 | total_timesteps 351.
Path 8 | total_timesteps 378.
Path 9 | total_timesteps 398.
Path 10 | total_timesteps 436.
Path 11 | total_timesteps 506.
Path 12 | total_timesteps 554.
Path 13 | total_timesteps 595.
Path 14 | total_timesteps 645.
Path 15 | total_timesteps 695.
Path 16 | total_timesteps 767.
Path 17 | total_timesteps 811.
Path 18 | total_timesteps 840.
Path 19 | total_timesteps 865.
Path 20 | total_timesteps 890.
Path 21 | total_timesteps 939.
Path 22 | total_timesteps 970.
Path 23 | total_timesteps 1010.
Path 24 | total_timesteps 1045.
Path 25 | total_timesteps 1076.
Path 26 | total_timesteps 1103.
Path 27 | total_timesteps 1135.
Path 28 | total_timesteps 1180.
Path 29 | total_timesteps 1216.
Path 30 | total_timesteps 1272.
Path 31 | total_timesteps 1333.
Path 32 | total_timesteps 1370.
Path 33 | total_timesteps 1396.
Path 34 | total_timesteps 1420.
Path 35 | total_timesteps 1441.
Path 36 | total_timesteps 1466.
Path 37 | total_timesteps 1518.
Path 38 | total_timesteps 1552.
Path 39 | total_timesteps 1598.
Path 40 | total_timesteps 1656.
Path 41 | total_timesteps 1700.
Path 42 | total_timesteps 1723.
Path 43 | total_timesteps 1752.
Path 44 | total_timesteps 1786.
Path 45 | total_timesteps 1822.
Path 46 | total_timesteps 1850.
Path 47 | total_timesteps 1881.
Path 48 | total_timesteps 1940.
Path 49 | total_timesteps 1959.
Path 50 | total_timesteps 1987.
Path 51 | total_timesteps 2020.
Path 52 | total_timesteps 2052.
Path 53 | total_timesteps 2092.
Path 54 | total_timesteps 2135.
Path 55 | total_timesteps 2174.
Path 56 | total_timesteps 2259.
Path 57 | total_timesteps 2294.
Path 58 | total_timesteps 2326.
Path 59 | total_timesteps 2388.
Path 60 | total_timesteps 2415.
Path 61 | total_timesteps 2438.
Path 62 | total_timesteps 2480.
Path 63 | total_timesteps 2512.
Path 64 | total_timesteps 2543.
Path 65 | total_timesteps 2575.
Path 66 | total_timesteps 2613.
Path 67 | total_timesteps 2647.
Path 68 | total_timesteps 2672.
Path 69 | total_timesteps 2722.
Path 70 | total_timesteps 2759.
Path 71 | total_timesteps 2799.
Path 72 | total_timesteps 2829.
Path 73 | total_timesteps 2856.
Path 74 | total_timesteps 2887.
Path 75 | total_timesteps 2968.
Path 76 | total_timesteps 3003.
Path 77 | total_timesteps 3031.
Path 78 | total_timesteps 3084.
Path 79 | total_timesteps 3128.
Path 80 | total_timesteps 3169.
Path 81 | total_timesteps 3204.
Path 82 | total_timesteps 3250.
Path 83 | total_timesteps 3276.
Path 84 | total_timesteps 3308.
Path 85 | total_timesteps 3347.
Path 86 | total_timesteps 3384.
Path 87 | total_timesteps 3432.
Path 88 | total_timesteps 3459.
Path 89 | total_timesteps 3490.
Path 90 | total_timesteps 3541.
Path 91 | total_timesteps 3571.
Path 92 | total_timesteps 3596.
Path 93 | total_timesteps 3629.
Path 94 | total_timesteps 3663.
Path 95 | total_timesteps 3689.
Path 96 | total_timesteps 3707.
Path 97 | total_timesteps 3744.
Path 98 | total_timesteps 3769.
Path 99 | total_timesteps 3790.
Path 100 | total_timesteps 3818.
Path 101 | total_timesteps 3891.
Path 102 | total_timesteps 3909.
Path 103 | total_timesteps 3978.
Path 104 | total_timesteps 4013.
Path 105 | total_timesteps 4088.
Path 106 | total_timesteps 4129.
Path 107 | total_timesteps 4199.
Path 108 | total_timesteps 4241.
Path 109 | total_timesteps 4288.
Path 110 | total_timesteps 4340.
Path 111 | total_timesteps 4393.
Path 112 | total_timesteps 4455.
Path 113 | total_timesteps 4485.
Path 114 | total_timesteps 4509.
Path 115 | total_timesteps 4547.
Path 116 | total_timesteps 4588.
Path 117 | total_timesteps 4613.
Path 118 | total_timesteps 4643.
Path 119 | total_timesteps 4686.
Path 120 | total_timesteps 4711.
Path 121 | total_timesteps 4759.
Path 122 | total_timesteps 4793.
Path 123 | total_timesteps 4817.
Path 124 | total_timesteps 4842.
Path 125 | total_timesteps 4874.
Path 126 | total_timesteps 4901.
Path 127 | total_timesteps 4927.
Path 128 | total_timesteps 4969.
Path 129 | total_timesteps 5003.
Path 130 | total_timesteps 5044.
Path 131 | total_timesteps 5081.
Path 132 | total_timesteps 5104.
Path 133 | total_timesteps 5140.
Path 134 | total_timesteps 5183.
Path 135 | total_timesteps 5282.
Path 136 | total_timesteps 5338.
Path 137 | total_timesteps 5368.
Path 138 | total_timesteps 5413.
Path 139 | total_timesteps 5439.
Path 140 | total_timesteps 5499.
Path 141 | total_timesteps 5546.
Path 142 | total_timesteps 5579.
Path 143 | total_timesteps 5623.
Path 144 | total_timesteps 5667.
Path 145 | total_timesteps 5723.
Path 146 | total_timesteps 5757.
Path 147 | total_timesteps 5789.
Path 148 | total_timesteps 5823.
Path 149 | total_timesteps 5865.
Path 150 | total_timesteps 5896.
Path 151 | total_timesteps 5944.
Path 152 | total_timesteps 5991.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.8    |
| Iteration     | 25       |
| MaximumReturn | 13.9     |
| MinimumReturn | -36.6    |
| TotalSamples  | 108399   |
----------------------------
itr #26 | 
Fitting dynamics.
Validation loss = 0.4031449854373932
Validation loss = 0.40304648876190186
Validation loss = 0.4050617814064026
Validation loss = 0.40853747725486755
Validation loss = 0.40544021129608154
Validation loss = 0.4062516391277313
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 50.
Path 2 | total_timesteps 103.
Path 3 | total_timesteps 137.
Path 4 | total_timesteps 170.
Path 5 | total_timesteps 211.
Path 6 | total_timesteps 316.
Path 7 | total_timesteps 354.
Path 8 | total_timesteps 370.
Path 9 | total_timesteps 417.
Path 10 | total_timesteps 486.
Path 11 | total_timesteps 511.
Path 12 | total_timesteps 546.
Path 13 | total_timesteps 576.
Path 14 | total_timesteps 624.
Path 15 | total_timesteps 658.
Path 16 | total_timesteps 686.
Path 17 | total_timesteps 719.
Path 18 | total_timesteps 817.
Path 19 | total_timesteps 840.
Path 20 | total_timesteps 914.
Path 21 | total_timesteps 947.
Path 22 | total_timesteps 995.
Path 23 | total_timesteps 1042.
Path 24 | total_timesteps 1083.
Path 25 | total_timesteps 1098.
Path 26 | total_timesteps 1141.
Path 27 | total_timesteps 1201.
Path 28 | total_timesteps 1232.
Path 29 | total_timesteps 1297.
Path 30 | total_timesteps 1343.
Path 31 | total_timesteps 1367.
Path 32 | total_timesteps 1396.
Path 33 | total_timesteps 1435.
Path 34 | total_timesteps 1472.
Path 35 | total_timesteps 1508.
Path 36 | total_timesteps 1591.
Path 37 | total_timesteps 1626.
Path 38 | total_timesteps 1658.
Path 39 | total_timesteps 1690.
Path 40 | total_timesteps 1744.
Path 41 | total_timesteps 1776.
Path 42 | total_timesteps 1810.
Path 43 | total_timesteps 1841.
Path 44 | total_timesteps 1881.
Path 45 | total_timesteps 2034.
Path 46 | total_timesteps 2083.
Path 47 | total_timesteps 2127.
Path 48 | total_timesteps 2170.
Path 49 | total_timesteps 2209.
Path 50 | total_timesteps 2231.
Path 51 | total_timesteps 2262.
Path 52 | total_timesteps 2327.
Path 53 | total_timesteps 2348.
Path 54 | total_timesteps 2383.
Path 55 | total_timesteps 2408.
Path 56 | total_timesteps 2460.
Path 57 | total_timesteps 2491.
Path 58 | total_timesteps 2578.
Path 59 | total_timesteps 2609.
Path 60 | total_timesteps 2661.
Path 61 | total_timesteps 2694.
Path 62 | total_timesteps 2742.
Path 63 | total_timesteps 2790.
Path 64 | total_timesteps 2898.
Path 65 | total_timesteps 2940.
Path 66 | total_timesteps 2958.
Path 67 | total_timesteps 2992.
Path 68 | total_timesteps 3016.
Path 69 | total_timesteps 3049.
Path 70 | total_timesteps 3081.
Path 71 | total_timesteps 3113.
Path 72 | total_timesteps 3146.
Path 73 | total_timesteps 3185.
Path 74 | total_timesteps 3211.
Path 75 | total_timesteps 3252.
Path 76 | total_timesteps 3268.
Path 77 | total_timesteps 3286.
Path 78 | total_timesteps 3314.
Path 79 | total_timesteps 3352.
Path 80 | total_timesteps 3395.
Path 81 | total_timesteps 3454.
Path 82 | total_timesteps 3513.
Path 83 | total_timesteps 3555.
Path 84 | total_timesteps 3606.
Path 85 | total_timesteps 3665.
Path 86 | total_timesteps 3715.
Path 87 | total_timesteps 3751.
Path 88 | total_timesteps 3822.
Path 89 | total_timesteps 3857.
Path 90 | total_timesteps 3879.
Path 91 | total_timesteps 3914.
Path 92 | total_timesteps 3940.
Path 93 | total_timesteps 4027.
Path 94 | total_timesteps 4065.
Path 95 | total_timesteps 4126.
Path 96 | total_timesteps 4187.
Path 97 | total_timesteps 4213.
Path 98 | total_timesteps 4315.
Path 99 | total_timesteps 4349.
Path 100 | total_timesteps 4368.
Path 101 | total_timesteps 4398.
Path 102 | total_timesteps 4475.
Path 103 | total_timesteps 4500.
Path 104 | total_timesteps 4559.
Path 105 | total_timesteps 4591.
Path 106 | total_timesteps 4641.
Path 107 | total_timesteps 4702.
Path 108 | total_timesteps 4727.
Path 109 | total_timesteps 4749.
Path 110 | total_timesteps 4768.
Path 111 | total_timesteps 4815.
Path 112 | total_timesteps 4863.
Path 113 | total_timesteps 4887.
Path 114 | total_timesteps 4917.
Path 115 | total_timesteps 4943.
Path 116 | total_timesteps 4987.
Path 117 | total_timesteps 5018.
Path 118 | total_timesteps 5078.
Path 119 | total_timesteps 5117.
Path 120 | total_timesteps 5148.
Path 121 | total_timesteps 5198.
Path 122 | total_timesteps 5228.
Path 123 | total_timesteps 5264.
Path 124 | total_timesteps 5292.
Path 125 | total_timesteps 5348.
Path 126 | total_timesteps 5372.
Path 127 | total_timesteps 5410.
Path 128 | total_timesteps 5450.
Path 129 | total_timesteps 5474.
Path 130 | total_timesteps 5520.
Path 131 | total_timesteps 5566.
Path 132 | total_timesteps 5616.
Path 133 | total_timesteps 5645.
Path 134 | total_timesteps 5698.
Path 135 | total_timesteps 5717.
Path 136 | total_timesteps 5765.
Path 137 | total_timesteps 5806.
Path 138 | total_timesteps 5853.
Path 139 | total_timesteps 5900.
Path 140 | total_timesteps 5925.
Path 141 | total_timesteps 5971.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -19.6    |
| Iteration     | 26       |
| MaximumReturn | -0.245   |
| MinimumReturn | -63      |
| TotalSamples  | 112403   |
----------------------------
itr #27 | 
Fitting dynamics.
Validation loss = 0.4000702202320099
Validation loss = 0.40510469675064087
Validation loss = 0.40732187032699585
Validation loss = 0.40599942207336426
Validation loss = 0.4049687683582306
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 27.
Path 2 | total_timesteps 78.
Path 3 | total_timesteps 123.
Path 4 | total_timesteps 153.
Path 5 | total_timesteps 192.
Path 6 | total_timesteps 230.
Path 7 | total_timesteps 300.
Path 8 | total_timesteps 325.
Path 9 | total_timesteps 370.
Path 10 | total_timesteps 431.
Path 11 | total_timesteps 474.
Path 12 | total_timesteps 525.
Path 13 | total_timesteps 558.
Path 14 | total_timesteps 596.
Path 15 | total_timesteps 619.
Path 16 | total_timesteps 641.
Path 17 | total_timesteps 672.
Path 18 | total_timesteps 712.
Path 19 | total_timesteps 749.
Path 20 | total_timesteps 785.
Path 21 | total_timesteps 815.
Path 22 | total_timesteps 847.
Path 23 | total_timesteps 913.
Path 24 | total_timesteps 965.
Path 25 | total_timesteps 1025.
Path 26 | total_timesteps 1077.
Path 27 | total_timesteps 1124.
Path 28 | total_timesteps 1150.
Path 29 | total_timesteps 1188.
Path 30 | total_timesteps 1224.
Path 31 | total_timesteps 1314.
Path 32 | total_timesteps 1345.
Path 33 | total_timesteps 1402.
Path 34 | total_timesteps 1444.
Path 35 | total_timesteps 1485.
Path 36 | total_timesteps 1514.
Path 37 | total_timesteps 1546.
Path 38 | total_timesteps 1561.
Path 39 | total_timesteps 1591.
Path 40 | total_timesteps 1630.
Path 41 | total_timesteps 1675.
Path 42 | total_timesteps 1713.
Path 43 | total_timesteps 1747.
Path 44 | total_timesteps 1775.
Path 45 | total_timesteps 1837.
Path 46 | total_timesteps 1873.
Path 47 | total_timesteps 1937.
Path 48 | total_timesteps 1985.
Path 49 | total_timesteps 2013.
Path 50 | total_timesteps 2054.
Path 51 | total_timesteps 2102.
Path 52 | total_timesteps 2155.
Path 53 | total_timesteps 2245.
Path 54 | total_timesteps 2273.
Path 55 | total_timesteps 2308.
Path 56 | total_timesteps 2355.
Path 57 | total_timesteps 2386.
Path 58 | total_timesteps 2435.
Path 59 | total_timesteps 2468.
Path 60 | total_timesteps 2498.
Path 61 | total_timesteps 2525.
Path 62 | total_timesteps 2570.
Path 63 | total_timesteps 2616.
Path 64 | total_timesteps 2645.
Path 65 | total_timesteps 2674.
Path 66 | total_timesteps 2701.
Path 67 | total_timesteps 2748.
Path 68 | total_timesteps 2778.
Path 69 | total_timesteps 2804.
Path 70 | total_timesteps 2824.
Path 71 | total_timesteps 2853.
Path 72 | total_timesteps 2885.
Path 73 | total_timesteps 2936.
Path 74 | total_timesteps 2978.
Path 75 | total_timesteps 3030.
Path 76 | total_timesteps 3073.
Path 77 | total_timesteps 3112.
Path 78 | total_timesteps 3152.
Path 79 | total_timesteps 3223.
Path 80 | total_timesteps 3255.
Path 81 | total_timesteps 3283.
Path 82 | total_timesteps 3320.
Path 83 | total_timesteps 3390.
Path 84 | total_timesteps 3455.
Path 85 | total_timesteps 3498.
Path 86 | total_timesteps 3564.
Path 87 | total_timesteps 3610.
Path 88 | total_timesteps 3653.
Path 89 | total_timesteps 3721.
Path 90 | total_timesteps 3760.
Path 91 | total_timesteps 3803.
Path 92 | total_timesteps 3849.
Path 93 | total_timesteps 3901.
Path 94 | total_timesteps 3930.
Path 95 | total_timesteps 3987.
Path 96 | total_timesteps 4042.
Path 97 | total_timesteps 4103.
Path 98 | total_timesteps 4158.
Path 99 | total_timesteps 4197.
Path 100 | total_timesteps 4229.
Path 101 | total_timesteps 4269.
Path 102 | total_timesteps 4314.
Path 103 | total_timesteps 4383.
Path 104 | total_timesteps 4423.
Path 105 | total_timesteps 4464.
Path 106 | total_timesteps 4530.
Path 107 | total_timesteps 4563.
Path 108 | total_timesteps 4601.
Path 109 | total_timesteps 4629.
Path 110 | total_timesteps 4690.
Path 111 | total_timesteps 4727.
Path 112 | total_timesteps 4763.
Path 113 | total_timesteps 4787.
Path 114 | total_timesteps 4819.
Path 115 | total_timesteps 4861.
Path 116 | total_timesteps 4924.
Path 117 | total_timesteps 4959.
Path 118 | total_timesteps 5005.
Path 119 | total_timesteps 5080.
Path 120 | total_timesteps 5124.
Path 121 | total_timesteps 5155.
Path 122 | total_timesteps 5182.
Path 123 | total_timesteps 5215.
Path 124 | total_timesteps 5279.
Path 125 | total_timesteps 5303.
Path 126 | total_timesteps 5353.
Path 127 | total_timesteps 5398.
Path 128 | total_timesteps 5441.
Path 129 | total_timesteps 5457.
Path 130 | total_timesteps 5526.
Path 131 | total_timesteps 5572.
Path 132 | total_timesteps 5604.
Path 133 | total_timesteps 5630.
Path 134 | total_timesteps 5658.
Path 135 | total_timesteps 5729.
Path 136 | total_timesteps 5799.
Path 137 | total_timesteps 5848.
Path 138 | total_timesteps 5898.
Path 139 | total_timesteps 5930.
Path 140 | total_timesteps 5975.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -20      |
| Iteration     | 27       |
| MaximumReturn | 65.6     |
| MinimumReturn | -52.6    |
| TotalSamples  | 116419   |
----------------------------
itr #28 | 
Fitting dynamics.
Validation loss = 0.4045880436897278
Validation loss = 0.40626925230026245
Validation loss = 0.4056141674518585
Validation loss = 0.40574973821640015
Validation loss = 0.4060932993888855
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 35.
Path 2 | total_timesteps 75.
Path 3 | total_timesteps 115.
Path 4 | total_timesteps 189.
Path 5 | total_timesteps 235.
Path 6 | total_timesteps 267.
Path 7 | total_timesteps 314.
Path 8 | total_timesteps 359.
Path 9 | total_timesteps 395.
Path 10 | total_timesteps 438.
Path 11 | total_timesteps 481.
Path 12 | total_timesteps 509.
Path 13 | total_timesteps 561.
Path 14 | total_timesteps 585.
Path 15 | total_timesteps 600.
Path 16 | total_timesteps 648.
Path 17 | total_timesteps 684.
Path 18 | total_timesteps 706.
Path 19 | total_timesteps 749.
Path 20 | total_timesteps 829.
Path 21 | total_timesteps 893.
Path 22 | total_timesteps 973.
Path 23 | total_timesteps 1016.
Path 24 | total_timesteps 1050.
Path 25 | total_timesteps 1094.
Path 26 | total_timesteps 1122.
Path 27 | total_timesteps 1132.
Path 28 | total_timesteps 1180.
Path 29 | total_timesteps 1208.
Path 30 | total_timesteps 1261.
Path 31 | total_timesteps 1313.
Path 32 | total_timesteps 1342.
Path 33 | total_timesteps 1402.
Path 34 | total_timesteps 1448.
Path 35 | total_timesteps 1474.
Path 36 | total_timesteps 1535.
Path 37 | total_timesteps 1560.
Path 38 | total_timesteps 1591.
Path 39 | total_timesteps 1637.
Path 40 | total_timesteps 1680.
Path 41 | total_timesteps 1715.
Path 42 | total_timesteps 1748.
Path 43 | total_timesteps 1777.
Path 44 | total_timesteps 1810.
Path 45 | total_timesteps 1832.
Path 46 | total_timesteps 1890.
Path 47 | total_timesteps 1947.
Path 48 | total_timesteps 1985.
Path 49 | total_timesteps 2006.
Path 50 | total_timesteps 2056.
Path 51 | total_timesteps 2106.
Path 52 | total_timesteps 2146.
Path 53 | total_timesteps 2191.
Path 54 | total_timesteps 2224.
Path 55 | total_timesteps 2268.
Path 56 | total_timesteps 2290.
Path 57 | total_timesteps 2336.
Path 58 | total_timesteps 2366.
Path 59 | total_timesteps 2402.
Path 60 | total_timesteps 2434.
Path 61 | total_timesteps 2475.
Path 62 | total_timesteps 2504.
Path 63 | total_timesteps 2528.
Path 64 | total_timesteps 2562.
Path 65 | total_timesteps 2600.
Path 66 | total_timesteps 2636.
Path 67 | total_timesteps 2668.
Path 68 | total_timesteps 2729.
Path 69 | total_timesteps 2763.
Path 70 | total_timesteps 2804.
Path 71 | total_timesteps 2838.
Path 72 | total_timesteps 2868.
Path 73 | total_timesteps 2900.
Path 74 | total_timesteps 2931.
Path 75 | total_timesteps 2975.
Path 76 | total_timesteps 3029.
Path 77 | total_timesteps 3064.
Path 78 | total_timesteps 3113.
Path 79 | total_timesteps 3156.
Path 80 | total_timesteps 3200.
Path 81 | total_timesteps 3246.
Path 82 | total_timesteps 3267.
Path 83 | total_timesteps 3298.
Path 84 | total_timesteps 3330.
Path 85 | total_timesteps 3343.
Path 86 | total_timesteps 3365.
Path 87 | total_timesteps 3420.
Path 88 | total_timesteps 3458.
Path 89 | total_timesteps 3517.
Path 90 | total_timesteps 3567.
Path 91 | total_timesteps 3608.
Path 92 | total_timesteps 3669.
Path 93 | total_timesteps 3700.
Path 94 | total_timesteps 3765.
Path 95 | total_timesteps 3834.
Path 96 | total_timesteps 3865.
Path 97 | total_timesteps 3897.
Path 98 | total_timesteps 3924.
Path 99 | total_timesteps 4019.
Path 100 | total_timesteps 4050.
Path 101 | total_timesteps 4065.
Path 102 | total_timesteps 4103.
Path 103 | total_timesteps 4132.
Path 104 | total_timesteps 4165.
Path 105 | total_timesteps 4211.
Path 106 | total_timesteps 4268.
Path 107 | total_timesteps 4322.
Path 108 | total_timesteps 4370.
Path 109 | total_timesteps 4404.
Path 110 | total_timesteps 4433.
Path 111 | total_timesteps 4476.
Path 112 | total_timesteps 4519.
Path 113 | total_timesteps 4567.
Path 114 | total_timesteps 4658.
Path 115 | total_timesteps 4696.
Path 116 | total_timesteps 4743.
Path 117 | total_timesteps 4764.
Path 118 | total_timesteps 4807.
Path 119 | total_timesteps 4847.
Path 120 | total_timesteps 4877.
Path 121 | total_timesteps 4906.
Path 122 | total_timesteps 4937.
Path 123 | total_timesteps 4972.
Path 124 | total_timesteps 5010.
Path 125 | total_timesteps 5050.
Path 126 | total_timesteps 5079.
Path 127 | total_timesteps 5089.
Path 128 | total_timesteps 5175.
Path 129 | total_timesteps 5218.
Path 130 | total_timesteps 5268.
Path 131 | total_timesteps 5309.
Path 132 | total_timesteps 5368.
Path 133 | total_timesteps 5402.
Path 134 | total_timesteps 5429.
Path 135 | total_timesteps 5454.
Path 136 | total_timesteps 5530.
Path 137 | total_timesteps 5579.
Path 138 | total_timesteps 5607.
Path 139 | total_timesteps 5623.
Path 140 | total_timesteps 5645.
Path 141 | total_timesteps 5669.
Path 142 | total_timesteps 5694.
Path 143 | total_timesteps 5733.
Path 144 | total_timesteps 5761.
Path 145 | total_timesteps 5803.
Path 146 | total_timesteps 5831.
Path 147 | total_timesteps 5858.
Path 148 | total_timesteps 5880.
Path 149 | total_timesteps 5926.
Path 150 | total_timesteps 5954.
Path 151 | total_timesteps 5998.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -16.9    |
| Iteration     | 28       |
| MaximumReturn | 5.49     |
| MinimumReturn | -42.9    |
| TotalSamples  | 120499   |
----------------------------
itr #29 | 
Fitting dynamics.
Validation loss = 0.40205082297325134
Validation loss = 0.40621474385261536
Validation loss = 0.40582001209259033
Validation loss = 0.4068562388420105
Validation loss = 0.40542811155319214
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 81.
Path 2 | total_timesteps 127.
Path 3 | total_timesteps 179.
Path 4 | total_timesteps 229.
Path 5 | total_timesteps 276.
Path 6 | total_timesteps 326.
Path 7 | total_timesteps 381.
Path 8 | total_timesteps 433.
Path 9 | total_timesteps 484.
Path 10 | total_timesteps 508.
Path 11 | total_timesteps 537.
Path 12 | total_timesteps 566.
Path 13 | total_timesteps 596.
Path 14 | total_timesteps 647.
Path 15 | total_timesteps 696.
Path 16 | total_timesteps 727.
Path 17 | total_timesteps 787.
Path 18 | total_timesteps 805.
Path 19 | total_timesteps 827.
Path 20 | total_timesteps 853.
Path 21 | total_timesteps 898.
Path 22 | total_timesteps 933.
Path 23 | total_timesteps 987.
Path 24 | total_timesteps 1019.
Path 25 | total_timesteps 1086.
Path 26 | total_timesteps 1096.
Path 27 | total_timesteps 1123.
Path 28 | total_timesteps 1156.
Path 29 | total_timesteps 1183.
Path 30 | total_timesteps 1224.
Path 31 | total_timesteps 1258.
Path 32 | total_timesteps 1303.
Path 33 | total_timesteps 1336.
Path 34 | total_timesteps 1365.
Path 35 | total_timesteps 1396.
Path 36 | total_timesteps 1435.
Path 37 | total_timesteps 1450.
Path 38 | total_timesteps 1490.
Path 39 | total_timesteps 1563.
Path 40 | total_timesteps 1577.
Path 41 | total_timesteps 1634.
Path 42 | total_timesteps 1682.
Path 43 | total_timesteps 1720.
Path 44 | total_timesteps 1773.
Path 45 | total_timesteps 1827.
Path 46 | total_timesteps 1872.
Path 47 | total_timesteps 1912.
Path 48 | total_timesteps 1966.
Path 49 | total_timesteps 2009.
Path 50 | total_timesteps 2057.
Path 51 | total_timesteps 2116.
Path 52 | total_timesteps 2157.
Path 53 | total_timesteps 2205.
Path 54 | total_timesteps 2244.
Path 55 | total_timesteps 2271.
Path 56 | total_timesteps 2297.
Path 57 | total_timesteps 2380.
Path 58 | total_timesteps 2416.
Path 59 | total_timesteps 2481.
Path 60 | total_timesteps 2609.
Path 61 | total_timesteps 2644.
Path 62 | total_timesteps 2674.
Path 63 | total_timesteps 2722.
Path 64 | total_timesteps 2754.
Path 65 | total_timesteps 2829.
Path 66 | total_timesteps 2877.
Path 67 | total_timesteps 2930.
Path 68 | total_timesteps 2956.
Path 69 | total_timesteps 3010.
Path 70 | total_timesteps 3052.
Path 71 | total_timesteps 3094.
Path 72 | total_timesteps 3127.
Path 73 | total_timesteps 3166.
Path 74 | total_timesteps 3183.
Path 75 | total_timesteps 3218.
Path 76 | total_timesteps 3270.
Path 77 | total_timesteps 3342.
Path 78 | total_timesteps 3412.
Path 79 | total_timesteps 3443.
Path 80 | total_timesteps 3501.
Path 81 | total_timesteps 3538.
Path 82 | total_timesteps 3563.
Path 83 | total_timesteps 3604.
Path 84 | total_timesteps 3658.
Path 85 | total_timesteps 3691.
Path 86 | total_timesteps 3715.
Path 87 | total_timesteps 3769.
Path 88 | total_timesteps 3821.
Path 89 | total_timesteps 3854.
Path 90 | total_timesteps 3898.
Path 91 | total_timesteps 3919.
Path 92 | total_timesteps 3959.
Path 93 | total_timesteps 3999.
Path 94 | total_timesteps 4061.
Path 95 | total_timesteps 4087.
Path 96 | total_timesteps 4127.
Path 97 | total_timesteps 4157.
Path 98 | total_timesteps 4191.
Path 99 | total_timesteps 4225.
Path 100 | total_timesteps 4300.
Path 101 | total_timesteps 4334.
Path 102 | total_timesteps 4367.
Path 103 | total_timesteps 4404.
Path 104 | total_timesteps 4434.
Path 105 | total_timesteps 4457.
Path 106 | total_timesteps 4505.
Path 107 | total_timesteps 4562.
Path 108 | total_timesteps 4605.
Path 109 | total_timesteps 4652.
Path 110 | total_timesteps 4690.
Path 111 | total_timesteps 4713.
Path 112 | total_timesteps 4767.
Path 113 | total_timesteps 4807.
Path 114 | total_timesteps 4834.
Path 115 | total_timesteps 4853.
Path 116 | total_timesteps 4902.
Path 117 | total_timesteps 4945.
Path 118 | total_timesteps 4978.
Path 119 | total_timesteps 5008.
Path 120 | total_timesteps 5084.
Path 121 | total_timesteps 5135.
Path 122 | total_timesteps 5193.
Path 123 | total_timesteps 5237.
Path 124 | total_timesteps 5283.
Path 125 | total_timesteps 5325.
Path 126 | total_timesteps 5354.
Path 127 | total_timesteps 5369.
Path 128 | total_timesteps 5393.
Path 129 | total_timesteps 5427.
Path 130 | total_timesteps 5473.
Path 131 | total_timesteps 5507.
Path 132 | total_timesteps 5535.
Path 133 | total_timesteps 5588.
Path 134 | total_timesteps 5630.
Path 135 | total_timesteps 5674.
Path 136 | total_timesteps 5715.
Path 137 | total_timesteps 5762.
Path 138 | total_timesteps 5795.
Path 139 | total_timesteps 5891.
Path 140 | total_timesteps 5938.
Path 141 | total_timesteps 5974.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -18.4    |
| Iteration     | 29       |
| MaximumReturn | 23.8     |
| MinimumReturn | -78.7    |
| TotalSamples  | 124507   |
----------------------------
itr #30 | 
Fitting dynamics.
Validation loss = 0.40119946002960205
Validation loss = 0.4023379385471344
Validation loss = 0.4047909080982208
Validation loss = 0.40566742420196533
Validation loss = 0.4054500162601471
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 36.
Path 2 | total_timesteps 64.
Path 3 | total_timesteps 91.
Path 4 | total_timesteps 127.
Path 5 | total_timesteps 182.
Path 6 | total_timesteps 233.
Path 7 | total_timesteps 252.
Path 8 | total_timesteps 299.
Path 9 | total_timesteps 349.
Path 10 | total_timesteps 366.
Path 11 | total_timesteps 444.
Path 12 | total_timesteps 496.
Path 13 | total_timesteps 535.
Path 14 | total_timesteps 562.
Path 15 | total_timesteps 593.
Path 16 | total_timesteps 644.
Path 17 | total_timesteps 673.
Path 18 | total_timesteps 702.
Path 19 | total_timesteps 764.
Path 20 | total_timesteps 836.
Path 21 | total_timesteps 915.
Path 22 | total_timesteps 954.
Path 23 | total_timesteps 993.
Path 24 | total_timesteps 1040.
Path 25 | total_timesteps 1070.
Path 26 | total_timesteps 1117.
Path 27 | total_timesteps 1160.
Path 28 | total_timesteps 1206.
Path 29 | total_timesteps 1246.
Path 30 | total_timesteps 1296.
Path 31 | total_timesteps 1338.
Path 32 | total_timesteps 1388.
Path 33 | total_timesteps 1409.
Path 34 | total_timesteps 1462.
Path 35 | total_timesteps 1486.
Path 36 | total_timesteps 1530.
Path 37 | total_timesteps 1551.
Path 38 | total_timesteps 1579.
Path 39 | total_timesteps 1653.
Path 40 | total_timesteps 1690.
Path 41 | total_timesteps 1738.
Path 42 | total_timesteps 1775.
Path 43 | total_timesteps 1826.
Path 44 | total_timesteps 1851.
Path 45 | total_timesteps 1879.
Path 46 | total_timesteps 1918.
Path 47 | total_timesteps 1956.
Path 48 | total_timesteps 1984.
Path 49 | total_timesteps 2025.
Path 50 | total_timesteps 2077.
Path 51 | total_timesteps 2124.
Path 52 | total_timesteps 2149.
Path 53 | total_timesteps 2208.
Path 54 | total_timesteps 2276.
Path 55 | total_timesteps 2315.
Path 56 | total_timesteps 2356.
Path 57 | total_timesteps 2411.
Path 58 | total_timesteps 2446.
Path 59 | total_timesteps 2493.
Path 60 | total_timesteps 2543.
Path 61 | total_timesteps 2595.
Path 62 | total_timesteps 2620.
Path 63 | total_timesteps 2654.
Path 64 | total_timesteps 2691.
Path 65 | total_timesteps 2759.
Path 66 | total_timesteps 2805.
Path 67 | total_timesteps 2849.
Path 68 | total_timesteps 2886.
Path 69 | total_timesteps 2921.
Path 70 | total_timesteps 2950.
Path 71 | total_timesteps 2996.
Path 72 | total_timesteps 3019.
Path 73 | total_timesteps 3059.
Path 74 | total_timesteps 3087.
Path 75 | total_timesteps 3125.
Path 76 | total_timesteps 3154.
Path 77 | total_timesteps 3188.
Path 78 | total_timesteps 3210.
Path 79 | total_timesteps 3272.
Path 80 | total_timesteps 3313.
Path 81 | total_timesteps 3386.
Path 82 | total_timesteps 3426.
Path 83 | total_timesteps 3461.
Path 84 | total_timesteps 3486.
Path 85 | total_timesteps 3518.
Path 86 | total_timesteps 3557.
Path 87 | total_timesteps 3590.
Path 88 | total_timesteps 3626.
Path 89 | total_timesteps 3696.
Path 90 | total_timesteps 3740.
Path 91 | total_timesteps 3763.
Path 92 | total_timesteps 3793.
Path 93 | total_timesteps 3848.
Path 94 | total_timesteps 3878.
Path 95 | total_timesteps 3907.
Path 96 | total_timesteps 3930.
Path 97 | total_timesteps 3950.
Path 98 | total_timesteps 3997.
Path 99 | total_timesteps 4056.
Path 100 | total_timesteps 4096.
Path 101 | total_timesteps 4129.
Path 102 | total_timesteps 4179.
Path 103 | total_timesteps 4220.
Path 104 | total_timesteps 4284.
Path 105 | total_timesteps 4329.
Path 106 | total_timesteps 4359.
Path 107 | total_timesteps 4383.
Path 108 | total_timesteps 4424.
Path 109 | total_timesteps 4445.
Path 110 | total_timesteps 4494.
Path 111 | total_timesteps 4577.
Path 112 | total_timesteps 4598.
Path 113 | total_timesteps 4621.
Path 114 | total_timesteps 4683.
Path 115 | total_timesteps 4731.
Path 116 | total_timesteps 4762.
Path 117 | total_timesteps 4809.
Path 118 | total_timesteps 4861.
Path 119 | total_timesteps 4891.
Path 120 | total_timesteps 4924.
Path 121 | total_timesteps 4956.
Path 122 | total_timesteps 4987.
Path 123 | total_timesteps 5015.
Path 124 | total_timesteps 5054.
Path 125 | total_timesteps 5086.
Path 126 | total_timesteps 5110.
Path 127 | total_timesteps 5149.
Path 128 | total_timesteps 5200.
Path 129 | total_timesteps 5247.
Path 130 | total_timesteps 5279.
Path 131 | total_timesteps 5394.
Path 132 | total_timesteps 5439.
Path 133 | total_timesteps 5480.
Path 134 | total_timesteps 5521.
Path 135 | total_timesteps 5581.
Path 136 | total_timesteps 5629.
Path 137 | total_timesteps 5659.
Path 138 | total_timesteps 5712.
Path 139 | total_timesteps 5753.
Path 140 | total_timesteps 5783.
Path 141 | total_timesteps 5827.
Path 142 | total_timesteps 5852.
Path 143 | total_timesteps 5889.
Path 144 | total_timesteps 5917.
Path 145 | total_timesteps 5949.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -20.4    |
| Iteration     | 30       |
| MaximumReturn | 0.216    |
| MinimumReturn | -85.4    |
| TotalSamples  | 128521   |
----------------------------
itr #31 | 
Fitting dynamics.
Validation loss = 0.4035605192184448
Validation loss = 0.4032617211341858
Validation loss = 0.4056318998336792
Validation loss = 0.4070957899093628
Validation loss = 0.4067833423614502
Validation loss = 0.4039596915245056
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 30.
Path 2 | total_timesteps 72.
Path 3 | total_timesteps 129.
Path 4 | total_timesteps 161.
Path 5 | total_timesteps 192.
Path 6 | total_timesteps 219.
Path 7 | total_timesteps 257.
Path 8 | total_timesteps 284.
Path 9 | total_timesteps 320.
Path 10 | total_timesteps 348.
Path 11 | total_timesteps 435.
Path 12 | total_timesteps 473.
Path 13 | total_timesteps 529.
Path 14 | total_timesteps 571.
Path 15 | total_timesteps 623.
Path 16 | total_timesteps 665.
Path 17 | total_timesteps 694.
Path 18 | total_timesteps 742.
Path 19 | total_timesteps 775.
Path 20 | total_timesteps 814.
Path 21 | total_timesteps 863.
Path 22 | total_timesteps 891.
Path 23 | total_timesteps 933.
Path 24 | total_timesteps 957.
Path 25 | total_timesteps 1006.
Path 26 | total_timesteps 1045.
Path 27 | total_timesteps 1083.
Path 28 | total_timesteps 1134.
Path 29 | total_timesteps 1160.
Path 30 | total_timesteps 1268.
Path 31 | total_timesteps 1292.
Path 32 | total_timesteps 1350.
Path 33 | total_timesteps 1405.
Path 34 | total_timesteps 1448.
Path 35 | total_timesteps 1469.
Path 36 | total_timesteps 1537.
Path 37 | total_timesteps 1567.
Path 38 | total_timesteps 1594.
Path 39 | total_timesteps 1622.
Path 40 | total_timesteps 1655.
Path 41 | total_timesteps 1684.
Path 42 | total_timesteps 1716.
Path 43 | total_timesteps 1754.
Path 44 | total_timesteps 1784.
Path 45 | total_timesteps 1820.
Path 46 | total_timesteps 1868.
Path 47 | total_timesteps 1957.
Path 48 | total_timesteps 2001.
Path 49 | total_timesteps 2035.
Path 50 | total_timesteps 2063.
Path 51 | total_timesteps 2090.
Path 52 | total_timesteps 2153.
Path 53 | total_timesteps 2194.
Path 54 | total_timesteps 2227.
Path 55 | total_timesteps 2274.
Path 56 | total_timesteps 2342.
Path 57 | total_timesteps 2387.
Path 58 | total_timesteps 2428.
Path 59 | total_timesteps 2450.
Path 60 | total_timesteps 2484.
Path 61 | total_timesteps 2535.
Path 62 | total_timesteps 2567.
Path 63 | total_timesteps 2616.
Path 64 | total_timesteps 2653.
Path 65 | total_timesteps 2763.
Path 66 | total_timesteps 2803.
Path 67 | total_timesteps 2850.
Path 68 | total_timesteps 2878.
Path 69 | total_timesteps 2919.
Path 70 | total_timesteps 2968.
Path 71 | total_timesteps 3010.
Path 72 | total_timesteps 3039.
Path 73 | total_timesteps 3086.
Path 74 | total_timesteps 3153.
Path 75 | total_timesteps 3210.
Path 76 | total_timesteps 3246.
Path 77 | total_timesteps 3330.
Path 78 | total_timesteps 3377.
Path 79 | total_timesteps 3417.
Path 80 | total_timesteps 3451.
Path 81 | total_timesteps 3508.
Path 82 | total_timesteps 3559.
Path 83 | total_timesteps 3596.
Path 84 | total_timesteps 3619.
Path 85 | total_timesteps 3637.
Path 86 | total_timesteps 3666.
Path 87 | total_timesteps 3699.
Path 88 | total_timesteps 3748.
Path 89 | total_timesteps 3770.
Path 90 | total_timesteps 3799.
Path 91 | total_timesteps 3873.
Path 92 | total_timesteps 3916.
Path 93 | total_timesteps 3981.
Path 94 | total_timesteps 4021.
Path 95 | total_timesteps 4090.
Path 96 | total_timesteps 4142.
Path 97 | total_timesteps 4171.
Path 98 | total_timesteps 4245.
Path 99 | total_timesteps 4264.
Path 100 | total_timesteps 4309.
Path 101 | total_timesteps 4375.
Path 102 | total_timesteps 4406.
Path 103 | total_timesteps 4443.
Path 104 | total_timesteps 4479.
Path 105 | total_timesteps 4547.
Path 106 | total_timesteps 4568.
Path 107 | total_timesteps 4604.
Path 108 | total_timesteps 4631.
Path 109 | total_timesteps 4689.
Path 110 | total_timesteps 4728.
Path 111 | total_timesteps 4762.
Path 112 | total_timesteps 4837.
Path 113 | total_timesteps 4875.
Path 114 | total_timesteps 4910.
Path 115 | total_timesteps 4930.
Path 116 | total_timesteps 4971.
Path 117 | total_timesteps 4999.
Path 118 | total_timesteps 5066.
Path 119 | total_timesteps 5093.
Path 120 | total_timesteps 5123.
Path 121 | total_timesteps 5156.
Path 122 | total_timesteps 5201.
Path 123 | total_timesteps 5241.
Path 124 | total_timesteps 5268.
Path 125 | total_timesteps 5312.
Path 126 | total_timesteps 5347.
Path 127 | total_timesteps 5427.
Path 128 | total_timesteps 5456.
Path 129 | total_timesteps 5502.
Path 130 | total_timesteps 5556.
Path 131 | total_timesteps 5581.
Path 132 | total_timesteps 5630.
Path 133 | total_timesteps 5690.
Path 134 | total_timesteps 5748.
Path 135 | total_timesteps 5800.
Path 136 | total_timesteps 5837.
Path 137 | total_timesteps 5893.
Path 138 | total_timesteps 5925.
Path 139 | total_timesteps 5954.
Path 140 | total_timesteps 5999.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -19.4    |
| Iteration     | 31       |
| MaximumReturn | 13.1     |
| MinimumReturn | -86.8    |
| TotalSamples  | 132551   |
----------------------------
itr #32 | 
Fitting dynamics.
Validation loss = 0.4015044569969177
Validation loss = 0.40215542912483215
Validation loss = 0.4042613208293915
Validation loss = 0.40305161476135254
Validation loss = 0.4054001569747925
Done fitting dynamics.
Updating randomness.
Done updating randomness.
Training policy using TRPO.
Re-initialize init_std.
Obtaining samples for iteration 0...
Obtaining samples for iteration 1...
Obtaining samples for iteration 2...
Obtaining samples for iteration 3...
Obtaining samples for iteration 4...
Obtaining samples for iteration 5...
Obtaining samples for iteration 6...
Obtaining samples for iteration 7...
Obtaining samples for iteration 8...
Obtaining samples for iteration 9...
Obtaining samples for iteration 10...
Obtaining samples for iteration 11...
Obtaining samples for iteration 12...
Obtaining samples for iteration 13...
Obtaining samples for iteration 14...
Obtaining samples for iteration 15...
Obtaining samples for iteration 16...
Obtaining samples for iteration 17...
Obtaining samples for iteration 18...
Obtaining samples for iteration 19...
Done training policy.
Generating on-policy rollouts.
Path 0 | total_timesteps 0.
Path 1 | total_timesteps 43.
Path 2 | total_timesteps 89.
Path 3 | total_timesteps 132.
Path 4 | total_timesteps 210.
Path 5 | total_timesteps 255.
Path 6 | total_timesteps 296.
Path 7 | total_timesteps 318.
Path 8 | total_timesteps 341.
Path 9 | total_timesteps 395.
Path 10 | total_timesteps 411.
Path 11 | total_timesteps 462.
Path 12 | total_timesteps 504.
Path 13 | total_timesteps 544.
Path 14 | total_timesteps 590.
Path 15 | total_timesteps 624.
Path 16 | total_timesteps 653.
Path 17 | total_timesteps 737.
Path 18 | total_timesteps 771.
Path 19 | total_timesteps 801.
Path 20 | total_timesteps 837.
Path 21 | total_timesteps 864.
Path 22 | total_timesteps 890.
Path 23 | total_timesteps 931.
Path 24 | total_timesteps 971.
Path 25 | total_timesteps 1012.
Path 26 | total_timesteps 1046.
Path 27 | total_timesteps 1060.
Path 28 | total_timesteps 1095.
Path 29 | total_timesteps 1136.
Path 30 | total_timesteps 1166.
Path 31 | total_timesteps 1224.
Path 32 | total_timesteps 1290.
Path 33 | total_timesteps 1314.
Path 34 | total_timesteps 1383.
Path 35 | total_timesteps 1463.
Path 36 | total_timesteps 1501.
Path 37 | total_timesteps 1583.
Path 38 | total_timesteps 1641.
Path 39 | total_timesteps 1673.
Path 40 | total_timesteps 1697.
Path 41 | total_timesteps 1741.
Path 42 | total_timesteps 1789.
Path 43 | total_timesteps 1870.
Path 44 | total_timesteps 1908.
Path 45 | total_timesteps 1956.
Path 46 | total_timesteps 1990.
Path 47 | total_timesteps 2020.
Path 48 | total_timesteps 2066.
Path 49 | total_timesteps 2102.
Path 50 | total_timesteps 2132.
Path 51 | total_timesteps 2165.
Path 52 | total_timesteps 2204.
Path 53 | total_timesteps 2240.
Path 54 | total_timesteps 2271.
Path 55 | total_timesteps 2357.
Path 56 | total_timesteps 2388.
Path 57 | total_timesteps 2412.
Path 58 | total_timesteps 2449.
Path 59 | total_timesteps 2479.
Path 60 | total_timesteps 2511.
Path 61 | total_timesteps 2535.
Path 62 | total_timesteps 2565.
Path 63 | total_timesteps 2636.
Path 64 | total_timesteps 2693.
Path 65 | total_timesteps 2733.
Path 66 | total_timesteps 2778.
Path 67 | total_timesteps 2811.
Path 68 | total_timesteps 2843.
Path 69 | total_timesteps 2872.
Path 70 | total_timesteps 2919.
Path 71 | total_timesteps 2993.
Path 72 | total_timesteps 3023.
Path 73 | total_timesteps 3073.
Path 74 | total_timesteps 3113.
Path 75 | total_timesteps 3147.
Path 76 | total_timesteps 3182.
Path 77 | total_timesteps 3228.
Path 78 | total_timesteps 3262.
Path 79 | total_timesteps 3297.
Path 80 | total_timesteps 3317.
Path 81 | total_timesteps 3361.
Path 82 | total_timesteps 3378.
Path 83 | total_timesteps 3412.
Path 84 | total_timesteps 3446.
Path 85 | total_timesteps 3488.
Path 86 | total_timesteps 3528.
Path 87 | total_timesteps 3559.
Path 88 | total_timesteps 3601.
Path 89 | total_timesteps 3650.
Path 90 | total_timesteps 3702.
Path 91 | total_timesteps 3769.
Path 92 | total_timesteps 3815.
Path 93 | total_timesteps 3853.
Path 94 | total_timesteps 3882.
Path 95 | total_timesteps 3913.
Path 96 | total_timesteps 3945.
Path 97 | total_timesteps 3989.
Path 98 | total_timesteps 4028.
Path 99 | total_timesteps 4057.
Path 100 | total_timesteps 4084.
Path 101 | total_timesteps 4116.
Path 102 | total_timesteps 4153.
Path 103 | total_timesteps 4163.
Path 104 | total_timesteps 4202.
Path 105 | total_timesteps 4218.
Path 106 | total_timesteps 4236.
Path 107 | total_timesteps 4268.
Path 108 | total_timesteps 4297.
Path 109 | total_timesteps 4347.
Path 110 | total_timesteps 4374.
Path 111 | total_timesteps 4433.
Path 112 | total_timesteps 4503.
Path 113 | total_timesteps 4542.
Path 114 | total_timesteps 4603.
Path 115 | total_timesteps 4636.
Path 116 | total_timesteps 4697.
Path 117 | total_timesteps 4747.
Path 118 | total_timesteps 4778.
Path 119 | total_timesteps 4807.
Path 120 | total_timesteps 4847.
Path 121 | total_timesteps 4878.
Path 122 | total_timesteps 4921.
Path 123 | total_timesteps 4964.
Path 124 | total_timesteps 5019.
Path 125 | total_timesteps 5063.
Path 126 | total_timesteps 5105.
Path 127 | total_timesteps 5148.
Path 128 | total_timesteps 5173.
Path 129 | total_timesteps 5212.
Path 130 | total_timesteps 5245.
Path 131 | total_timesteps 5287.
Path 132 | total_timesteps 5316.
Path 133 | total_timesteps 5380.
Path 134 | total_timesteps 5416.
Path 135 | total_timesteps 5463.
Path 136 | total_timesteps 5499.
Path 137 | total_timesteps 5536.
Path 138 | total_timesteps 5572.
Path 139 | total_timesteps 5614.
Path 140 | total_timesteps 5655.
Path 141 | total_timesteps 5700.
Path 142 | total_timesteps 5756.
Path 143 | total_timesteps 5787.
Path 144 | total_timesteps 5813.
Path 145 | total_timesteps 5849.
Path 146 | total_timesteps 5917.
Path 147 | total_timesteps 5996.
Done generating on-policy rollouts.
Updating normalization.
Done updating normalization.
----------------------------
| AverageReturn | -17.9    |
| Iteration     | 32       |
| MaximumReturn | 2.6      |
| MinimumReturn | -50.1    |
| TotalSamples  | 136573   |
----------------------------
