Cost Loss,Running Env Steps,Real Det violation,Iteration,Running Forward KL,Running Update Time,Running Reverse KL,Real Det Return,Real Sto Return,Real Sto violation,Reward Loss
50.60293197631836,0,0.9,0,17.6985,0,9.2186,-114.71,0.43,1.0,563.33447265625
36.49626159667969,5000,0.15,1,17.7927,1,9.9112,-280.61,9.8,1.0,586.5718994140625
59.88719177246094,10000,0.0,2,17.8199,2,9.2549,-1008.73,-30.55,1.0,569.419677734375
45.42216491699219,15000,0.0,3,17.4568,3,9.8306,-689.58,-84.46,1.0,611.97021484375
9.860791206359863,20000,0.1,4,17.1688,4,10.2198,-1103.61,-158.89,1.0,597.4055786132812
12.129582405090332,25000,0.7,5,17.5975,5,9.6485,-631.88,-113.23,1.0,573.6484375
8.355520248413086,30000,0.95,6,17.4639,6,9.6692,-940.03,-122.62,1.0,554.2582397460938
5.152329921722412,35000,0.95,7,17.6019,7,9.3869,-1093.74,-194.2,1.0,522.385986328125
2.579383611679077,40000,0.45,8,17.6108,8,8.9816,-1382.5,-244.2,1.0,502.49957275390625
-13.688243865966797,45000,0.4,9,17.1714,9,9.2023,-1380.72,-203.18,1.0,483.8446960449219
-17.972692489624023,50000,0.0,10,17.5566,10,9.1174,-1588.26,-241.85,1.0,454.1789855957031
-17.900705337524414,55000,0.05,11,17.7362,11,9.1111,-1330.67,-172.76,0.95,451.1876220703125
