Running Update Time,Real Det Return,Real Sto Return,Running Reverse KL,Reward Loss,Running Env Steps,Real Sto violation,Real Det violation,Running Forward KL,Iteration
0,-1140.31,-187.91,11.357,747.9129028320312,0,1.0,0.8,17.7017,0
1,-1391.92,-288.54,12.1185,808.5147094726562,5000,1.0,0.4,18.5809,1
