Running Env Steps,Iteration,Reward Loss,Running Reverse KL,Running Forward KL,Real Det Return,Running Update Time,Real Sto Return
0,0,-802660.375,399.8623,26.514,-17.37,0,-25.57
5000,1,-797553.6875,391.3409,25.9934,-29.27,1,-33.21
