Running Forward KL,Running Env Steps,Reward Loss,Running Update Time,Iteration,Real Sto Return,Running Reverse KL,Real Det Return
31.6649,0,2659078.75,0,0,-139.6,11.3004,-30.23
31.5974,5000,2703855.0,1,1,-123.64,11.0236,-18.96
31.8445,10000,2726383.0,2,2,-103.32,11.9322,5.77
31.8032,15000,2573089.75,3,3,-99.94,11.425,4.39
32.3152,20000,2534890.0,4,4,-84.89,12.64,4.8
