Running Reverse KL,Real Det Return,Reward Loss,Real Sto Return,Real Sto Violation,Running Update Time,Real Det Violation,Running Env Steps,Iteration,Running Forward KL
12.5601,1751.04,30.33733367919922,1776.51,0.05,0,0.0,0,0,16.1933
13.989,1751.95,50.580726623535156,1805.91,0.0,1,0.0,5000,1,15.9912
14.3826,1763.02,31.29464340209961,1816.02,0.0,2,0.0,10000,2,15.797
15.4698,1765.37,25.244518280029297,1800.11,0.0,3,0.0,15000,3,17.144
15.1682,1765.08,17.589330673217773,1813.97,0.0,4,0.0,20000,4,16.6203
15.049,1773.08,-0.8442997932434082,1818.71,0.0,5,0.0,25000,5,15.8893
15.2007,1769.66,-8.46382999420166,1817.34,0.0,6,0.0,30000,6,16.1617
14.9984,1779.42,-32.512203216552734,1824.65,0.0,7,0.0,35000,7,15.8231
14.7434,1763.46,-41.48411560058594,1822.26,0.0,8,0.0,40000,8,16.0445
15.0208,1768.68,-53.01076126098633,1812.05,0.0,9,0.0,45000,9,16.1638
14.2617,1762.24,-69.72712707519531,1821.51,0.0,10,0.0,50000,10,14.9569
13.6657,1759.1,-83.06473541259766,1833.12,0.0,11,0.0,55000,11,15.0222
14.178,1759.35,-99.30374908447266,1825.71,0.0,12,0.0,60000,12,15.7684
13.8274,1760.23,-108.48435974121094,1829.89,0.0,13,0.0,65000,13,14.6725
13.6975,1758.86,-122.38142395019531,1828.3,0.0,14,0.0,70000,14,14.8168
13.5713,1756.89,-137.2854461669922,1817.64,0.0,15,0.0,75000,15,14.8335
