Logging to logs/AntFH-v0/exp-16/pagar_fkl/2024_08_11_05_32_11
--2024-08-11 05:33:59.289092 UTC---
| Itration            | 0         |
| PAGAR Loss          | -6.16e+03 |
| Real Det Return     | 902       |
| Real Sto Return     | -174      |
| Reward Loss         | 2.07e+05  |
| Running Env Steps   | 0         |
| Running Forward KL  | 147       |
| Running Reverse KL  | 2.55e+03  |
| Running Update Time | 0         |
-----------------------------------
--2024-08-11 05:35:54.907248 UTC--
| Itration            | 1        |
| PAGAR Loss          | 2.39e+03 |
| Real Det Return     | 844      |
| Real Sto Return     | -144     |
| Reward Loss         | 6.26e+04 |
| Running Env Steps   | 5000     |
| Running Forward KL  | 143      |
| Running Reverse KL  | 1.41e+03 |
| Running Update Time | 1        |
----------------------------------
--2024-08-11 05:37:50.430491 UTC--
| Itration            | 2        |
| PAGAR Loss          | 4.67e+03 |
| Real Det Return     | 808      |
| Real Sto Return     | -193     |
| Reward Loss         | 3.35e+04 |
| Running Env Steps   | 10000    |
| Running Forward KL  | 144      |
| Running Reverse KL  | 2.07e+03 |
| Running Update Time | 2        |
----------------------------------
--2024-08-11 05:39:47.090039 UTC---
| Itration            | 3         |
| PAGAR Loss          | -5.24e+04 |
| Real Det Return     | 803       |
| Real Sto Return     | -130      |
| Reward Loss         | -1.04e+05 |
| Running Env Steps   | 15000     |
| Running Forward KL  | 148       |
| Running Reverse KL  | 2.1e+03   |
| Running Update Time | 3         |
-----------------------------------
--2024-08-11 05:41:44.614637 UTC---
| Itration            | 4         |
| PAGAR Loss          | 5.7e+03   |
| Real Det Return     | 788       |
| Real Sto Return     | -166      |
| Reward Loss         | -2.62e+05 |
| Running Env Steps   | 20000     |
| Running Forward KL  | 145       |
| Running Reverse KL  | 1.32e+03  |
| Running Update Time | 4         |
-----------------------------------
--2024-08-11 05:43:42.491682 UTC---
| Itration            | 5         |
| PAGAR Loss          | -3.07e+04 |
| Real Det Return     | 825       |
| Real Sto Return     | -220      |
| Reward Loss         | -2.68e+05 |
| Running Env Steps   | 25000     |
| Running Forward KL  | 141       |
| Running Reverse KL  | 1.6e+03   |
| Running Update Time | 5         |
-----------------------------------
--2024-08-11 05:45:38.457921 UTC---
| Itration            | 6         |
| PAGAR Loss          | 9.22e+05  |
| Real Det Return     | 872       |
| Real Sto Return     | -107      |
| Reward Loss         | -2.39e+05 |
| Running Env Steps   | 30000     |
| Running Forward KL  | 141       |
| Running Reverse KL  | 1.44e+03  |
| Running Update Time | 6         |
-----------------------------------
--2024-08-11 05:47:35.222221 UTC---
| Itration            | 7         |
| PAGAR Loss          | 1.62e+05  |
| Real Det Return     | 789       |
| Real Sto Return     | -164      |
| Reward Loss         | -4.35e+05 |
| Running Env Steps   | 35000     |
| Running Forward KL  | 146       |
| Running Reverse KL  | 2.13e+03  |
| Running Update Time | 7         |
-----------------------------------
--2024-08-11 05:49:33.787383 UTC---
| Itration            | 8         |
| PAGAR Loss          | 3e+04     |
| Real Det Return     | 791       |
| Real Sto Return     | -187      |
| Reward Loss         | -3.86e+05 |
| Running Env Steps   | 40000     |
| Running Forward KL  | 143       |
| Running Reverse KL  | 1.23e+03  |
| Running Update Time | 8         |
-----------------------------------
--2024-08-11 05:51:34.908656 UTC---
| Itration            | 9         |
| PAGAR Loss          | 1.47e+06  |
| Real Det Return     | 767       |
| Real Sto Return     | -152      |
| Reward Loss         | -5.09e+05 |
| Running Env Steps   | 45000     |
| Running Forward KL  | 143       |
| Running Reverse KL  | 1.27e+03  |
| Running Update Time | 9         |
-----------------------------------
--2024-08-11 05:53:42.051317 UTC---
| Itration            | 10        |
| PAGAR Loss          | 3.85e+05  |
| Real Det Return     | 603       |
| Real Sto Return     | -127      |
| Reward Loss         | -5.23e+05 |
| Running Env Steps   | 50000     |
| Running Forward KL  | 141       |
| Running Reverse KL  | 1.12e+03  |
| Running Update Time | 10        |
-----------------------------------
--2024-08-11 05:55:51.979771 UTC---
| Itration            | 11        |
| PAGAR Loss          | 1.49e+04  |
| Real Det Return     | 599       |
| Real Sto Return     | -247      |
| Reward Loss         | -7.98e+05 |
| Running Env Steps   | 55000     |
| Running Forward KL  | 142       |
| Running Reverse KL  | 1.34e+03  |
| Running Update Time | 11        |
-----------------------------------
--2024-08-11 05:57:57.710971 UTC---
| Itration            | 12        |
| PAGAR Loss          | -3.02e+05 |
| Real Det Return     | 687       |
| Real Sto Return     | -217      |
| Reward Loss         | -5.72e+05 |
| Running Env Steps   | 60000     |
| Running Forward KL  | 137       |
| Running Reverse KL  | 923       |
| Running Update Time | 12        |
-----------------------------------
--2024-08-11 06:00:03.384471 UTC---
| Itration            | 13        |
| PAGAR Loss          | 9.15e+04  |
| Real Det Return     | 786       |
| Real Sto Return     | -182      |
| Reward Loss         | -9.98e+05 |
| Running Env Steps   | 65000     |
| Running Forward KL  | 146       |
| Running Reverse KL  | 988       |
| Running Update Time | 13        |
-----------------------------------
--2024-08-11 06:02:18.779691 UTC---
| Itration            | 14        |
| PAGAR Loss          | 6.89e+04  |
| Real Det Return     | 787       |
| Real Sto Return     | -183      |
| Reward Loss         | -8.17e+05 |
| Running Env Steps   | 70000     |
| Running Forward KL  | 140       |
| Running Reverse KL  | 916       |
| Running Update Time | 14        |
-----------------------------------
--2024-08-11 06:04:39.309899 UTC---
| Itration            | 15        |
| PAGAR Loss          | -4.22e+05 |
| Real Det Return     | 713       |
| Real Sto Return     | -132      |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 75000     |
| Running Forward KL  | 147       |
| Running Reverse KL  | 1.85e+03  |
| Running Update Time | 15        |
-----------------------------------
--2024-08-11 06:07:05.006201 UTC---
| Itration            | 16        |
| PAGAR Loss          | 3.47e+05  |
| Real Det Return     | 810       |
| Real Sto Return     | -199      |
| Reward Loss         | -9.58e+05 |
| Running Env Steps   | 80000     |
| Running Forward KL  | 143       |
| Running Reverse KL  | 1.15e+03  |
| Running Update Time | 16        |
-----------------------------------
--2024-08-11 06:09:34.358264 UTC---
| Itration            | 17        |
| PAGAR Loss          | -1.69e+05 |
| Real Det Return     | 658       |
| Real Sto Return     | -259      |
| Reward Loss         | -9.35e+05 |
| Running Env Steps   | 85000     |
| Running Forward KL  | 141       |
| Running Reverse KL  | 757       |
| Running Update Time | 17        |
-----------------------------------
--2024-08-11 06:12:02.641761 UTC---
| Itration            | 18        |
| PAGAR Loss          | 5.41e+05  |
| Real Det Return     | 830       |
| Real Sto Return     | -222      |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 90000     |
| Running Forward KL  | 142       |
| Running Reverse KL  | 1.38e+03  |
| Running Update Time | 18        |
-----------------------------------
--2024-08-11 06:14:32.870286 UTC---
| Itration            | 19        |
| PAGAR Loss          | -1.15e+04 |
| Real Det Return     | 839       |
| Real Sto Return     | -199      |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 95000     |
| Running Forward KL  | 140       |
| Running Reverse KL  | 660       |
| Running Update Time | 19        |
-----------------------------------
--2024-08-11 06:17:04.036989 UTC---
| Itration            | 20        |
| PAGAR Loss          | 6.4e+04   |
| Real Det Return     | 755       |
| Real Sto Return     | -257      |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 100000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 673       |
| Running Update Time | 20        |
-----------------------------------
--2024-08-11 06:19:37.278639 UTC--
| Itration            | 21       |
| PAGAR Loss          | 3.32e+06 |
| Real Det Return     | 785      |
| Real Sto Return     | -252     |
| Reward Loss         | -1.4e+06 |
| Running Env Steps   | 105000   |
| Running Forward KL  | 142      |
| Running Reverse KL  | 509      |
| Running Update Time | 21       |
----------------------------------
--2024-08-11 06:22:10.148048 UTC---
| Itration            | 22        |
| PAGAR Loss          | -2.65e+05 |
| Real Det Return     | 688       |
| Real Sto Return     | -256      |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 110000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 1.17e+03  |
| Running Update Time | 22        |
-----------------------------------
--2024-08-11 06:24:42.854678 UTC---
| Itration            | 23        |
| PAGAR Loss          | -1.47e+06 |
| Real Det Return     | 777       |
| Real Sto Return     | -155      |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 115000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 511       |
| Running Update Time | 23        |
-----------------------------------
--2024-08-11 06:27:16.309763 UTC---
| Itration            | 24        |
| PAGAR Loss          | -2.31e+04 |
| Real Det Return     | 623       |
| Real Sto Return     | -202      |
| Reward Loss         | -1.47e+06 |
| Running Env Steps   | 120000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 820       |
| Running Update Time | 24        |
-----------------------------------
--2024-08-11 06:29:51.964221 UTC---
| Itration            | 25        |
| PAGAR Loss          | 7.47e+04  |
| Real Det Return     | 832       |
| Real Sto Return     | -241      |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 125000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 336       |
| Running Update Time | 25        |
-----------------------------------
--2024-08-11 06:32:26.425354 UTC---
| Itration            | 26        |
| PAGAR Loss          | 2.09e+05  |
| Real Det Return     | 679       |
| Real Sto Return     | -217      |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 130000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 847       |
| Running Update Time | 26        |
-----------------------------------
--2024-08-11 06:35:04.995037 UTC---
| Itration            | 27        |
| PAGAR Loss          | 3.4e+04   |
| Real Det Return     | 598       |
| Real Sto Return     | -295      |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 135000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 597       |
| Running Update Time | 27        |
-----------------------------------
--2024-08-11 06:37:39.930321 UTC---
| Itration            | 28        |
| PAGAR Loss          | -3.82e+05 |
| Real Det Return     | 719       |
| Real Sto Return     | -216      |
| Reward Loss         | -2.04e+06 |
| Running Env Steps   | 140000    |
| Running Forward KL  | 143       |
| Running Reverse KL  | 1.1e+03   |
| Running Update Time | 28        |
-----------------------------------
--2024-08-11 06:40:16.551478 UTC---
| Itration            | 29        |
| PAGAR Loss          | 7.28e+04  |
| Real Det Return     | 701       |
| Real Sto Return     | -214      |
| Reward Loss         | -1.93e+06 |
| Running Env Steps   | 145000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 437       |
| Running Update Time | 29        |
-----------------------------------
--2024-08-11 06:42:54.647389 UTC---
| Itration            | 30        |
| PAGAR Loss          | 8.61e+04  |
| Real Det Return     | 762       |
| Real Sto Return     | -205      |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 150000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 384       |
| Running Update Time | 30        |
-----------------------------------
--2024-08-11 06:45:33.006738 UTC---
| Itration            | 31        |
| PAGAR Loss          | -8.43e+05 |
| Real Det Return     | 588       |
| Real Sto Return     | -231      |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 155000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 801       |
| Running Update Time | 31        |
-----------------------------------
--2024-08-11 06:48:10.279283 UTC---
| Itration            | 32        |
| PAGAR Loss          | 1.07e+05  |
| Real Det Return     | 657       |
| Real Sto Return     | -204      |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 160000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 659       |
| Running Update Time | 32        |
-----------------------------------
--2024-08-11 06:50:47.659103 UTC---
| Itration            | 33        |
| PAGAR Loss          | 5.01e+04  |
| Real Det Return     | 676       |
| Real Sto Return     | -212      |
| Reward Loss         | -2.32e+06 |
| Running Env Steps   | 165000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 688       |
| Running Update Time | 33        |
-----------------------------------
--2024-08-11 06:53:26.828853 UTC---
| Itration            | 34        |
| PAGAR Loss          | 2.24e+05  |
| Real Det Return     | 609       |
| Real Sto Return     | -237      |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 170000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 667       |
| Running Update Time | 34        |
-----------------------------------
--2024-08-11 06:56:07.810187 UTC---
| Itration            | 35        |
| PAGAR Loss          | -9.12e+04 |
| Real Det Return     | 638       |
| Real Sto Return     | -253      |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 175000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 282       |
| Running Update Time | 35        |
-----------------------------------
--2024-08-11 06:58:47.874953 UTC---
| Itration            | 36        |
| PAGAR Loss          | -1.53e+05 |
| Real Det Return     | 761       |
| Real Sto Return     | -261      |
| Reward Loss         | -2.9e+06  |
| Running Env Steps   | 180000    |
| Running Forward KL  | 142       |
| Running Reverse KL  | 848       |
| Running Update Time | 36        |
-----------------------------------
--2024-08-11 07:01:27.839714 UTC---
| Itration            | 37        |
| PAGAR Loss          | -3.12e+04 |
| Real Det Return     | 720       |
| Real Sto Return     | -217      |
| Reward Loss         | -2.83e+06 |
| Running Env Steps   | 185000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 952       |
| Running Update Time | 37        |
-----------------------------------
--2024-08-11 07:04:10.495805 UTC---
| Itration            | 38        |
| PAGAR Loss          | 1.76e+04  |
| Real Det Return     | 741       |
| Real Sto Return     | -279      |
| Reward Loss         | -2.62e+06 |
| Running Env Steps   | 190000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 414       |
| Running Update Time | 38        |
-----------------------------------
--2024-08-11 07:06:53.603362 UTC---
| Itration            | 39        |
| PAGAR Loss          | -4.69e+06 |
| Real Det Return     | 796       |
| Real Sto Return     | -268      |
| Reward Loss         | -2.76e+06 |
| Running Env Steps   | 195000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 490       |
| Running Update Time | 39        |
-----------------------------------
--2024-08-11 07:09:36.417631 UTC---
| Itration            | 40        |
| PAGAR Loss          | 3.46e+06  |
| Real Det Return     | 686       |
| Real Sto Return     | -247      |
| Reward Loss         | -2.83e+06 |
| Running Env Steps   | 200000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 311       |
| Running Update Time | 40        |
-----------------------------------
--2024-08-11 07:12:16.786547 UTC---
| Itration            | 41        |
| PAGAR Loss          | nan       |
| Real Det Return     | 670       |
| Real Sto Return     | -244      |
| Reward Loss         | -2.99e+06 |
| Running Env Steps   | 205000    |
| Running Forward KL  | 142       |
| Running Reverse KL  | 910       |
| Running Update Time | 41        |
-----------------------------------
--2024-08-11 07:15:01.080922 UTC---
| Itration            | 42        |
| PAGAR Loss          | 1.02e+05  |
| Real Det Return     | 726       |
| Real Sto Return     | -297      |
| Reward Loss         | -2.92e+06 |
| Running Env Steps   | 210000    |
| Running Forward KL  | 142       |
| Running Reverse KL  | 738       |
| Running Update Time | 42        |
-----------------------------------
--2024-08-11 07:17:48.787170 UTC---
| Itration            | 43        |
| PAGAR Loss          | -6.31e+04 |
| Real Det Return     | 619       |
| Real Sto Return     | -291      |
| Reward Loss         | -3.03e+06 |
| Running Env Steps   | 215000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 513       |
| Running Update Time | 43        |
-----------------------------------
--2024-08-11 07:20:34.375953 UTC---
| Itration            | 44        |
| PAGAR Loss          | -1.11e+03 |
| Real Det Return     | 572       |
| Real Sto Return     | -296      |
| Reward Loss         | -3.19e+06 |
| Running Env Steps   | 220000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 499       |
| Running Update Time | 44        |
-----------------------------------
--2024-08-11 07:23:19.963904 UTC---
| Itration            | 45        |
| PAGAR Loss          | -3.31e+03 |
| Real Det Return     | 700       |
| Real Sto Return     | -228      |
| Reward Loss         | -3.27e+06 |
| Running Env Steps   | 225000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 636       |
| Running Update Time | 45        |
-----------------------------------
--2024-08-11 07:26:07.094922 UTC--
| Itration            | 46       |
| PAGAR Loss          | 9.49e+04 |
| Real Det Return     | 608      |
| Real Sto Return     | -248     |
| Reward Loss         | -3.3e+06 |
| Running Env Steps   | 230000   |
| Running Forward KL  | 139      |
| Running Reverse KL  | 304      |
| Running Update Time | 46       |
----------------------------------
--2024-08-11 07:28:54.262018 UTC---
| Itration            | 47        |
| PAGAR Loss          | nan       |
| Real Det Return     | 653       |
| Real Sto Return     | -259      |
| Reward Loss         | -3.26e+06 |
| Running Env Steps   | 235000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 511       |
| Running Update Time | 47        |
-----------------------------------
--2024-08-11 07:31:40.100754 UTC---
| Itration            | 48        |
| PAGAR Loss          | -3.1e+04  |
| Real Det Return     | 586       |
| Real Sto Return     | -214      |
| Reward Loss         | -3.42e+06 |
| Running Env Steps   | 240000    |
| Running Forward KL  | 142       |
| Running Reverse KL  | 379       |
| Running Update Time | 48        |
-----------------------------------
--2024-08-11 07:34:28.023728 UTC---
| Itration            | 49        |
| PAGAR Loss          | 1e+05     |
| Real Det Return     | 665       |
| Real Sto Return     | -262      |
| Reward Loss         | -3.57e+06 |
| Running Env Steps   | 245000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 209       |
| Running Update Time | 49        |
-----------------------------------
--2024-08-11 07:37:11.341708 UTC---
| Itration            | 50        |
| PAGAR Loss          | -8.23e+03 |
| Real Det Return     | 680       |
| Real Sto Return     | -247      |
| Reward Loss         | -3.87e+06 |
| Running Env Steps   | 250000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 729       |
| Running Update Time | 50        |
-----------------------------------
--2024-08-11 07:39:55.111692 UTC---
| Itration            | 51        |
| PAGAR Loss          | -9.88e+04 |
| Real Det Return     | 764       |
| Real Sto Return     | -226      |
| Reward Loss         | -3.83e+06 |
| Running Env Steps   | 255000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 439       |
| Running Update Time | 51        |
-----------------------------------
--2024-08-11 07:42:39.550291 UTC--
| Itration            | 52       |
| PAGAR Loss          | 2.85e+04 |
| Real Det Return     | 644      |
| Real Sto Return     | -275     |
| Reward Loss         | -3.9e+06 |
| Running Env Steps   | 260000   |
| Running Forward KL  | 138      |
| Running Reverse KL  | 171      |
| Running Update Time | 52       |
----------------------------------
--2024-08-11 07:45:24.692939 UTC---
| Itration            | 53        |
| PAGAR Loss          | nan       |
| Real Det Return     | 631       |
| Real Sto Return     | -243      |
| Reward Loss         | -4.03e+06 |
| Running Env Steps   | 265000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 626       |
| Running Update Time | 53        |
-----------------------------------
--2024-08-11 07:48:08.600554 UTC---
| Itration            | 54        |
| PAGAR Loss          | 7.15e+05  |
| Real Det Return     | 721       |
| Real Sto Return     | -223      |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 270000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 335       |
| Running Update Time | 54        |
-----------------------------------
--2024-08-11 07:50:54.470028 UTC---
| Itration            | 55        |
| PAGAR Loss          | 2.47e+03  |
| Real Det Return     | 705       |
| Real Sto Return     | -300      |
| Reward Loss         | -4.13e+06 |
| Running Env Steps   | 275000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 224       |
| Running Update Time | 55        |
-----------------------------------
--2024-08-11 07:53:40.214815 UTC---
| Itration            | 56        |
| PAGAR Loss          | nan       |
| Real Det Return     | 611       |
| Real Sto Return     | -262      |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 280000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 46.3      |
| Running Update Time | 56        |
-----------------------------------
--2024-08-11 07:56:26.817727 UTC---
| Itration            | 57        |
| PAGAR Loss          | 6.21e+03  |
| Real Det Return     | 663       |
| Real Sto Return     | -267      |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 285000    |
| Running Forward KL  | 142       |
| Running Reverse KL  | 268       |
| Running Update Time | 57        |
-----------------------------------
--2024-08-11 07:59:12.402043 UTC---
| Itration            | 58        |
| PAGAR Loss          | -3.51e+03 |
| Real Det Return     | 651       |
| Real Sto Return     | -237      |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 290000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 375       |
| Running Update Time | 58        |
-----------------------------------
--2024-08-11 08:01:58.698992 UTC---
| Itration            | 59        |
| PAGAR Loss          | 9.21e+04  |
| Real Det Return     | 631       |
| Real Sto Return     | -257      |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 295000    |
| Running Forward KL  | 136       |
| Running Reverse KL  | 280       |
| Running Update Time | 59        |
-----------------------------------
--2024-08-11 08:04:46.287505 UTC---
| Itration            | 60        |
| PAGAR Loss          | -1.21e+04 |
| Real Det Return     | 692       |
| Real Sto Return     | -265      |
| Reward Loss         | -4.66e+06 |
| Running Env Steps   | 300000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 303       |
| Running Update Time | 60        |
-----------------------------------
--2024-08-11 08:07:32.038411 UTC---
| Itration            | 61        |
| PAGAR Loss          | nan       |
| Real Det Return     | 710       |
| Real Sto Return     | -279      |
| Reward Loss         | -4.79e+06 |
| Running Env Steps   | 305000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 826       |
| Running Update Time | 61        |
-----------------------------------
--2024-08-11 08:10:19.688547 UTC---
| Itration            | 62        |
| PAGAR Loss          | -8.74e+04 |
| Real Det Return     | 769       |
| Real Sto Return     | -251      |
| Reward Loss         | -4.76e+06 |
| Running Env Steps   | 310000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 330       |
| Running Update Time | 62        |
-----------------------------------
--2024-08-11 08:13:07.381245 UTC---
| Itration            | 63        |
| PAGAR Loss          | nan       |
| Real Det Return     | 655       |
| Real Sto Return     | -244      |
| Reward Loss         | -4.85e+06 |
| Running Env Steps   | 315000    |
| Running Forward KL  | 140       |
| Running Reverse KL  | 530       |
| Running Update Time | 63        |
-----------------------------------
--2024-08-11 08:15:58.278792 UTC---
| Itration            | 64        |
| PAGAR Loss          | 2.66e+05  |
| Real Det Return     | 688       |
| Real Sto Return     | -252      |
| Reward Loss         | -4.96e+06 |
| Running Env Steps   | 320000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 300       |
| Running Update Time | 64        |
-----------------------------------
--2024-08-11 08:18:49.302039 UTC---
| Itration            | 65        |
| PAGAR Loss          | 9.39e+06  |
| Real Det Return     | 823       |
| Real Sto Return     | -244      |
| Reward Loss         | -5.01e+06 |
| Running Env Steps   | 325000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 736       |
| Running Update Time | 65        |
-----------------------------------
--2024-08-11 08:21:39.046410 UTC---
| Itration            | 66        |
| PAGAR Loss          | 1.42e+04  |
| Real Det Return     | 720       |
| Real Sto Return     | -175      |
| Reward Loss         | -5.18e+06 |
| Running Env Steps   | 330000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 495       |
| Running Update Time | 66        |
-----------------------------------
--2024-08-11 08:24:28.059700 UTC---
| Itration            | 67        |
| PAGAR Loss          | -5.51e+05 |
| Real Det Return     | 708       |
| Real Sto Return     | -237      |
| Reward Loss         | -5.39e+06 |
| Running Env Steps   | 335000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 1.11e+03  |
| Running Update Time | 67        |
-----------------------------------
--2024-08-11 08:27:19.313030 UTC---
| Itration            | 68        |
| PAGAR Loss          | 2.65e+03  |
| Real Det Return     | 788       |
| Real Sto Return     | -225      |
| Reward Loss         | -5.29e+06 |
| Running Env Steps   | 340000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 189       |
| Running Update Time | 68        |
-----------------------------------
--2024-08-11 08:30:10.910115 UTC---
| Itration            | 69        |
| PAGAR Loss          | nan       |
| Real Det Return     | 688       |
| Real Sto Return     | -226      |
| Reward Loss         | -5.44e+06 |
| Running Env Steps   | 345000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 516       |
| Running Update Time | 69        |
-----------------------------------
--2024-08-11 08:33:01.922077 UTC---
| Itration            | 70        |
| PAGAR Loss          | -1.37e+04 |
| Real Det Return     | 686       |
| Real Sto Return     | -203      |
| Reward Loss         | -5.64e+06 |
| Running Env Steps   | 350000    |
| Running Forward KL  | 141       |
| Running Reverse KL  | 447       |
| Running Update Time | 70        |
-----------------------------------
--2024-08-11 08:35:53.208034 UTC---
| Itration            | 71        |
| PAGAR Loss          | 8.72e+04  |
| Real Det Return     | 765       |
| Real Sto Return     | -218      |
| Reward Loss         | -5.36e+06 |
| Running Env Steps   | 355000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 455       |
| Running Update Time | 71        |
-----------------------------------
--2024-08-11 08:38:46.277312 UTC---
| Itration            | 72        |
| PAGAR Loss          | -3.68e+04 |
| Real Det Return     | 790       |
| Real Sto Return     | -266      |
| Reward Loss         | -5.62e+06 |
| Running Env Steps   | 360000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 185       |
| Running Update Time | 72        |
-----------------------------------
--2024-08-11 08:41:38.235150 UTC---
| Itration            | 73        |
| PAGAR Loss          | 1.3e+05   |
| Real Det Return     | 657       |
| Real Sto Return     | -243      |
| Reward Loss         | -5.74e+06 |
| Running Env Steps   | 365000    |
| Running Forward KL  | 143       |
| Running Reverse KL  | 205       |
| Running Update Time | 73        |
-----------------------------------
--2024-08-11 08:44:32.087903 UTC---
| Itration            | 74        |
| PAGAR Loss          | 3.58e+05  |
| Real Det Return     | 715       |
| Real Sto Return     | -269      |
| Reward Loss         | -5.89e+06 |
| Running Env Steps   | 370000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 242       |
| Running Update Time | 74        |
-----------------------------------
--2024-08-11 08:47:25.047231 UTC---
| Itration            | 75        |
| PAGAR Loss          | -1.4e+06  |
| Real Det Return     | 700       |
| Real Sto Return     | -258      |
| Reward Loss         | -6.06e+06 |
| Running Env Steps   | 375000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 380       |
| Running Update Time | 75        |
-----------------------------------
--2024-08-11 08:50:18.660858 UTC--
| Itration            | 76       |
| PAGAR Loss          | 2.92e+04 |
| Real Det Return     | 789      |
| Real Sto Return     | -230     |
| Reward Loss         | -6.1e+06 |
| Running Env Steps   | 380000   |
| Running Forward KL  | 135      |
| Running Reverse KL  | 46.6     |
| Running Update Time | 76       |
----------------------------------
--2024-08-11 08:53:07.805897 UTC---
| Itration            | 77        |
| PAGAR Loss          | 2.92e+04  |
| Real Det Return     | 774       |
| Real Sto Return     | -194      |
| Reward Loss         | -6.29e+06 |
| Running Env Steps   | 385000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 45.7      |
| Running Update Time | 77        |
-----------------------------------
--2024-08-11 08:55:57.874330 UTC---
| Itration            | 78        |
| PAGAR Loss          | -5.71e+05 |
| Real Det Return     | 753       |
| Real Sto Return     | -227      |
| Reward Loss         | -6.59e+06 |
| Running Env Steps   | 390000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 503       |
| Running Update Time | 78        |
-----------------------------------
--2024-08-11 08:58:48.947455 UTC---
| Itration            | 79        |
| PAGAR Loss          | 5.61e+03  |
| Real Det Return     | 817       |
| Real Sto Return     | -250      |
| Reward Loss         | -6.25e+06 |
| Running Env Steps   | 395000    |
| Running Forward KL  | 136       |
| Running Reverse KL  | 399       |
| Running Update Time | 79        |
-----------------------------------
--2024-08-11 09:01:40.760365 UTC---
| Itration            | 80        |
| PAGAR Loss          | 3.36e+03  |
| Real Det Return     | 776       |
| Real Sto Return     | -227      |
| Reward Loss         | -6.52e+06 |
| Running Env Steps   | 400000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 273       |
| Running Update Time | 80        |
-----------------------------------
--2024-08-11 09:04:31.228205 UTC---
| Itration            | 81        |
| PAGAR Loss          | -2.29e+05 |
| Real Det Return     | 764       |
| Real Sto Return     | -233      |
| Reward Loss         | -6.64e+06 |
| Running Env Steps   | 405000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 1.09e+03  |
| Running Update Time | 81        |
-----------------------------------
--2024-08-11 09:07:22.040742 UTC---
| Itration            | 82        |
| PAGAR Loss          | 1.41e+04  |
| Real Det Return     | 779       |
| Real Sto Return     | -266      |
| Reward Loss         | -6.86e+06 |
| Running Env Steps   | 410000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 211       |
| Running Update Time | 82        |
-----------------------------------
--2024-08-11 09:10:12.564641 UTC---
| Itration            | 83        |
| PAGAR Loss          | nan       |
| Real Det Return     | 808       |
| Real Sto Return     | -218      |
| Reward Loss         | -6.66e+06 |
| Running Env Steps   | 415000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 641       |
| Running Update Time | 83        |
-----------------------------------
--2024-08-11 09:13:04.109093 UTC---
| Itration            | 84        |
| PAGAR Loss          | 3.55e+03  |
| Real Det Return     | 862       |
| Real Sto Return     | -219      |
| Reward Loss         | -7.12e+06 |
| Running Env Steps   | 420000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 45.4      |
| Running Update Time | 84        |
-----------------------------------
--2024-08-11 09:15:56.155109 UTC--
| Itration            | 85       |
| PAGAR Loss          | 1.19e+04 |
| Real Det Return     | 738      |
| Real Sto Return     | -262     |
| Reward Loss         | -7.2e+06 |
| Running Env Steps   | 425000   |
| Running Forward KL  | 138      |
| Running Reverse KL  | 275      |
| Running Update Time | 85       |
----------------------------------
--2024-08-11 09:18:48.491210 UTC--
| Itration            | 86       |
| PAGAR Loss          | 501      |
| Real Det Return     | 725      |
| Real Sto Return     | -228     |
| Reward Loss         | -7.1e+06 |
| Running Env Steps   | 430000   |
| Running Forward KL  | 136      |
| Running Reverse KL  | 277      |
| Running Update Time | 86       |
----------------------------------
--2024-08-11 09:21:39.902562 UTC---
| Itration            | 87        |
| PAGAR Loss          | nan       |
| Real Det Return     | 696       |
| Real Sto Return     | -210      |
| Reward Loss         | -7.21e+06 |
| Running Env Steps   | 435000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 271       |
| Running Update Time | 87        |
-----------------------------------
--2024-08-11 09:24:32.436416 UTC---
| Itration            | 88        |
| PAGAR Loss          | 2.71e+05  |
| Real Det Return     | 763       |
| Real Sto Return     | -216      |
| Reward Loss         | -7.46e+06 |
| Running Env Steps   | 440000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 179       |
| Running Update Time | 88        |
-----------------------------------
--2024-08-11 09:27:24.803820 UTC---
| Itration            | 89        |
| PAGAR Loss          | 5.84e+04  |
| Real Det Return     | 727       |
| Real Sto Return     | -210      |
| Reward Loss         | -7.48e+06 |
| Running Env Steps   | 445000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 199       |
| Running Update Time | 89        |
-----------------------------------
--2024-08-11 09:30:17.391400 UTC---
| Itration            | 90        |
| PAGAR Loss          | nan       |
| Real Det Return     | 742       |
| Real Sto Return     | -210      |
| Reward Loss         | -7.91e+06 |
| Running Env Steps   | 450000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 140       |
| Running Update Time | 90        |
-----------------------------------
--2024-08-11 09:33:10.383943 UTC---
| Itration            | 91        |
| PAGAR Loss          | -3.41e+04 |
| Real Det Return     | 606       |
| Real Sto Return     | -244      |
| Reward Loss         | -7.93e+06 |
| Running Env Steps   | 455000    |
| Running Forward KL  | 132       |
| Running Reverse KL  | 257       |
| Running Update Time | 91        |
-----------------------------------
--2024-08-11 09:36:02.094644 UTC---
| Itration            | 92        |
| PAGAR Loss          | -1.72e+03 |
| Real Det Return     | 564       |
| Real Sto Return     | -168      |
| Reward Loss         | -7.74e+06 |
| Running Env Steps   | 460000    |
| Running Forward KL  | 135       |
| Running Reverse KL  | 302       |
| Running Update Time | 92        |
-----------------------------------
--2024-08-11 09:38:54.131227 UTC---
| Itration            | 93        |
| PAGAR Loss          | -8.21e+03 |
| Real Det Return     | 782       |
| Real Sto Return     | -178      |
| Reward Loss         | -7.77e+06 |
| Running Env Steps   | 465000    |
| Running Forward KL  | 135       |
| Running Reverse KL  | 333       |
| Running Update Time | 93        |
-----------------------------------
--2024-08-11 09:41:47.191661 UTC---
| Itration            | 94        |
| PAGAR Loss          | 1.93e+05  |
| Real Det Return     | 803       |
| Real Sto Return     | -277      |
| Reward Loss         | -8.19e+06 |
| Running Env Steps   | 470000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 45.8      |
| Running Update Time | 94        |
-----------------------------------
--2024-08-11 09:44:41.211038 UTC---
| Itration            | 95        |
| PAGAR Loss          | -1.82e+03 |
| Real Det Return     | 715       |
| Real Sto Return     | -198      |
| Reward Loss         | -8.25e+06 |
| Running Env Steps   | 475000    |
| Running Forward KL  | 135       |
| Running Reverse KL  | 48.4      |
| Running Update Time | 95        |
-----------------------------------
--2024-08-11 09:47:35.025408 UTC---
| Itration            | 96        |
| PAGAR Loss          | 1.45e+06  |
| Real Det Return     | 583       |
| Real Sto Return     | -197      |
| Reward Loss         | -8.24e+06 |
| Running Env Steps   | 480000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 342       |
| Running Update Time | 96        |
-----------------------------------
--2024-08-11 09:50:26.210714 UTC--
| Itration            | 97       |
| PAGAR Loss          | nan      |
| Real Det Return     | 761      |
| Real Sto Return     | -197     |
| Reward Loss         | -8.1e+06 |
| Running Env Steps   | 485000   |
| Running Forward KL  | 133      |
| Running Reverse KL  | 1.02e+03 |
| Running Update Time | 97       |
----------------------------------
--2024-08-11 09:53:19.897771 UTC---
| Itration            | 98        |
| PAGAR Loss          | 5.2e+04   |
| Real Det Return     | 796       |
| Real Sto Return     | -177      |
| Reward Loss         | -8.71e+06 |
| Running Env Steps   | 490000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 208       |
| Running Update Time | 98        |
-----------------------------------
--2024-08-11 09:56:13.430641 UTC---
| Itration            | 99        |
| PAGAR Loss          | 1.64e+05  |
| Real Det Return     | 746       |
| Real Sto Return     | -216      |
| Reward Loss         | -8.39e+06 |
| Running Env Steps   | 495000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 604       |
| Running Update Time | 99        |
-----------------------------------
--2024-08-11 09:59:07.252611 UTC---
| Itration            | 100       |
| PAGAR Loss          | nan       |
| Real Det Return     | 765       |
| Real Sto Return     | -156      |
| Reward Loss         | -8.98e+06 |
| Running Env Steps   | 500000    |
| Running Forward KL  | 139       |
| Running Reverse KL  | 481       |
| Running Update Time | 100       |
-----------------------------------
--2024-08-11 10:02:00.293514 UTC---
| Itration            | 101       |
| PAGAR Loss          | -661      |
| Real Det Return     | 735       |
| Real Sto Return     | -208      |
| Reward Loss         | -9.26e+06 |
| Running Env Steps   | 505000    |
| Running Forward KL  | 138       |
| Running Reverse KL  | 487       |
| Running Update Time | 101       |
-----------------------------------
--2024-08-11 10:04:53.548555 UTC--
| Itration            | 102      |
| PAGAR Loss          | nan      |
| Real Det Return     | 757      |
| Real Sto Return     | -172     |
| Reward Loss         | -9.2e+06 |
| Running Env Steps   | 510000   |
| Running Forward KL  | 137      |
| Running Reverse KL  | 422      |
| Running Update Time | 102      |
----------------------------------
--2024-08-11 10:07:48.267445 UTC---
| Itration            | 103       |
| PAGAR Loss          | 5.26e+05  |
| Real Det Return     | 752       |
| Real Sto Return     | -261      |
| Reward Loss         | -8.93e+06 |
| Running Env Steps   | 515000    |
| Running Forward KL  | 134       |
| Running Reverse KL  | 263       |
| Running Update Time | 103       |
-----------------------------------
--2024-08-11 10:10:41.842971 UTC---
| Itration            | 104       |
| PAGAR Loss          | nan       |
| Real Det Return     | 720       |
| Real Sto Return     | -195      |
| Reward Loss         | -9.36e+06 |
| Running Env Steps   | 520000    |
| Running Forward KL  | 136       |
| Running Reverse KL  | 269       |
| Running Update Time | 104       |
-----------------------------------
--2024-08-11 10:13:35.613651 UTC---
| Itration            | 105       |
| PAGAR Loss          | -5.78e+04 |
| Real Det Return     | 613       |
| Real Sto Return     | -133      |
| Reward Loss         | -9.04e+06 |
| Running Env Steps   | 525000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 862       |
| Running Update Time | 105       |
-----------------------------------
--2024-08-11 10:16:29.887976 UTC---
| Itration            | 106       |
| PAGAR Loss          | 1.78e+06  |
| Real Det Return     | 776       |
| Real Sto Return     | -136      |
| Reward Loss         | -9.75e+06 |
| Running Env Steps   | 530000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 130       |
| Running Update Time | 106       |
-----------------------------------
--2024-08-11 10:19:23.666864 UTC---
| Itration            | 107       |
| PAGAR Loss          | -4.24e+03 |
| Real Det Return     | 747       |
| Real Sto Return     | -195      |
| Reward Loss         | -9.62e+06 |
| Running Env Steps   | 535000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 433       |
| Running Update Time | 107       |
-----------------------------------
--2024-08-11 10:22:17.772429 UTC---
| Itration            | 108       |
| PAGAR Loss          | nan       |
| Real Det Return     | 800       |
| Real Sto Return     | -132      |
| Reward Loss         | -9.69e+06 |
| Running Env Steps   | 540000    |
| Running Forward KL  | 133       |
| Running Reverse KL  | 938       |
| Running Update Time | 108       |
-----------------------------------
--2024-08-11 10:25:13.871516 UTC---
| Itration            | 109       |
| PAGAR Loss          | -5.91e+03 |
| Real Det Return     | 472       |
| Real Sto Return     | -203      |
| Reward Loss         | -9.63e+06 |
| Running Env Steps   | 545000    |
| Running Forward KL  | 129       |
| Running Reverse KL  | 77.5      |
| Running Update Time | 109       |
-----------------------------------
--2024-08-11 10:28:07.821386 UTC---
| Itration            | 110       |
| PAGAR Loss          | 1.14e+04  |
| Real Det Return     | 588       |
| Real Sto Return     | -170      |
| Reward Loss         | -9.76e+06 |
| Running Env Steps   | 550000    |
| Running Forward KL  | 137       |
| Running Reverse KL  | 947       |
| Running Update Time | 110       |
-----------------------------------
--2024-08-11 10:31:00.764101 UTC---
| Itration            | 111       |
| PAGAR Loss          | 1.2e+06   |
| Real Det Return     | 653       |
| Real Sto Return     | -125      |
| Reward Loss         | -1.03e+07 |
| Running Env Steps   | 555000    |
| Running Forward KL  | 132       |
| Running Reverse KL  | 43.9      |
| Running Update Time | 111       |
-----------------------------------
--2024-08-11 10:33:56.126867 UTC---
| Itration            | 112       |
| PAGAR Loss          | -1.29e+05 |
| Real Det Return     | 497       |
| Real Sto Return     | -96.7     |
| Reward Loss         | -9.42e+06 |
| Running Env Steps   | 560000    |
| Running Forward KL  | 129       |
| Running Reverse KL  | 819       |
| Running Update Time | 112       |
-----------------------------------
--2024-08-11 10:36:52.895984 UTC---
| Itration            | 113       |
| PAGAR Loss          | -1.66e+04 |
| Real Det Return     | 770       |
| Real Sto Return     | -205      |
| Reward Loss         | -1.04e+07 |
| Running Env Steps   | 565000    |
| Running Forward KL  | 130       |
| Running Reverse KL  | 468       |
| Running Update Time | 113       |
-----------------------------------
--2024-08-11 10:39:50.069674 UTC---
| Itration            | 114       |
| PAGAR Loss          | 1.87e+04  |
| Real Det Return     | 459       |
| Real Sto Return     | -78.7     |
| Reward Loss         | -1.02e+07 |
| Running Env Steps   | 570000    |
| Running Forward KL  | 131       |
| Running Reverse KL  | 287       |
| Running Update Time | 114       |
-----------------------------------
--2024-08-11 10:42:45.517863 UTC---
| Itration            | 115       |
| PAGAR Loss          | 3.67e+05  |
| Real Det Return     | 725       |
| Real Sto Return     | -82.1     |
| Reward Loss         | -1.04e+07 |
| Running Env Steps   | 575000    |
| Running Forward KL  | 125       |
| Running Reverse KL  | 221       |
| Running Update Time | 115       |
-----------------------------------
--2024-08-11 10:45:43.709848 UTC---
| Itration            | 116       |
| PAGAR Loss          | -3.83e+03 |
| Real Det Return     | 551       |
| Real Sto Return     | -102      |
| Reward Loss         | -1.06e+07 |
| Running Env Steps   | 580000    |
| Running Forward KL  | 129       |
| Running Reverse KL  | 238       |
| Running Update Time | 116       |
-----------------------------------
--2024-08-11 10:48:41.003797 UTC---
| Itration            | 117       |
| PAGAR Loss          | nan       |
| Real Det Return     | 745       |
| Real Sto Return     | -96       |
| Reward Loss         | -1.08e+07 |
| Running Env Steps   | 585000    |
| Running Forward KL  | 130       |
| Running Reverse KL  | 717       |
| Running Update Time | 117       |
-----------------------------------
--2024-08-11 10:51:39.149654 UTC---
| Itration            | 118       |
| PAGAR Loss          | 2.36e+05  |
| Real Det Return     | 553       |
| Real Sto Return     | -138      |
| Reward Loss         | -1.08e+07 |
| Running Env Steps   | 590000    |
| Running Forward KL  | 130       |
| Running Reverse KL  | 504       |
| Running Update Time | 118       |
-----------------------------------
--2024-08-11 10:54:37.507803 UTC---
| Itration            | 119       |
| PAGAR Loss          | 7.04e+07  |
| Real Det Return     | 418       |
| Real Sto Return     | -138      |
| Reward Loss         | -1.13e+07 |
| Running Env Steps   | 595000    |
| Running Forward KL  | 129       |
| Running Reverse KL  | 46.9      |
| Running Update Time | 119       |
-----------------------------------
--2024-08-11 10:57:32.967947 UTC---
| Itration            | 120       |
| PAGAR Loss          | nan       |
| Real Det Return     | 596       |
| Real Sto Return     | -106      |
| Reward Loss         | -1.14e+07 |
| Running Env Steps   | 600000    |
| Running Forward KL  | 128       |
| Running Reverse KL  | 897       |
| Running Update Time | 120       |
-----------------------------------
--2024-08-11 11:00:30.278997 UTC---
| Itration            | 121       |
| PAGAR Loss          | -4.02e+03 |
| Real Det Return     | 621       |
| Real Sto Return     | -63.8     |
| Reward Loss         | -1.1e+07  |
| Running Env Steps   | 605000    |
| Running Forward KL  | 126       |
| Running Reverse KL  | 958       |
| Running Update Time | 121       |
-----------------------------------
--2024-08-11 11:03:29.082714 UTC---
| Itration            | 122       |
| PAGAR Loss          | nan       |
| Real Det Return     | 662       |
| Real Sto Return     | -79.4     |
| Reward Loss         | -1.14e+07 |
| Running Env Steps   | 610000    |
| Running Forward KL  | 122       |
| Running Reverse KL  | 444       |
| Running Update Time | 122       |
-----------------------------------
--2024-08-11 11:06:29.435085 UTC---
| Itration            | 123       |
| PAGAR Loss          | -3.64e+03 |
| Real Det Return     | 463       |
| Real Sto Return     | 14.8      |
| Reward Loss         | -1.13e+07 |
| Running Env Steps   | 615000    |
| Running Forward KL  | 125       |
| Running Reverse KL  | 289       |
| Running Update Time | 123       |
-----------------------------------
--2024-08-11 11:09:27.759912 UTC---
| Itration            | 124       |
| PAGAR Loss          | -1.78e+05 |
| Real Det Return     | 547       |
| Real Sto Return     | 27.7      |
| Reward Loss         | -1.13e+07 |
| Running Env Steps   | 620000    |
| Running Forward KL  | 131       |
| Running Reverse KL  | 795       |
| Running Update Time | 124       |
-----------------------------------
--2024-08-11 11:12:25.908452 UTC---
| Itration            | 125       |
| PAGAR Loss          | -5.19e+06 |
| Real Det Return     | 522       |
| Real Sto Return     | -80       |
| Reward Loss         | -1.24e+07 |
| Running Env Steps   | 625000    |
| Running Forward KL  | 124       |
| Running Reverse KL  | 784       |
| Running Update Time | 125       |
-----------------------------------
--2024-08-11 11:15:30.487841 UTC---
| Itration            | 126       |
| PAGAR Loss          | nan       |
| Real Det Return     | 528       |
| Real Sto Return     | -104      |
| Reward Loss         | -1.16e+07 |
| Running Env Steps   | 630000    |
| Running Forward KL  | 122       |
| Running Reverse KL  | 847       |
| Running Update Time | 126       |
-----------------------------------
--2024-08-11 11:18:35.131683 UTC---
| Itration            | 127       |
| PAGAR Loss          | nan       |
| Real Det Return     | 162       |
| Real Sto Return     | 11        |
| Reward Loss         | -1.14e+07 |
| Running Env Steps   | 635000    |
| Running Forward KL  | 118       |
| Running Reverse KL  | 469       |
| Running Update Time | 127       |
-----------------------------------
--2024-08-11 11:21:38.936014 UTC---
| Itration            | 128       |
| PAGAR Loss          | 3.87e+04  |
| Real Det Return     | 586       |
| Real Sto Return     | -110      |
| Reward Loss         | -1.22e+07 |
| Running Env Steps   | 640000    |
| Running Forward KL  | 127       |
| Running Reverse KL  | 292       |
| Running Update Time | 128       |
-----------------------------------
--2024-08-11 11:24:41.566445 UTC---
| Itration            | 129       |
| PAGAR Loss          | 2.18e+04  |
| Real Det Return     | 438       |
| Real Sto Return     | -23.1     |
| Reward Loss         | -1.27e+07 |
| Running Env Steps   | 645000    |
| Running Forward KL  | 122       |
| Running Reverse KL  | 258       |
| Running Update Time | 129       |
-----------------------------------
--2024-08-11 11:27:48.034592 UTC---
| Itration            | 130       |
| PAGAR Loss          | -1.88e+04 |
| Real Det Return     | 493       |
| Real Sto Return     | 15.6      |
| Reward Loss         | -1.26e+07 |
| Running Env Steps   | 650000    |
| Running Forward KL  | 124       |
| Running Reverse KL  | 188       |
| Running Update Time | 130       |
-----------------------------------
--2024-08-11 11:30:53.183012 UTC---
| Itration            | 131       |
| PAGAR Loss          | -8e+04    |
| Real Det Return     | 344       |
| Real Sto Return     | 29.7      |
| Reward Loss         | -1.29e+07 |
| Running Env Steps   | 655000    |
| Running Forward KL  | 118       |
| Running Reverse KL  | 193       |
| Running Update Time | 131       |
-----------------------------------
--2024-08-11 11:33:59.639274 UTC---
| Itration            | 132       |
| PAGAR Loss          | -1.55e+06 |
| Real Det Return     | 298       |
| Real Sto Return     | 146       |
| Reward Loss         | -1.14e+07 |
| Running Env Steps   | 660000    |
| Running Forward KL  | 110       |
| Running Reverse KL  | 133       |
| Running Update Time | 132       |
-----------------------------------
--2024-08-11 11:37:07.283375 UTC---
| Itration            | 133       |
| PAGAR Loss          | -1.59e+05 |
| Real Det Return     | 332       |
| Real Sto Return     | 176       |
| Reward Loss         | -1.21e+07 |
| Running Env Steps   | 665000    |
| Running Forward KL  | 117       |
| Running Reverse KL  | 312       |
| Running Update Time | 133       |
-----------------------------------
--2024-08-11 11:40:09.165633 UTC---
| Itration            | 134       |
| PAGAR Loss          | 1.93e+05  |
| Real Det Return     | 439       |
| Real Sto Return     | 11.1      |
| Reward Loss         | -1.26e+07 |
| Running Env Steps   | 670000    |
| Running Forward KL  | 114       |
| Running Reverse KL  | 528       |
| Running Update Time | 134       |
-----------------------------------
--2024-08-11 11:43:14.872567 UTC---
| Itration            | 135       |
| PAGAR Loss          | -4.03e+04 |
| Real Det Return     | 96.8      |
| Real Sto Return     | 106       |
| Reward Loss         | -1.19e+07 |
| Running Env Steps   | 675000    |
| Running Forward KL  | 109       |
| Running Reverse KL  | 342       |
| Running Update Time | 135       |
-----------------------------------
--2024-08-11 11:46:22.946357 UTC---
| Itration            | 136       |
| PAGAR Loss          | -1.78e+04 |
| Real Det Return     | 374       |
| Real Sto Return     | 127       |
| Reward Loss         | -1.22e+07 |
| Running Env Steps   | 680000    |
| Running Forward KL  | 116       |
| Running Reverse KL  | 294       |
| Running Update Time | 136       |
-----------------------------------
--2024-08-11 11:49:28.064035 UTC--
| Itration            | 137      |
| PAGAR Loss          | 2.3e+05  |
| Real Det Return     | 544      |
| Real Sto Return     | 86.3     |
| Reward Loss         | -1.1e+07 |
| Running Env Steps   | 685000   |
| Running Forward KL  | 106      |
| Running Reverse KL  | 408      |
| Running Update Time | 137      |
----------------------------------
--2024-08-11 11:52:35.180419 UTC---
| Itration            | 138       |
| PAGAR Loss          | 2.95e+06  |
| Real Det Return     | 456       |
| Real Sto Return     | 198       |
| Reward Loss         | -1.17e+07 |
| Running Env Steps   | 690000    |
| Running Forward KL  | 107       |
| Running Reverse KL  | 553       |
| Running Update Time | 138       |
-----------------------------------
--2024-08-11 11:55:39.556836 UTC---
| Itration            | 139       |
| PAGAR Loss          | -2.93e+05 |
| Real Det Return     | 426       |
| Real Sto Return     | 261       |
| Reward Loss         | -1.17e+07 |
| Running Env Steps   | 695000    |
| Running Forward KL  | 106       |
| Running Reverse KL  | 576       |
| Running Update Time | 139       |
-----------------------------------
--2024-08-11 11:58:44.678062 UTC--
| Itration            | 140      |
| PAGAR Loss          | 4.5e+05  |
| Real Det Return     | 500      |
| Real Sto Return     | 356      |
| Reward Loss         | -1.2e+07 |
| Running Env Steps   | 700000   |
| Running Forward KL  | 101      |
| Running Reverse KL  | 435      |
| Running Update Time | 140      |
----------------------------------
--2024-08-11 12:01:50.850240 UTC---
| Itration            | 141       |
| PAGAR Loss          | nan       |
| Real Det Return     | 288       |
| Real Sto Return     | 216       |
| Reward Loss         | -1.27e+07 |
| Running Env Steps   | 705000    |
| Running Forward KL  | 104       |
| Running Reverse KL  | 786       |
| Running Update Time | 141       |
-----------------------------------
--2024-08-11 12:04:57.465350 UTC---
| Itration            | 142       |
| PAGAR Loss          | -9.4e+06  |
| Real Det Return     | 483       |
| Real Sto Return     | 252       |
| Reward Loss         | -1.32e+07 |
| Running Env Steps   | 710000    |
| Running Forward KL  | 97.9      |
| Running Reverse KL  | 231       |
| Running Update Time | 142       |
-----------------------------------
--2024-08-11 12:08:02.612218 UTC---
| Itration            | 143       |
| PAGAR Loss          | -1.8e+05  |
| Real Det Return     | 943       |
| Real Sto Return     | 271       |
| Reward Loss         | -1.19e+07 |
| Running Env Steps   | 715000    |
| Running Forward KL  | 99.2      |
| Running Reverse KL  | 830       |
| Running Update Time | 143       |
-----------------------------------
--2024-08-11 12:11:07.216657 UTC---
| Itration            | 144       |
| PAGAR Loss          | -4.27e+05 |
| Real Det Return     | 719       |
| Real Sto Return     | 34.5      |
| Reward Loss         | -1.23e+07 |
| Running Env Steps   | 720000    |
| Running Forward KL  | 100       |
| Running Reverse KL  | 846       |
| Running Update Time | 144       |
-----------------------------------
--2024-08-11 12:14:10.425878 UTC---
| Itration            | 145       |
| PAGAR Loss          | 2.11e+05  |
| Real Det Return     | 1.12e+03  |
| Real Sto Return     | 229       |
| Reward Loss         | -1.32e+07 |
| Running Env Steps   | 725000    |
| Running Forward KL  | 99.9      |
| Running Reverse KL  | 354       |
| Running Update Time | 145       |
-----------------------------------
--2024-08-11 12:17:14.910506 UTC---
| Itration            | 146       |
| PAGAR Loss          | -8.77e+04 |
| Real Det Return     | 1.31e+03  |
| Real Sto Return     | 315       |
| Reward Loss         | -1.06e+07 |
| Running Env Steps   | 730000    |
| Running Forward KL  | 94.8      |
| Running Reverse KL  | 879       |
| Running Update Time | 146       |
-----------------------------------
--2024-08-11 12:20:18.733934 UTC---
| Itration            | 147       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.76e+03  |
| Real Sto Return     | 352       |
| Reward Loss         | -1.12e+07 |
| Running Env Steps   | 735000    |
| Running Forward KL  | 98.1      |
| Running Reverse KL  | 1.45e+03  |
| Running Update Time | 147       |
-----------------------------------
--2024-08-11 12:23:25.717447 UTC---
| Itration            | 148       |
| PAGAR Loss          | -3.89e+04 |
| Real Det Return     | 994       |
| Real Sto Return     | 565       |
| Reward Loss         | -1.18e+07 |
| Running Env Steps   | 740000    |
| Running Forward KL  | 91        |
| Running Reverse KL  | 309       |
| Running Update Time | 148       |
-----------------------------------
--2024-08-11 12:26:32.122274 UTC---
| Itration            | 149       |
| PAGAR Loss          | 4.99e+06  |
| Real Det Return     | 2.07e+03  |
| Real Sto Return     | 278       |
| Reward Loss         | -1.22e+07 |
| Running Env Steps   | 745000    |
| Running Forward KL  | 92.2      |
| Running Reverse KL  | 771       |
| Running Update Time | 149       |
-----------------------------------
--2024-08-11 12:29:35.991600 UTC---
| Itration            | 150       |
| PAGAR Loss          | -4.84e+06 |
| Real Det Return     | 2.04e+03  |
| Real Sto Return     | 635       |
| Reward Loss         | -1.08e+07 |
| Running Env Steps   | 750000    |
| Running Forward KL  | 85.4      |
| Running Reverse KL  | 900       |
| Running Update Time | 150       |
-----------------------------------
--2024-08-11 12:32:41.478466 UTC---
| Itration            | 151       |
| PAGAR Loss          | -8.86e+04 |
| Real Det Return     | 1.86e+03  |
| Real Sto Return     | 466       |
| Reward Loss         | -1.27e+07 |
| Running Env Steps   | 755000    |
| Running Forward KL  | 86.8      |
| Running Reverse KL  | 244       |
| Running Update Time | 151       |
-----------------------------------
--2024-08-11 12:35:49.558803 UTC---
| Itration            | 152       |
| PAGAR Loss          | 3.22e+04  |
| Real Det Return     | 1.17e+03  |
| Real Sto Return     | 809       |
| Reward Loss         | -1.09e+07 |
| Running Env Steps   | 760000    |
| Running Forward KL  | 82.4      |
| Running Reverse KL  | 98.7      |
| Running Update Time | 152       |
-----------------------------------
--2024-08-11 12:38:56.859654 UTC---
| Itration            | 153       |
| PAGAR Loss          | 2.96e+04  |
| Real Det Return     | 2.18e+03  |
| Real Sto Return     | 877       |
| Reward Loss         | -1.26e+07 |
| Running Env Steps   | 765000    |
| Running Forward KL  | 86.8      |
| Running Reverse KL  | 297       |
| Running Update Time | 153       |
-----------------------------------
--2024-08-11 12:42:02.292376 UTC---
| Itration            | 154       |
| PAGAR Loss          | -1.25e+06 |
| Real Det Return     | 1.86e+03  |
| Real Sto Return     | 661       |
| Reward Loss         | -1.22e+07 |
| Running Env Steps   | 770000    |
| Running Forward KL  | 83.2      |
| Running Reverse KL  | 759       |
| Running Update Time | 154       |
-----------------------------------
--2024-08-11 12:45:08.441674 UTC--
| Itration            | 155      |
| PAGAR Loss          | nan      |
| Real Det Return     | 1.99e+03 |
| Real Sto Return     | 898      |
| Reward Loss         | -1.1e+07 |
| Running Env Steps   | 775000   |
| Running Forward KL  | 80       |
| Running Reverse KL  | 260      |
| Running Update Time | 155      |
----------------------------------
--2024-08-11 12:48:15.504126 UTC---
| Itration            | 156       |
| PAGAR Loss          | -4.74e+04 |
| Real Det Return     | 2.44e+03  |
| Real Sto Return     | 782       |
| Reward Loss         | -1.24e+07 |
| Running Env Steps   | 780000    |
| Running Forward KL  | 84.4      |
| Running Reverse KL  | 262       |
| Running Update Time | 156       |
-----------------------------------
--2024-08-11 12:51:23.082469 UTC---
| Itration            | 157       |
| PAGAR Loss          | 1.59e+06  |
| Real Det Return     | 1.95e+03  |
| Real Sto Return     | 1.21e+03  |
| Reward Loss         | -1.18e+07 |
| Running Env Steps   | 785000    |
| Running Forward KL  | 81        |
| Running Reverse KL  | 250       |
| Running Update Time | 157       |
-----------------------------------
--2024-08-11 12:54:30.505290 UTC--
| Itration            | 158      |
| PAGAR Loss          | 5.31e+04 |
| Real Det Return     | 1.96e+03 |
| Real Sto Return     | 1.27e+03 |
| Reward Loss         | -9.2e+06 |
| Running Env Steps   | 790000   |
| Running Forward KL  | 69.7     |
| Running Reverse KL  | 578      |
| Running Update Time | 158      |
----------------------------------
--2024-08-11 12:57:37.011766 UTC---
| Itration            | 159       |
| PAGAR Loss          | 1.7e+06   |
| Real Det Return     | 2.35e+03  |
| Real Sto Return     | 927       |
| Reward Loss         | -1.04e+07 |
| Running Env Steps   | 795000    |
| Running Forward KL  | 73.1      |
| Running Reverse KL  | 483       |
| Running Update Time | 159       |
-----------------------------------
--2024-08-11 13:00:42.338295 UTC---
| Itration            | 160       |
| PAGAR Loss          | 3.95e+04  |
| Real Det Return     | 2.31e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -9.91e+06 |
| Running Env Steps   | 800000    |
| Running Forward KL  | 68.8      |
| Running Reverse KL  | 480       |
| Running Update Time | 160       |
-----------------------------------
--2024-08-11 13:03:41.760332 UTC---
| Itration            | 161       |
| PAGAR Loss          | -1.83e+07 |
| Real Det Return     | 1.31e+03  |
| Real Sto Return     | 547       |
| Reward Loss         | -1.38e+07 |
| Running Env Steps   | 805000    |
| Running Forward KL  | 82.7      |
| Running Reverse KL  | 1.27e+03  |
| Running Update Time | 161       |
-----------------------------------
--2024-08-11 13:06:43.274781 UTC---
| Itration            | 162       |
| PAGAR Loss          | 4.43e+07  |
| Real Det Return     | 2.6e+03   |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -1.07e+07 |
| Running Env Steps   | 810000    |
| Running Forward KL  | 72.7      |
| Running Reverse KL  | 1.13e+03  |
| Running Update Time | 162       |
-----------------------------------
--2024-08-11 13:09:44.717809 UTC---
| Itration            | 163       |
| PAGAR Loss          | 6.02e+04  |
| Real Det Return     | 1.55e+03  |
| Real Sto Return     | 742       |
| Reward Loss         | -7.08e+06 |
| Running Env Steps   | 815000    |
| Running Forward KL  | 62.8      |
| Running Reverse KL  | 1.13e+03  |
| Running Update Time | 163       |
-----------------------------------
--2024-08-11 13:12:49.962273 UTC---
| Itration            | 164       |
| PAGAR Loss          | 2.39e+08  |
| Real Det Return     | 1.95e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -1.08e+07 |
| Running Env Steps   | 820000    |
| Running Forward KL  | 70.7      |
| Running Reverse KL  | 801       |
| Running Update Time | 164       |
-----------------------------------
--2024-08-11 13:15:52.777270 UTC---
| Itration            | 165       |
| PAGAR Loss          | -1.39e+07 |
| Real Det Return     | 2.75e+03  |
| Real Sto Return     | 1.35e+03  |
| Reward Loss         | -8.73e+06 |
| Running Env Steps   | 825000    |
| Running Forward KL  | 62.3      |
| Running Reverse KL  | 982       |
| Running Update Time | 165       |
-----------------------------------
--2024-08-11 13:19:01.087804 UTC---
| Itration            | 166       |
| PAGAR Loss          | 2.39e+06  |
| Real Det Return     | 2.56e+03  |
| Real Sto Return     | 1.68e+03  |
| Reward Loss         | -9.36e+06 |
| Running Env Steps   | 830000    |
| Running Forward KL  | 64.5      |
| Running Reverse KL  | 44.3      |
| Running Update Time | 166       |
-----------------------------------
--2024-08-11 13:22:07.868842 UTC--
| Itration            | 167      |
| PAGAR Loss          | 1.39e+07 |
| Real Det Return     | 2.86e+03 |
| Real Sto Return     | 1.81e+03 |
| Reward Loss         | -9.2e+06 |
| Running Env Steps   | 835000   |
| Running Forward KL  | 66.2     |
| Running Reverse KL  | 311      |
| Running Update Time | 167      |
----------------------------------
--2024-08-11 13:25:10.317489 UTC---
| Itration            | 168       |
| PAGAR Loss          | -5.73e+06 |
| Real Det Return     | 2.53e+03  |
| Real Sto Return     | 1.47e+03  |
| Reward Loss         | -8.38e+06 |
| Running Env Steps   | 840000    |
| Running Forward KL  | 57.7      |
| Running Reverse KL  | 748       |
| Running Update Time | 168       |
-----------------------------------
--2024-08-11 13:28:09.490976 UTC---
| Itration            | 169       |
| PAGAR Loss          | -7.13e+06 |
| Real Det Return     | 1.69e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -1.09e+07 |
| Running Env Steps   | 845000    |
| Running Forward KL  | 68.1      |
| Running Reverse KL  | 1.34e+03  |
| Running Update Time | 169       |
-----------------------------------
--2024-08-11 13:31:11.801059 UTC---
| Itration            | 170       |
| PAGAR Loss          | -1.97e+05 |
| Real Det Return     | 2.4e+03   |
| Real Sto Return     | 1.48e+03  |
| Reward Loss         | -1.05e+07 |
| Running Env Steps   | 850000    |
| Running Forward KL  | 60.5      |
| Running Reverse KL  | 695       |
| Running Update Time | 170       |
-----------------------------------
--2024-08-11 13:34:15.732654 UTC---
| Itration            | 171       |
| PAGAR Loss          | -8.51e+05 |
| Real Det Return     | 2.38e+03  |
| Real Sto Return     | 1.48e+03  |
| Reward Loss         | -9.52e+06 |
| Running Env Steps   | 855000    |
| Running Forward KL  | 56.7      |
| Running Reverse KL  | 665       |
| Running Update Time | 171       |
-----------------------------------
--2024-08-11 13:37:21.788927 UTC---
| Itration            | 172       |
| PAGAR Loss          | 1.53e+07  |
| Real Det Return     | 3.16e+03  |
| Real Sto Return     | 1.89e+03  |
| Reward Loss         | -8.09e+06 |
| Running Env Steps   | 860000    |
| Running Forward KL  | 60.1      |
| Running Reverse KL  | 755       |
| Running Update Time | 172       |
-----------------------------------
--2024-08-11 13:40:20.587724 UTC---
| Itration            | 173       |
| PAGAR Loss          | -5.53e+07 |
| Real Det Return     | 1.53e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -8.46e+06 |
| Running Env Steps   | 865000    |
| Running Forward KL  | 54.9      |
| Running Reverse KL  | 974       |
| Running Update Time | 173       |
-----------------------------------
--2024-08-11 13:43:22.008040 UTC---
| Itration            | 174       |
| PAGAR Loss          | 9.69e+07  |
| Real Det Return     | 2.75e+03  |
| Real Sto Return     | 1.44e+03  |
| Reward Loss         | -8.99e+06 |
| Running Env Steps   | 870000    |
| Running Forward KL  | 50.2      |
| Running Reverse KL  | 614       |
| Running Update Time | 174       |
-----------------------------------
--2024-08-11 13:46:31.846315 UTC---
| Itration            | 175       |
| PAGAR Loss          | -6.08e+04 |
| Real Det Return     | 2.74e+03  |
| Real Sto Return     | 1.86e+03  |
| Reward Loss         | -1.06e+07 |
| Running Env Steps   | 875000    |
| Running Forward KL  | 68.2      |
| Running Reverse KL  | 264       |
| Running Update Time | 175       |
-----------------------------------
--2024-08-11 13:49:31.395945 UTC---
| Itration            | 176       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.73e+03  |
| Real Sto Return     | 1.23e+03  |
| Reward Loss         | -1.03e+07 |
| Running Env Steps   | 880000    |
| Running Forward KL  | 64.9      |
| Running Reverse KL  | 1.35e+03  |
| Running Update Time | 176       |
-----------------------------------
--2024-08-11 13:52:31.097364 UTC---
| Itration            | 177       |
| PAGAR Loss          | 1.1e+08   |
| Real Det Return     | 2.78e+03  |
| Real Sto Return     | 1.6e+03   |
| Reward Loss         | -7.29e+06 |
| Running Env Steps   | 885000    |
| Running Forward KL  | 54        |
| Running Reverse KL  | 1.02e+03  |
| Running Update Time | 177       |
-----------------------------------
--2024-08-11 13:55:40.730889 UTC---
| Itration            | 178       |
| PAGAR Loss          | -5.4e+05  |
| Real Det Return     | 3.2e+03   |
| Real Sto Return     | 2.52e+03  |
| Reward Loss         | -8.83e+06 |
| Running Env Steps   | 890000    |
| Running Forward KL  | 56.9      |
| Running Reverse KL  | 147       |
| Running Update Time | 178       |
-----------------------------------
--2024-08-11 13:58:48.521347 UTC--
| Itration            | 179      |
| PAGAR Loss          | 1.37e+06 |
| Real Det Return     | 3.72e+03 |
| Real Sto Return     | 2.21e+03 |
| Reward Loss         | -7.3e+06 |
| Running Env Steps   | 895000   |
| Running Forward KL  | 47.3     |
| Running Reverse KL  | 24.3     |
| Running Update Time | 179      |
----------------------------------
--2024-08-11 14:01:56.727501 UTC---
| Itration            | 180       |
| PAGAR Loss          | 8.46e+06  |
| Real Det Return     | 3.59e+03  |
| Real Sto Return     | 2.45e+03  |
| Reward Loss         | -7.43e+06 |
| Running Env Steps   | 900000    |
| Running Forward KL  | 49.4      |
| Running Reverse KL  | 254       |
| Running Update Time | 180       |
-----------------------------------
--2024-08-11 14:04:57.646257 UTC---
| Itration            | 181       |
| PAGAR Loss          | -1.64e+08 |
| Real Det Return     | 2.17e+03  |
| Real Sto Return     | 1.8e+03   |
| Reward Loss         | -7.66e+06 |
| Running Env Steps   | 905000    |
| Running Forward KL  | 48.9      |
| Running Reverse KL  | 1.07e+03  |
| Running Update Time | 181       |
-----------------------------------
--2024-08-11 14:08:03.557719 UTC---
| Itration            | 182       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 2.49e+03  |
| Reward Loss         | -7.38e+06 |
| Running Env Steps   | 910000    |
| Running Forward KL  | 44.8      |
| Running Reverse KL  | 56.8      |
| Running Update Time | 182       |
-----------------------------------
--2024-08-11 14:11:11.459086 UTC---
| Itration            | 183       |
| PAGAR Loss          | 3.87e+06  |
| Real Det Return     | 4e+03     |
| Real Sto Return     | 3.33e+03  |
| Reward Loss         | -6.14e+06 |
| Running Env Steps   | 915000    |
| Running Forward KL  | 40.2      |
| Running Reverse KL  | 156       |
| Running Update Time | 183       |
-----------------------------------
--2024-08-11 14:14:12.823077 UTC---
| Itration            | 184       |
| PAGAR Loss          | -4.37e+05 |
| Real Det Return     | 2.63e+03  |
| Real Sto Return     | 2.02e+03  |
| Reward Loss         | -5.19e+06 |
| Running Env Steps   | 920000    |
| Running Forward KL  | 41.3      |
| Running Reverse KL  | 472       |
| Running Update Time | 184       |
-----------------------------------
--2024-08-11 14:17:21.962648 UTC--
| Itration            | 185      |
| PAGAR Loss          | 2.86e+05 |
| Real Det Return     | 4.01e+03 |
| Real Sto Return     | 3.21e+03 |
| Reward Loss         | -6.8e+06 |
| Running Env Steps   | 925000   |
| Running Forward KL  | 43.6     |
| Running Reverse KL  | 246      |
| Running Update Time | 185      |
----------------------------------
--2024-08-11 14:20:30.608921 UTC---
| Itration            | 186       |
| PAGAR Loss          | -1.12e+05 |
| Real Det Return     | 4.06e+03  |
| Real Sto Return     | 3.33e+03  |
| Reward Loss         | -6.45e+06 |
| Running Env Steps   | 930000    |
| Running Forward KL  | 39.2      |
| Running Reverse KL  | 566       |
| Running Update Time | 186       |
-----------------------------------
--2024-08-11 14:23:34.608392 UTC---
| Itration            | 187       |
| PAGAR Loss          | -4.05e+06 |
| Real Det Return     | 3.23e+03  |
| Real Sto Return     | 2.04e+03  |
| Reward Loss         | -8.12e+06 |
| Running Env Steps   | 935000    |
| Running Forward KL  | 45.5      |
| Running Reverse KL  | 1.29e+03  |
| Running Update Time | 187       |
-----------------------------------
--2024-08-11 14:26:43.373206 UTC---
| Itration            | 188       |
| PAGAR Loss          | -1.29e+06 |
| Real Det Return     | 3.9e+03   |
| Real Sto Return     | 2.9e+03   |
| Reward Loss         | -7.01e+06 |
| Running Env Steps   | 940000    |
| Running Forward KL  | 44.7      |
| Running Reverse KL  | 184       |
| Running Update Time | 188       |
-----------------------------------
--2024-08-11 14:29:50.358196 UTC---
| Itration            | 189       |
| PAGAR Loss          | -2.29e+06 |
| Real Det Return     | 4.04e+03  |
| Real Sto Return     | 2.28e+03  |
| Reward Loss         | -6.61e+06 |
| Running Env Steps   | 945000    |
| Running Forward KL  | 40.3      |
| Running Reverse KL  | 595       |
| Running Update Time | 189       |
-----------------------------------
--2024-08-11 14:32:56.888618 UTC---
| Itration            | 190       |
| PAGAR Loss          | 2.61e+06  |
| Real Det Return     | 3.94e+03  |
| Real Sto Return     | 2.97e+03  |
| Reward Loss         | -4.81e+06 |
| Running Env Steps   | 950000    |
| Running Forward KL  | 33        |
| Running Reverse KL  | 228       |
| Running Update Time | 190       |
-----------------------------------
--2024-08-11 14:36:01.778496 UTC---
| Itration            | 191       |
| PAGAR Loss          | 6.56e+06  |
| Real Det Return     | 3.46e+03  |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -4.71e+06 |
| Running Env Steps   | 955000    |
| Running Forward KL  | 38.5      |
| Running Reverse KL  | 531       |
| Running Update Time | 191       |
-----------------------------------
--2024-08-11 14:39:05.179478 UTC---
| Itration            | 192       |
| PAGAR Loss          | -2.34e+06 |
| Real Det Return     | 3.22e+03  |
| Real Sto Return     | 1.98e+03  |
| Reward Loss         | -6.25e+06 |
| Running Env Steps   | 960000    |
| Running Forward KL  | 42.1      |
| Running Reverse KL  | 704       |
| Running Update Time | 192       |
-----------------------------------
--2024-08-11 14:42:14.961185 UTC---
| Itration            | 193       |
| PAGAR Loss          | 1.3e+06   |
| Real Det Return     | 4.11e+03  |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -4.64e+06 |
| Running Env Steps   | 965000    |
| Running Forward KL  | 32.7      |
| Running Reverse KL  | 45.2      |
| Running Update Time | 193       |
-----------------------------------
--2024-08-11 14:45:18.390968 UTC---
| Itration            | 194       |
| PAGAR Loss          | 3.98e+06  |
| Real Det Return     | 3.37e+03  |
| Real Sto Return     | 2.11e+03  |
| Reward Loss         | -6.69e+06 |
| Running Env Steps   | 970000    |
| Running Forward KL  | 38.7      |
| Running Reverse KL  | 1.08e+03  |
| Running Update Time | 194       |
-----------------------------------
--2024-08-11 14:48:25.119401 UTC---
| Itration            | 195       |
| PAGAR Loss          | -3.55e+08 |
| Real Det Return     | 4.04e+03  |
| Real Sto Return     | 3.68e+03  |
| Reward Loss         | -5.24e+06 |
| Running Env Steps   | 975000    |
| Running Forward KL  | 34.8      |
| Running Reverse KL  | 540       |
| Running Update Time | 195       |
-----------------------------------
--2024-08-11 14:51:31.601413 UTC---
| Itration            | 196       |
| PAGAR Loss          | 6.81e+06  |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 3.01e+03  |
| Reward Loss         | -5.13e+06 |
| Running Env Steps   | 980000    |
| Running Forward KL  | 33.6      |
| Running Reverse KL  | 61.5      |
| Running Update Time | 196       |
-----------------------------------
--2024-08-11 14:54:37.576641 UTC---
| Itration            | 197       |
| PAGAR Loss          | 1.23e+07  |
| Real Det Return     | 4.25e+03  |
| Real Sto Return     | 3.33e+03  |
| Reward Loss         | -4.79e+06 |
| Running Env Steps   | 985000    |
| Running Forward KL  | 35.4      |
| Running Reverse KL  | 844       |
| Running Update Time | 197       |
-----------------------------------
--2024-08-11 14:57:47.042082 UTC---
| Itration            | 198       |
| PAGAR Loss          | 2.41e+07  |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 3.79e+03  |
| Reward Loss         | -4.73e+06 |
| Running Env Steps   | 990000    |
| Running Forward KL  | 34.7      |
| Running Reverse KL  | 29.5      |
| Running Update Time | 198       |
-----------------------------------
--2024-08-11 15:00:55.499974 UTC---
| Itration            | 199       |
| PAGAR Loss          | 1.18e+07  |
| Real Det Return     | 4.38e+03  |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -5.51e+06 |
| Running Env Steps   | 995000    |
| Running Forward KL  | 32.2      |
| Running Reverse KL  | 362       |
| Running Update Time | 199       |
-----------------------------------
--2024-08-11 15:04:04.363167 UTC---
| Itration            | 200       |
| PAGAR Loss          | -8.48e+06 |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 3.65e+03  |
| Reward Loss         | -3.85e+06 |
| Running Env Steps   | 1000000   |
| Running Forward KL  | 28.7      |
| Running Reverse KL  | 97.4      |
| Running Update Time | 200       |
-----------------------------------
--2024-08-11 15:07:11.320365 UTC---
| Itration            | 201       |
| PAGAR Loss          | -2.64e+07 |
| Real Det Return     | 3.59e+03  |
| Real Sto Return     | 3.33e+03  |
| Reward Loss         | -5.56e+06 |
| Running Env Steps   | 1005000   |
| Running Forward KL  | 34.9      |
| Running Reverse KL  | 575       |
| Running Update Time | 201       |
-----------------------------------
--2024-08-11 15:10:15.867426 UTC---
| Itration            | 202       |
| PAGAR Loss          | 4.22e+07  |
| Real Det Return     | 3.55e+03  |
| Real Sto Return     | 2.74e+03  |
| Reward Loss         | -7.21e+06 |
| Running Env Steps   | 1010000   |
| Running Forward KL  | 45.7      |
| Running Reverse KL  | 1.65e+03  |
| Running Update Time | 202       |
-----------------------------------
--2024-08-11 15:13:23.093233 UTC---
| Itration            | 203       |
| PAGAR Loss          | 3.54e+06  |
| Real Det Return     | 4.31e+03  |
| Real Sto Return     | 3.14e+03  |
| Reward Loss         | -4.45e+06 |
| Running Env Steps   | 1015000   |
| Running Forward KL  | 30.4      |
| Running Reverse KL  | 155       |
| Running Update Time | 203       |
-----------------------------------
--2024-08-11 15:16:34.015023 UTC---
| Itration            | 204       |
| PAGAR Loss          | 1.4e+07   |
| Real Det Return     | 4.23e+03  |
| Real Sto Return     | 3.81e+03  |
| Reward Loss         | -5.24e+06 |
| Running Env Steps   | 1020000   |
| Running Forward KL  | 34.6      |
| Running Reverse KL  | 25.5      |
| Running Update Time | 204       |
-----------------------------------
--2024-08-11 15:19:42.814285 UTC---
| Itration            | 205       |
| PAGAR Loss          | 1.86e+07  |
| Real Det Return     | 4.54e+03  |
| Real Sto Return     | 3.88e+03  |
| Reward Loss         | -5.19e+06 |
| Running Env Steps   | 1025000   |
| Running Forward KL  | 36        |
| Running Reverse KL  | 25.3      |
| Running Update Time | 205       |
-----------------------------------
--2024-08-11 15:22:52.802703 UTC---
| Itration            | 206       |
| PAGAR Loss          | 1.06e+07  |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.22e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1030000   |
| Running Forward KL  | 27.8      |
| Running Reverse KL  | 15.9      |
| Running Update Time | 206       |
-----------------------------------
--2024-08-11 15:26:02.923381 UTC---
| Itration            | 207       |
| PAGAR Loss          | 1.38e+07  |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -5.12e+06 |
| Running Env Steps   | 1035000   |
| Running Forward KL  | 31        |
| Running Reverse KL  | 221       |
| Running Update Time | 207       |
-----------------------------------
--2024-08-11 15:29:13.167236 UTC---
| Itration            | 208       |
| PAGAR Loss          | 6.68e+07  |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 3.39e+03  |
| Reward Loss         | -5.63e+06 |
| Running Env Steps   | 1040000   |
| Running Forward KL  | 35.9      |
| Running Reverse KL  | 187       |
| Running Update Time | 208       |
-----------------------------------
--2024-08-11 15:32:24.566916 UTC---
| Itration            | 209       |
| PAGAR Loss          | 2.16e+07  |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1045000   |
| Running Forward KL  | 28.7      |
| Running Reverse KL  | 20.3      |
| Running Update Time | 209       |
-----------------------------------
--2024-08-11 15:35:36.525773 UTC---
| Itration            | 210       |
| PAGAR Loss          | 4.09e+07  |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | -5.09e+06 |
| Running Env Steps   | 1050000   |
| Running Forward KL  | 34.2      |
| Running Reverse KL  | 260       |
| Running Update Time | 210       |
-----------------------------------
--2024-08-11 15:38:46.776309 UTC---
| Itration            | 211       |
| PAGAR Loss          | 2.3e+06   |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 3.47e+03  |
| Reward Loss         | -3.59e+06 |
| Running Env Steps   | 1055000   |
| Running Forward KL  | 26.8      |
| Running Reverse KL  | 80.3      |
| Running Update Time | 211       |
-----------------------------------
--2024-08-11 15:41:58.161420 UTC---
| Itration            | 212       |
| PAGAR Loss          | -1.37e+07 |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 3.82e+03  |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 1060000   |
| Running Forward KL  | 30.9      |
| Running Reverse KL  | 394       |
| Running Update Time | 212       |
-----------------------------------
--2024-08-11 15:45:08.867978 UTC---
| Itration            | 213       |
| PAGAR Loss          | 1.81e+07  |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 3.81e+03  |
| Reward Loss         | -6.26e+06 |
| Running Env Steps   | 1065000   |
| Running Forward KL  | 34.6      |
| Running Reverse KL  | 272       |
| Running Update Time | 213       |
-----------------------------------
--2024-08-11 15:48:20.270426 UTC---
| Itration            | 214       |
| PAGAR Loss          | 3.15e+07  |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -4.84e+06 |
| Running Env Steps   | 1070000   |
| Running Forward KL  | 31.7      |
| Running Reverse KL  | 25.1      |
| Running Update Time | 214       |
-----------------------------------
--2024-08-11 15:51:29.923777 UTC---
| Itration            | 215       |
| PAGAR Loss          | 1.03e+07  |
| Real Det Return     | 4.68e+03  |
| Real Sto Return     | 3.21e+03  |
| Reward Loss         | -4.62e+06 |
| Running Env Steps   | 1075000   |
| Running Forward KL  | 32.8      |
| Running Reverse KL  | 27.7      |
| Running Update Time | 215       |
-----------------------------------
--2024-08-11 15:54:41.451162 UTC--
| Itration            | 216      |
| PAGAR Loss          | 4.4e+07  |
| Real Det Return     | 4.95e+03 |
| Real Sto Return     | 4.29e+03 |
| Reward Loss         | -3.5e+06 |
| Running Env Steps   | 1080000  |
| Running Forward KL  | 27.3     |
| Running Reverse KL  | 16.5     |
| Running Update Time | 216      |
----------------------------------
--2024-08-11 15:57:44.021143 UTC---
| Itration            | 217       |
| PAGAR Loss          | 3.55e+07  |
| Real Det Return     | 2.74e+03  |
| Real Sto Return     | 2.35e+03  |
| Reward Loss         | -5.62e+06 |
| Running Env Steps   | 1085000   |
| Running Forward KL  | 40        |
| Running Reverse KL  | 1.11e+03  |
| Running Update Time | 217       |
-----------------------------------
--2024-08-11 16:00:52.362201 UTC---
| Itration            | 218       |
| PAGAR Loss          | -8.68e+06 |
| Real Det Return     | 4.45e+03  |
| Real Sto Return     | 3.42e+03  |
| Reward Loss         | -6.34e+06 |
| Running Env Steps   | 1090000   |
| Running Forward KL  | 36.5      |
| Running Reverse KL  | 984       |
| Running Update Time | 218       |
-----------------------------------
--2024-08-11 16:04:00.874617 UTC---
| Itration            | 219       |
| PAGAR Loss          | 1.56e+07  |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 3.23e+03  |
| Reward Loss         | -5.94e+06 |
| Running Env Steps   | 1095000   |
| Running Forward KL  | 32.4      |
| Running Reverse KL  | 556       |
| Running Update Time | 219       |
-----------------------------------
--2024-08-11 16:07:12.007779 UTC---
| Itration            | 220       |
| PAGAR Loss          | 2.07e+07  |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -4.76e+06 |
| Running Env Steps   | 1100000   |
| Running Forward KL  | 31.5      |
| Running Reverse KL  | 273       |
| Running Update Time | 220       |
-----------------------------------
--2024-08-11 16:10:21.517909 UTC---
| Itration            | 221       |
| PAGAR Loss          | 5.77e+06  |
| Real Det Return     | 4.55e+03  |
| Real Sto Return     | 3.4e+03   |
| Reward Loss         | -4.88e+06 |
| Running Env Steps   | 1105000   |
| Running Forward KL  | 31.4      |
| Running Reverse KL  | 289       |
| Running Update Time | 221       |
-----------------------------------
--2024-08-11 16:13:28.171883 UTC---
| Itration            | 222       |
| PAGAR Loss          | -1.31e+08 |
| Real Det Return     | 4.1e+03   |
| Real Sto Return     | 3.08e+03  |
| Reward Loss         | -5.15e+06 |
| Running Env Steps   | 1110000   |
| Running Forward KL  | 37.8      |
| Running Reverse KL  | 1.12e+03  |
| Running Update Time | 222       |
-----------------------------------
--2024-08-11 16:16:36.699801 UTC---
| Itration            | 223       |
| PAGAR Loss          | 4.51e+07  |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 3.39e+03  |
| Reward Loss         | -5.73e+06 |
| Running Env Steps   | 1115000   |
| Running Forward KL  | 33.2      |
| Running Reverse KL  | 288       |
| Running Update Time | 223       |
-----------------------------------
--2024-08-11 16:19:47.815128 UTC---
| Itration            | 224       |
| PAGAR Loss          | 1.85e+07  |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -5.17e+06 |
| Running Env Steps   | 1120000   |
| Running Forward KL  | 32.8      |
| Running Reverse KL  | 482       |
| Running Update Time | 224       |
-----------------------------------
--2024-08-11 16:22:58.878994 UTC---
| Itration            | 225       |
| PAGAR Loss          | -1.48e+05 |
| Real Det Return     | 4.69e+03  |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 1125000   |
| Running Forward KL  | 28.5      |
| Running Reverse KL  | 269       |
| Running Update Time | 225       |
-----------------------------------
--2024-08-11 16:26:00.892265 UTC--
| Itration            | 226      |
| PAGAR Loss          | nan      |
| Real Det Return     | 2.65e+03 |
| Real Sto Return     | 2.36e+03 |
| Reward Loss         | -6.9e+06 |
| Running Env Steps   | 1130000  |
| Running Forward KL  | 43.3     |
| Running Reverse KL  | 1.63e+03 |
| Running Update Time | 226      |
----------------------------------
--2024-08-11 16:29:09.658969 UTC---
| Itration            | 227       |
| PAGAR Loss          | 3.54e+07  |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 3.62e+03  |
| Reward Loss         | -5.23e+06 |
| Running Env Steps   | 1135000   |
| Running Forward KL  | 32.3      |
| Running Reverse KL  | 140       |
| Running Update Time | 227       |
-----------------------------------
--2024-08-11 16:32:20.265958 UTC---
| Itration            | 228       |
| PAGAR Loss          | 1.03e+08  |
| Real Det Return     | 4.61e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -5.43e+06 |
| Running Env Steps   | 1140000   |
| Running Forward KL  | 31.4      |
| Running Reverse KL  | 267       |
| Running Update Time | 228       |
-----------------------------------
--2024-08-11 16:35:30.136148 UTC---
| Itration            | 229       |
| PAGAR Loss          | 9.46e+06  |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 3.79e+03  |
| Reward Loss         | -4.28e+06 |
| Running Env Steps   | 1145000   |
| Running Forward KL  | 28.7      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 229       |
-----------------------------------
--2024-08-11 16:38:33.082036 UTC---
| Itration            | 230       |
| PAGAR Loss          | 6.18e+05  |
| Real Det Return     | 4.3e+03   |
| Real Sto Return     | 3.45e+03  |
| Reward Loss         | -5.55e+06 |
| Running Env Steps   | 1150000   |
| Running Forward KL  | 29.7      |
| Running Reverse KL  | 545       |
| Running Update Time | 230       |
-----------------------------------
--2024-08-11 16:41:38.119011 UTC---
| Itration            | 231       |
| PAGAR Loss          | -7.61e+07 |
| Real Det Return     | 4.02e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -4.82e+06 |
| Running Env Steps   | 1155000   |
| Running Forward KL  | 35.5      |
| Running Reverse KL  | 339       |
| Running Update Time | 231       |
-----------------------------------
--2024-08-11 16:44:45.014354 UTC---
| Itration            | 232       |
| PAGAR Loss          | 2.18e+05  |
| Real Det Return     | 4.55e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -4.55e+06 |
| Running Env Steps   | 1160000   |
| Running Forward KL  | 30.9      |
| Running Reverse KL  | 26.7      |
| Running Update Time | 232       |
-----------------------------------
--2024-08-11 16:47:51.603743 UTC---
| Itration            | 233       |
| PAGAR Loss          | 6.71e+07  |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -3.85e+06 |
| Running Env Steps   | 1165000   |
| Running Forward KL  | 30.8      |
| Running Reverse KL  | 18        |
| Running Update Time | 233       |
-----------------------------------
--2024-08-11 16:50:57.654289 UTC--
| Itration            | 234      |
| PAGAR Loss          | 2.86e+09 |
| Real Det Return     | 4.5e+03  |
| Real Sto Return     | 4.35e+03 |
| Reward Loss         | -4.1e+06 |
| Running Env Steps   | 1170000  |
| Running Forward KL  | 30.9     |
| Running Reverse KL  | 197      |
| Running Update Time | 234      |
----------------------------------
--2024-08-11 16:54:03.698388 UTC---
| Itration            | 235       |
| PAGAR Loss          | 2.3e+07   |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 3.88e+03  |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 1175000   |
| Running Forward KL  | 33.9      |
| Running Reverse KL  | 25.5      |
| Running Update Time | 235       |
-----------------------------------
--2024-08-11 16:57:09.476131 UTC---
| Itration            | 236       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -5.75e+06 |
| Running Env Steps   | 1180000   |
| Running Forward KL  | 30.6      |
| Running Reverse KL  | 485       |
| Running Update Time | 236       |
-----------------------------------
--2024-08-11 17:00:16.049535 UTC---
| Itration            | 237       |
| PAGAR Loss          | 1.29e+07  |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -4.37e+06 |
| Running Env Steps   | 1185000   |
| Running Forward KL  | 35.1      |
| Running Reverse KL  | 679       |
| Running Update Time | 237       |
-----------------------------------
--2024-08-11 17:03:19.865271 UTC---
| Itration            | 238       |
| PAGAR Loss          | -5.26e+07 |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -4.9e+06  |
| Running Env Steps   | 1190000   |
| Running Forward KL  | 32.2      |
| Running Reverse KL  | 302       |
| Running Update Time | 238       |
-----------------------------------
--2024-08-11 17:06:25.836241 UTC---
| Itration            | 239       |
| PAGAR Loss          | 3.32e+07  |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -4.74e+06 |
| Running Env Steps   | 1195000   |
| Running Forward KL  | 31.8      |
| Running Reverse KL  | 20.3      |
| Running Update Time | 239       |
-----------------------------------
--2024-08-11 17:09:29.782843 UTC---
| Itration            | 240       |
| PAGAR Loss          | 8.37e+06  |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -4.71e+06 |
| Running Env Steps   | 1200000   |
| Running Forward KL  | 31.1      |
| Running Reverse KL  | 354       |
| Running Update Time | 240       |
-----------------------------------
--2024-08-11 17:12:34.869004 UTC---
| Itration            | 241       |
| PAGAR Loss          | 3.95e+06  |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 1205000   |
| Running Forward KL  | 30        |
| Running Reverse KL  | 158       |
| Running Update Time | 241       |
-----------------------------------
--2024-08-11 17:15:40.842949 UTC---
| Itration            | 242       |
| PAGAR Loss          | 1.21e+08  |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 3.63e+03  |
| Reward Loss         | -3.78e+06 |
| Running Env Steps   | 1210000   |
| Running Forward KL  | 29.8      |
| Running Reverse KL  | 198       |
| Running Update Time | 242       |
-----------------------------------
--2024-08-11 17:18:45.234214 UTC---
| Itration            | 243       |
| PAGAR Loss          | 3.84e+07  |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 3.44e+03  |
| Reward Loss         | -4.34e+06 |
| Running Env Steps   | 1215000   |
| Running Forward KL  | 34.7      |
| Running Reverse KL  | 346       |
| Running Update Time | 243       |
-----------------------------------
--2024-08-11 17:21:51.691467 UTC---
| Itration            | 244       |
| PAGAR Loss          | 6.21e+06  |
| Real Det Return     | 4.16e+03  |
| Real Sto Return     | 4.02e+03  |
| Reward Loss         | -4.23e+06 |
| Running Env Steps   | 1220000   |
| Running Forward KL  | 32.7      |
| Running Reverse KL  | 30.7      |
| Running Update Time | 244       |
-----------------------------------
--2024-08-11 17:24:54.428074 UTC---
| Itration            | 245       |
| PAGAR Loss          | -3.34e+06 |
| Real Det Return     | 4.27e+03  |
| Real Sto Return     | 3.31e+03  |
| Reward Loss         | -9.06e+06 |
| Running Env Steps   | 1225000   |
| Running Forward KL  | 40.1      |
| Running Reverse KL  | 1.31e+03  |
| Running Update Time | 245       |
-----------------------------------
--2024-08-11 17:28:00.477082 UTC---
| Itration            | 246       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -4.86e+06 |
| Running Env Steps   | 1230000   |
| Running Forward KL  | 30.5      |
| Running Reverse KL  | 268       |
| Running Update Time | 246       |
-----------------------------------
--2024-08-11 17:31:07.268946 UTC---
| Itration            | 247       |
| PAGAR Loss          | -2.65e+09 |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 1235000   |
| Running Forward KL  | 30.7      |
| Running Reverse KL  | 209       |
| Running Update Time | 247       |
-----------------------------------
--2024-08-11 17:34:12.269514 UTC---
| Itration            | 248       |
| PAGAR Loss          | 2.01e+07  |
| Real Det Return     | 4.42e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -5.99e+06 |
| Running Env Steps   | 1240000   |
| Running Forward KL  | 32.1      |
| Running Reverse KL  | 436       |
| Running Update Time | 248       |
-----------------------------------
--2024-08-11 17:37:18.666039 UTC---
| Itration            | 249       |
| PAGAR Loss          | 5.32e+06  |
| Real Det Return     | 4.68e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -4.12e+06 |
| Running Env Steps   | 1245000   |
| Running Forward KL  | 27.5      |
| Running Reverse KL  | 144       |
| Running Update Time | 249       |
-----------------------------------
--2024-08-11 17:40:24.714804 UTC---
| Itration            | 250       |
| PAGAR Loss          | 4.77e+06  |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 3.76e+03  |
| Reward Loss         | -4.83e+06 |
| Running Env Steps   | 1250000   |
| Running Forward KL  | 28        |
| Running Reverse KL  | 219       |
| Running Update Time | 250       |
-----------------------------------
--2024-08-11 17:43:29.407058 UTC---
| Itration            | 251       |
| PAGAR Loss          | 5.38e+08  |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 3.9e+03   |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1255000   |
| Running Forward KL  | 31.9      |
| Running Reverse KL  | 658       |
| Running Update Time | 251       |
-----------------------------------
--2024-08-11 17:46:35.194334 UTC---
| Itration            | 252       |
| PAGAR Loss          | 1.48e+07  |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -3.77e+06 |
| Running Env Steps   | 1260000   |
| Running Forward KL  | 29.5      |
| Running Reverse KL  | 601       |
| Running Update Time | 252       |
-----------------------------------
--2024-08-11 17:49:40.391199 UTC---
| Itration            | 253       |
| PAGAR Loss          | 8.21e+07  |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 3.9e+03   |
| Reward Loss         | -3.23e+06 |
| Running Env Steps   | 1265000   |
| Running Forward KL  | 34.1      |
| Running Reverse KL  | 688       |
| Running Update Time | 253       |
-----------------------------------
--2024-08-11 17:52:44.764847 UTC---
| Itration            | 254       |
| PAGAR Loss          | -2.24e+05 |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -3.63e+06 |
| Running Env Steps   | 1270000   |
| Running Forward KL  | 29.1      |
| Running Reverse KL  | 554       |
| Running Update Time | 254       |
-----------------------------------
--2024-08-11 17:55:50.639252 UTC---
| Itration            | 255       |
| PAGAR Loss          | -3.08e+08 |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 4.11e+03  |
| Reward Loss         | -5.36e+06 |
| Running Env Steps   | 1275000   |
| Running Forward KL  | 33.1      |
| Running Reverse KL  | 736       |
| Running Update Time | 255       |
-----------------------------------
--2024-08-11 17:58:57.369858 UTC---
| Itration            | 256       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -2.54e+06 |
| Running Env Steps   | 1280000   |
| Running Forward KL  | 23.5      |
| Running Reverse KL  | 13.2      |
| Running Update Time | 256       |
-----------------------------------
--2024-08-11 18:02:01.065731 UTC---
| Itration            | 257       |
| PAGAR Loss          | -3.67e+06 |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 3.99e+03  |
| Reward Loss         | -6.57e+06 |
| Running Env Steps   | 1285000   |
| Running Forward KL  | 28.3      |
| Running Reverse KL  | 729       |
| Running Update Time | 257       |
-----------------------------------
--2024-08-11 18:05:05.555964 UTC---
| Itration            | 258       |
| PAGAR Loss          | 9.95e+07  |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 3.65e+03  |
| Reward Loss         | -4.39e+06 |
| Running Env Steps   | 1290000   |
| Running Forward KL  | 30.6      |
| Running Reverse KL  | 514       |
| Running Update Time | 258       |
-----------------------------------
--2024-08-11 18:08:11.470113 UTC---
| Itration            | 259       |
| PAGAR Loss          | -1.87e+07 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 3.73e+03  |
| Reward Loss         | -3.89e+06 |
| Running Env Steps   | 1295000   |
| Running Forward KL  | 23.3      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 259       |
-----------------------------------
--2024-08-11 18:11:19.035225 UTC---
| Itration            | 260       |
| PAGAR Loss          | 1.16e+10  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -3.74e+06 |
| Running Env Steps   | 1300000   |
| Running Forward KL  | 25.4      |
| Running Reverse KL  | 16.4      |
| Running Update Time | 260       |
-----------------------------------
--2024-08-11 18:14:25.545931 UTC---
| Itration            | 261       |
| PAGAR Loss          | 7.85e+07  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -3.68e+06 |
| Running Env Steps   | 1305000   |
| Running Forward KL  | 27.9      |
| Running Reverse KL  | 253       |
| Running Update Time | 261       |
-----------------------------------
--2024-08-11 18:17:28.908566 UTC---
| Itration            | 262       |
| PAGAR Loss          | 3.33e+07  |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 2.88e+03  |
| Reward Loss         | -4.64e+06 |
| Running Env Steps   | 1310000   |
| Running Forward KL  | 30.7      |
| Running Reverse KL  | 622       |
| Running Update Time | 262       |
-----------------------------------
--2024-08-11 18:20:34.957916 UTC---
| Itration            | 263       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.44e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -4.79e+06 |
| Running Env Steps   | 1315000   |
| Running Forward KL  | 26.2      |
| Running Reverse KL  | 202       |
| Running Update Time | 263       |
-----------------------------------
--2024-08-11 18:23:40.796213 UTC---
| Itration            | 264       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | -3.15e+06 |
| Running Env Steps   | 1320000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 292       |
| Running Update Time | 264       |
-----------------------------------
--2024-08-11 18:26:46.128525 UTC---
| Itration            | 265       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.06e+03  |
| Reward Loss         | -4.45e+06 |
| Running Env Steps   | 1325000   |
| Running Forward KL  | 30.1      |
| Running Reverse KL  | 337       |
| Running Update Time | 265       |
-----------------------------------
--2024-08-11 18:29:52.229152 UTC---
| Itration            | 266       |
| PAGAR Loss          | -8.29e+06 |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 3.86e+03  |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1330000   |
| Running Forward KL  | 27.2      |
| Running Reverse KL  | 50.4      |
| Running Update Time | 266       |
-----------------------------------
--2024-08-11 18:32:56.698524 UTC---
| Itration            | 267       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 3.62e+03  |
| Reward Loss         | -4.01e+06 |
| Running Env Steps   | 1335000   |
| Running Forward KL  | 29.3      |
| Running Reverse KL  | 426       |
| Running Update Time | 267       |
-----------------------------------
--2024-08-11 18:36:00.049616 UTC--
| Itration            | 268      |
| PAGAR Loss          | -2.7e+07 |
| Real Det Return     | 3.87e+03 |
| Real Sto Return     | 3.67e+03 |
| Reward Loss         | -3.2e+06 |
| Running Env Steps   | 1340000  |
| Running Forward KL  | 27.2     |
| Running Reverse KL  | 514      |
| Running Update Time | 268      |
----------------------------------
--2024-08-11 18:39:03.673693 UTC---
| Itration            | 269       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 1345000   |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 223       |
| Running Update Time | 269       |
-----------------------------------
--2024-08-11 18:42:08.759077 UTC---
| Itration            | 270       |
| PAGAR Loss          | 2.06e+08  |
| Real Det Return     | 4.35e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -3.19e+06 |
| Running Env Steps   | 1350000   |
| Running Forward KL  | 25.7      |
| Running Reverse KL  | 20.6      |
| Running Update Time | 270       |
-----------------------------------
--2024-08-11 18:45:15.150769 UTC---
| Itration            | 271       |
| PAGAR Loss          | 4.13e+06  |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -3.11e+06 |
| Running Env Steps   | 1355000   |
| Running Forward KL  | 25.4      |
| Running Reverse KL  | 16.6      |
| Running Update Time | 271       |
-----------------------------------
--2024-08-11 18:48:21.783612 UTC---
| Itration            | 272       |
| PAGAR Loss          | 3.82e+09  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -3.06e+06 |
| Running Env Steps   | 1360000   |
| Running Forward KL  | 26        |
| Running Reverse KL  | 112       |
| Running Update Time | 272       |
-----------------------------------
--2024-08-11 18:51:24.050148 UTC---
| Itration            | 273       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.44e+03  |
| Real Sto Return     | 3.6e+03   |
| Reward Loss         | -5.04e+06 |
| Running Env Steps   | 1365000   |
| Running Forward KL  | 27.3      |
| Running Reverse KL  | 239       |
| Running Update Time | 273       |
-----------------------------------
--2024-08-11 18:54:27.672300 UTC---
| Itration            | 274       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -2.74e+06 |
| Running Env Steps   | 1370000   |
| Running Forward KL  | 25.1      |
| Running Reverse KL  | 486       |
| Running Update Time | 274       |
-----------------------------------
--2024-08-11 18:57:31.265474 UTC---
| Itration            | 275       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 3.27e+03  |
| Reward Loss         | -4.51e+06 |
| Running Env Steps   | 1375000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 496       |
| Running Update Time | 275       |
-----------------------------------
--2024-08-11 19:00:36.440317 UTC---
| Itration            | 276       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -6.34e+06 |
| Running Env Steps   | 1380000   |
| Running Forward KL  | 29.1      |
| Running Reverse KL  | 883       |
| Running Update Time | 276       |
-----------------------------------
--2024-08-11 19:03:42.571234 UTC---
| Itration            | 277       |
| PAGAR Loss          | 1.23e+07  |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 4.09e+03  |
| Reward Loss         | -3.12e+06 |
| Running Env Steps   | 1385000   |
| Running Forward KL  | 25.8      |
| Running Reverse KL  | 223       |
| Running Update Time | 277       |
-----------------------------------
--2024-08-11 19:06:49.316586 UTC---
| Itration            | 278       |
| PAGAR Loss          | -9.22e+08 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -3.05e+06 |
| Running Env Steps   | 1390000   |
| Running Forward KL  | 24.5      |
| Running Reverse KL  | 109       |
| Running Update Time | 278       |
-----------------------------------
--2024-08-11 19:09:56.804385 UTC---
| Itration            | 279       |
| PAGAR Loss          | -2.66e+07 |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.42e+03  |
| Reward Loss         | -2.96e+06 |
| Running Env Steps   | 1395000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 10.4      |
| Running Update Time | 279       |
-----------------------------------
--2024-08-11 19:13:02.725609 UTC---
| Itration            | 280       |
| PAGAR Loss          | -7.19e+08 |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 3.74e+03  |
| Reward Loss         | -5.54e+06 |
| Running Env Steps   | 1400000   |
| Running Forward KL  | 28.4      |
| Running Reverse KL  | 589       |
| Running Update Time | 280       |
-----------------------------------
--2024-08-11 19:16:07.847165 UTC---
| Itration            | 281       |
| PAGAR Loss          | 2.46e+09  |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.14e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 1405000   |
| Running Forward KL  | 27.8      |
| Running Reverse KL  | 1e+03     |
| Running Update Time | 281       |
-----------------------------------
--2024-08-11 19:19:11.929624 UTC---
| Itration            | 282       |
| PAGAR Loss          | 2.9e+10   |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 3.33e+03  |
| Reward Loss         | -3.04e+06 |
| Running Env Steps   | 1410000   |
| Running Forward KL  | 22.8      |
| Running Reverse KL  | 107       |
| Running Update Time | 282       |
-----------------------------------
--2024-08-11 19:22:18.072456 UTC---
| Itration            | 283       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.35e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -3.29e+06 |
| Running Env Steps   | 1415000   |
| Running Forward KL  | 23.7      |
| Running Reverse KL  | 286       |
| Running Update Time | 283       |
-----------------------------------
--2024-08-11 19:25:25.056423 UTC---
| Itration            | 284       |
| PAGAR Loss          | -3.06e+07 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -4.95e+06 |
| Running Env Steps   | 1420000   |
| Running Forward KL  | 23.1      |
| Running Reverse KL  | 576       |
| Running Update Time | 284       |
-----------------------------------
--2024-08-11 19:28:33.581823 UTC---
| Itration            | 285       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -2.03e+06 |
| Running Env Steps   | 1425000   |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 11.1      |
| Running Update Time | 285       |
-----------------------------------
--2024-08-11 19:31:42.091540 UTC--
| Itration            | 286      |
| PAGAR Loss          | 3.61e+08 |
| Real Det Return     | 5.49e+03 |
| Real Sto Return     | 4.9e+03  |
| Reward Loss         | -2.5e+06 |
| Running Env Steps   | 1430000  |
| Running Forward KL  | 19.9     |
| Running Reverse KL  | 8.92     |
| Running Update Time | 286      |
----------------------------------
--2024-08-11 19:34:49.279022 UTC---
| Itration            | 287       |
| PAGAR Loss          | -1.7e+06  |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 4.14e+03  |
| Reward Loss         | -3.08e+06 |
| Running Env Steps   | 1435000   |
| Running Forward KL  | 25.2      |
| Running Reverse KL  | 243       |
| Running Update Time | 287       |
-----------------------------------
--2024-08-11 19:37:55.089587 UTC---
| Itration            | 288       |
| PAGAR Loss          | 2.48e+07  |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 3.55e+03  |
| Reward Loss         | -2.98e+06 |
| Running Env Steps   | 1440000   |
| Running Forward KL  | 25.2      |
| Running Reverse KL  | 484       |
| Running Update Time | 288       |
-----------------------------------
--2024-08-11 19:41:03.549233 UTC---
| Itration            | 289       |
| PAGAR Loss          | -2.62e+07 |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -2.61e+06 |
| Running Env Steps   | 1445000   |
| Running Forward KL  | 22.7      |
| Running Reverse KL  | 10.8      |
| Running Update Time | 289       |
-----------------------------------
--2024-08-11 19:44:11.531705 UTC---
| Itration            | 290       |
| PAGAR Loss          | 1.87e+07  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.09e+03  |
| Reward Loss         | -3.05e+06 |
| Running Env Steps   | 1450000   |
| Running Forward KL  | 23.3      |
| Running Reverse KL  | 81.1      |
| Running Update Time | 290       |
-----------------------------------
--2024-08-11 19:47:19.348621 UTC---
| Itration            | 291       |
| PAGAR Loss          | -2.71e+07 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 1455000   |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 171       |
| Running Update Time | 291       |
-----------------------------------
--2024-08-11 19:50:27.954810 UTC---
| Itration            | 292       |
| PAGAR Loss          | 3.83e+08  |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -3.08e+06 |
| Running Env Steps   | 1460000   |
| Running Forward KL  | 24.6      |
| Running Reverse KL  | 35.6      |
| Running Update Time | 292       |
-----------------------------------
--2024-08-11 19:53:35.737837 UTC---
| Itration            | 293       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.29e+03  |
| Reward Loss         | -6.13e+06 |
| Running Env Steps   | 1465000   |
| Running Forward KL  | 22.9      |
| Running Reverse KL  | 409       |
| Running Update Time | 293       |
-----------------------------------
--2024-08-11 19:56:42.800575 UTC---
| Itration            | 294       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -6.33e+06 |
| Running Env Steps   | 1470000   |
| Running Forward KL  | 26.7      |
| Running Reverse KL  | 444       |
| Running Update Time | 294       |
-----------------------------------
--2024-08-11 19:59:45.500158 UTC---
| Itration            | 295       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 2.31e+03  |
| Reward Loss         | -6.69e+06 |
| Running Env Steps   | 1475000   |
| Running Forward KL  | 31.3      |
| Running Reverse KL  | 772       |
| Running Update Time | 295       |
-----------------------------------
--2024-08-11 20:02:52.913868 UTC--
| Itration            | 296      |
| PAGAR Loss          | 1.79e+09 |
| Real Det Return     | 4.85e+03 |
| Real Sto Return     | 4.57e+03 |
| Reward Loss         | -3.8e+06 |
| Running Env Steps   | 1480000  |
| Running Forward KL  | 23.7     |
| Running Reverse KL  | 51.1     |
| Running Update Time | 296      |
----------------------------------
--2024-08-11 20:06:01.648174 UTC---
| Itration            | 297       |
| PAGAR Loss          | 1.5e+10   |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -3.06e+06 |
| Running Env Steps   | 1485000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 297       |
-----------------------------------
--2024-08-11 20:09:10.133072 UTC---
| Itration            | 298       |
| PAGAR Loss          | -2.51e+07 |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 3.95e+03  |
| Reward Loss         | -3.69e+06 |
| Running Env Steps   | 1490000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 245       |
| Running Update Time | 298       |
-----------------------------------
--2024-08-11 20:12:16.294918 UTC---
| Itration            | 299       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 3.86e+03  |
| Reward Loss         | -4.02e+06 |
| Running Env Steps   | 1495000   |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 147       |
| Running Update Time | 299       |
-----------------------------------
--2024-08-11 20:15:23.412494 UTC---
| Itration            | 300       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 3.43e+03  |
| Reward Loss         | -3.21e+06 |
| Running Env Steps   | 1500000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 89.6      |
| Running Update Time | 300       |
-----------------------------------
--2024-08-11 20:18:34.906526 UTC---
| Itration            | 301       |
| PAGAR Loss          | -1.62e+10 |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -4.21e+06 |
| Running Env Steps   | 1505000   |
| Running Forward KL  | 23.9      |
| Running Reverse KL  | 295       |
| Running Update Time | 301       |
-----------------------------------
--2024-08-11 20:21:43.433398 UTC---
| Itration            | 302       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1510000   |
| Running Forward KL  | 29        |
| Running Reverse KL  | 198       |
| Running Update Time | 302       |
-----------------------------------
--2024-08-11 20:24:47.352433 UTC---
| Itration            | 303       |
| PAGAR Loss          | 1.78e+10  |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 3.48e+03  |
| Reward Loss         | -7.65e+06 |
| Running Env Steps   | 1515000   |
| Running Forward KL  | 31.4      |
| Running Reverse KL  | 1.14e+03  |
| Running Update Time | 303       |
-----------------------------------
--2024-08-11 20:27:53.765298 UTC--
| Itration            | 304      |
| PAGAR Loss          | 1.97e+09 |
| Real Det Return     | 5e+03    |
| Real Sto Return     | 3.74e+03 |
| Reward Loss         | -6.9e+06 |
| Running Env Steps   | 1520000  |
| Running Forward KL  | 27.7     |
| Running Reverse KL  | 629      |
| Running Update Time | 304      |
----------------------------------
--2024-08-11 20:31:01.347389 UTC---
| Itration            | 305       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.54e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -5.33e+06 |
| Running Env Steps   | 1525000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 326       |
| Running Update Time | 305       |
-----------------------------------
--2024-08-11 20:34:07.555703 UTC---
| Itration            | 306       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 3.75e+03  |
| Reward Loss         | -3.42e+06 |
| Running Env Steps   | 1530000   |
| Running Forward KL  | 28.6      |
| Running Reverse KL  | 873       |
| Running Update Time | 306       |
-----------------------------------
--2024-08-11 20:37:13.084680 UTC---
| Itration            | 307       |
| PAGAR Loss          | -4.64e+10 |
| Real Det Return     | 4.22e+03  |
| Real Sto Return     | 3.37e+03  |
| Reward Loss         | -6.3e+06  |
| Running Env Steps   | 1535000   |
| Running Forward KL  | 28.9      |
| Running Reverse KL  | 813       |
| Running Update Time | 307       |
-----------------------------------
--2024-08-11 20:40:20.612057 UTC---
| Itration            | 308       |
| PAGAR Loss          | -6.26e+07 |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 3.69e+03  |
| Reward Loss         | -4.67e+06 |
| Running Env Steps   | 1540000   |
| Running Forward KL  | 26.8      |
| Running Reverse KL  | 265       |
| Running Update Time | 308       |
-----------------------------------
--2024-08-11 20:43:23.077222 UTC---
| Itration            | 309       |
| PAGAR Loss          | -3.12e+06 |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 3.23e+03  |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1545000   |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 814       |
| Running Update Time | 309       |
-----------------------------------
--2024-08-11 20:46:28.943723 UTC---
| Itration            | 310       |
| PAGAR Loss          | -2.25e+08 |
| Real Det Return     | 4.81e+03  |
| Real Sto Return     | 3.41e+03  |
| Reward Loss         | -4.6e+06  |
| Running Env Steps   | 1550000   |
| Running Forward KL  | 27.1      |
| Running Reverse KL  | 442       |
| Running Update Time | 310       |
-----------------------------------
--2024-08-11 20:49:35.832655 UTC---
| Itration            | 311       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 3.81e+03  |
| Reward Loss         | -5.01e+06 |
| Running Env Steps   | 1555000   |
| Running Forward KL  | 23.6      |
| Running Reverse KL  | 560       |
| Running Update Time | 311       |
-----------------------------------
--2024-08-11 20:52:40.116397 UTC---
| Itration            | 312       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.17e+03  |
| Real Sto Return     | 3.46e+03  |
| Reward Loss         | -3.66e+06 |
| Running Env Steps   | 1560000   |
| Running Forward KL  | 29.2      |
| Running Reverse KL  | 459       |
| Running Update Time | 312       |
-----------------------------------
--2024-08-11 20:55:46.127510 UTC---
| Itration            | 313       |
| PAGAR Loss          | -3.4e+11  |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 3.47e+03  |
| Reward Loss         | -7.26e+06 |
| Running Env Steps   | 1565000   |
| Running Forward KL  | 27.8      |
| Running Reverse KL  | 686       |
| Running Update Time | 313       |
-----------------------------------
--2024-08-11 20:58:52.890728 UTC---
| Itration            | 314       |
| PAGAR Loss          | -7.6e+13  |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -3.57e+06 |
| Running Env Steps   | 1570000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 384       |
| Running Update Time | 314       |
-----------------------------------
--2024-08-11 21:01:56.045696 UTC---
| Itration            | 315       |
| PAGAR Loss          | 1.56e+07  |
| Real Det Return     | 4.01e+03  |
| Real Sto Return     | 2.7e+03   |
| Reward Loss         | -4.04e+06 |
| Running Env Steps   | 1575000   |
| Running Forward KL  | 24.6      |
| Running Reverse KL  | 475       |
| Running Update Time | 315       |
-----------------------------------
--2024-08-11 21:05:03.376764 UTC---
| Itration            | 316       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -7.26e+06 |
| Running Env Steps   | 1580000   |
| Running Forward KL  | 34.2      |
| Running Reverse KL  | 1.17e+03  |
| Running Update Time | 316       |
-----------------------------------
--2024-08-11 21:08:11.337255 UTC---
| Itration            | 317       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -2.95e+06 |
| Running Env Steps   | 1585000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 174       |
| Running Update Time | 317       |
-----------------------------------
--2024-08-11 21:11:19.363234 UTC---
| Itration            | 318       |
| PAGAR Loss          | -2.58e+07 |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -4.89e+06 |
| Running Env Steps   | 1590000   |
| Running Forward KL  | 25        |
| Running Reverse KL  | 244       |
| Running Update Time | 318       |
-----------------------------------
--2024-08-11 21:14:27.978543 UTC--
| Itration            | 319      |
| PAGAR Loss          | 8.48e+10 |
| Real Det Return     | 5.32e+03 |
| Real Sto Return     | 4.8e+03  |
| Reward Loss         | -2.1e+06 |
| Running Env Steps   | 1595000  |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 13.7     |
| Running Update Time | 319      |
----------------------------------
--2024-08-11 21:17:34.661747 UTC---
| Itration            | 320       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -5.02e+06 |
| Running Env Steps   | 1600000   |
| Running Forward KL  | 26.2      |
| Running Reverse KL  | 813       |
| Running Update Time | 320       |
-----------------------------------
--2024-08-11 21:20:42.109530 UTC---
| Itration            | 321       |
| PAGAR Loss          | 6.52e+08  |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -2.71e+06 |
| Running Env Steps   | 1605000   |
| Running Forward KL  | 23.2      |
| Running Reverse KL  | 464       |
| Running Update Time | 321       |
-----------------------------------
--2024-08-11 21:23:50.116309 UTC---
| Itration            | 322       |
| PAGAR Loss          | -7.71e+11 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 1610000   |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 213       |
| Running Update Time | 322       |
-----------------------------------
--2024-08-11 21:26:58.545274 UTC---
| Itration            | 323       |
| PAGAR Loss          | -3.61e+07 |
| Real Det Return     | 4.93e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -2.77e+06 |
| Running Env Steps   | 1615000   |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 10.5      |
| Running Update Time | 323       |
-----------------------------------
--2024-08-11 21:30:06.277714 UTC--
| Itration            | 324      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.34e+03 |
| Real Sto Return     | 4.56e+03 |
| Reward Loss         | -2.5e+06 |
| Running Env Steps   | 1620000  |
| Running Forward KL  | 24.2     |
| Running Reverse KL  | 13.2     |
| Running Update Time | 324      |
----------------------------------
--2024-08-11 21:33:12.587906 UTC---
| Itration            | 325       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.76e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -5.43e+06 |
| Running Env Steps   | 1625000   |
| Running Forward KL  | 22        |
| Running Reverse KL  | 477       |
| Running Update Time | 325       |
-----------------------------------
--2024-08-11 21:36:17.949641 UTC---
| Itration            | 326       |
| PAGAR Loss          | -2.25e+10 |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -4.61e+06 |
| Running Env Steps   | 1630000   |
| Running Forward KL  | 27.4      |
| Running Reverse KL  | 508       |
| Running Update Time | 326       |
-----------------------------------
--2024-08-11 21:39:22.879949 UTC---
| Itration            | 327       |
| PAGAR Loss          | -1.6e+12  |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -3.36e+06 |
| Running Env Steps   | 1635000   |
| Running Forward KL  | 26        |
| Running Reverse KL  | 338       |
| Running Update Time | 327       |
-----------------------------------
--2024-08-11 21:42:30.311871 UTC---
| Itration            | 328       |
| PAGAR Loss          | -5.67e+07 |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.23e+03  |
| Reward Loss         | -6.53e+06 |
| Running Env Steps   | 1640000   |
| Running Forward KL  | 29.3      |
| Running Reverse KL  | 529       |
| Running Update Time | 328       |
-----------------------------------
--2024-08-11 21:45:38.533014 UTC---
| Itration            | 329       |
| PAGAR Loss          | 1.59e+09  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 1645000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 471       |
| Running Update Time | 329       |
-----------------------------------
--2024-08-11 21:48:44.090825 UTC---
| Itration            | 330       |
| PAGAR Loss          | -1.07e+08 |
| Real Det Return     | 4.83e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -6.56e+06 |
| Running Env Steps   | 1650000   |
| Running Forward KL  | 29.3      |
| Running Reverse KL  | 941       |
| Running Update Time | 330       |
-----------------------------------
--2024-08-11 21:51:52.299505 UTC---
| Itration            | 331       |
| PAGAR Loss          | -1.86e+06 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -2.96e+06 |
| Running Env Steps   | 1655000   |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 9.7       |
| Running Update Time | 331       |
-----------------------------------
--2024-08-11 21:54:59.480884 UTC---
| Itration            | 332       |
| PAGAR Loss          | -7.74e+07 |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.33e+03  |
| Reward Loss         | -4.66e+06 |
| Running Env Steps   | 1660000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 262       |
| Running Update Time | 332       |
-----------------------------------
--2024-08-11 21:58:07.199463 UTC---
| Itration            | 333       |
| PAGAR Loss          | -4.23e+07 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -4.2e+06  |
| Running Env Steps   | 1665000   |
| Running Forward KL  | 26.1      |
| Running Reverse KL  | 797       |
| Running Update Time | 333       |
-----------------------------------
--2024-08-11 22:01:15.765191 UTC---
| Itration            | 334       |
| PAGAR Loss          | -3.57e+07 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -2.4e+06  |
| Running Env Steps   | 1670000   |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 9.44      |
| Running Update Time | 334       |
-----------------------------------
--2024-08-11 22:04:24.153789 UTC---
| Itration            | 335       |
| PAGAR Loss          | -1.08e+08 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -4.3e+06  |
| Running Env Steps   | 1675000   |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 6.77      |
| Running Update Time | 335       |
-----------------------------------
--2024-08-11 22:07:32.527346 UTC---
| Itration            | 336       |
| PAGAR Loss          | -3.41e+06 |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -3.15e+06 |
| Running Env Steps   | 1680000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 207       |
| Running Update Time | 336       |
-----------------------------------
--2024-08-11 22:10:39.242842 UTC--
| Itration            | 337      |
| PAGAR Loss          | 6.3e+07  |
| Real Det Return     | 5.18e+03 |
| Real Sto Return     | 3.81e+03 |
| Reward Loss         | -6.2e+06 |
| Running Env Steps   | 1685000  |
| Running Forward KL  | 28.8     |
| Running Reverse KL  | 680      |
| Running Update Time | 337      |
----------------------------------
--2024-08-11 22:13:59.779292 UTC---
| Itration            | 338       |
| PAGAR Loss          | -7.57e+11 |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -2.72e+06 |
| Running Env Steps   | 1690000   |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 330       |
| Running Update Time | 338       |
-----------------------------------
--2024-08-11 22:17:37.056648 UTC---
| Itration            | 339       |
| PAGAR Loss          | -2e+07    |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 3.94e+03  |
| Reward Loss         | -3.04e+06 |
| Running Env Steps   | 1695000   |
| Running Forward KL  | 18.1      |
| Running Reverse KL  | 183       |
| Running Update Time | 339       |
-----------------------------------
--2024-08-11 22:21:11.747061 UTC---
| Itration            | 340       |
| PAGAR Loss          | 1.59e+07  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -2.72e+06 |
| Running Env Steps   | 1700000   |
| Running Forward KL  | 18.6      |
| Running Reverse KL  | 207       |
| Running Update Time | 340       |
-----------------------------------
--2024-08-11 22:24:38.230253 UTC--
| Itration            | 341      |
| PAGAR Loss          | 6.13e+09 |
| Real Det Return     | 5.08e+03 |
| Real Sto Return     | 4.04e+03 |
| Reward Loss         | -3.7e+06 |
| Running Env Steps   | 1705000  |
| Running Forward KL  | 21.9     |
| Running Reverse KL  | 206      |
| Running Update Time | 341      |
----------------------------------
--2024-08-11 22:28:05.169219 UTC---
| Itration            | 342       |
| PAGAR Loss          | 7.14e+10  |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -2.18e+06 |
| Running Env Steps   | 1710000   |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 342       |
-----------------------------------
--2024-08-11 22:31:20.284644 UTC---
| Itration            | 343       |
| PAGAR Loss          | -9.72e+10 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -4.51e+06 |
| Running Env Steps   | 1715000   |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 486       |
| Running Update Time | 343       |
-----------------------------------
--2024-08-11 22:34:50.865638 UTC---
| Itration            | 344       |
| PAGAR Loss          | -2.44e+09 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.06e+03  |
| Reward Loss         | -2.92e+06 |
| Running Env Steps   | 1720000   |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 169       |
| Running Update Time | 344       |
-----------------------------------
--2024-08-11 22:38:07.290571 UTC--
| Itration            | 345      |
| PAGAR Loss          | nan      |
| Real Det Return     | 4.87e+03 |
| Real Sto Return     | 4.25e+03 |
| Reward Loss         | -8.2e+06 |
| Running Env Steps   | 1725000  |
| Running Forward KL  | 27.8     |
| Running Reverse KL  | 651      |
| Running Update Time | 345      |
----------------------------------
--2024-08-11 22:41:21.406265 UTC---
| Itration            | 346       |
| PAGAR Loss          | 3.16e+08  |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -4.93e+06 |
| Running Env Steps   | 1730000   |
| Running Forward KL  | 22.6      |
| Running Reverse KL  | 595       |
| Running Update Time | 346       |
-----------------------------------
--2024-08-11 22:44:35.647689 UTC---
| Itration            | 347       |
| PAGAR Loss          | -6.98e+10 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -5.76e+06 |
| Running Env Steps   | 1735000   |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 513       |
| Running Update Time | 347       |
-----------------------------------
--2024-08-11 22:47:51.159852 UTC---
| Itration            | 348       |
| PAGAR Loss          | -4.23e+07 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -2.63e+06 |
| Running Env Steps   | 1740000   |
| Running Forward KL  | 18.5      |
| Running Reverse KL  | 7.04      |
| Running Update Time | 348       |
-----------------------------------
--2024-08-11 22:51:05.445009 UTC---
| Itration            | 349       |
| PAGAR Loss          | 5.6e+09   |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -5.44e+06 |
| Running Env Steps   | 1745000   |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 203       |
| Running Update Time | 349       |
-----------------------------------
--2024-08-11 22:54:18.277782 UTC---
| Itration            | 350       |
| PAGAR Loss          | 1.07e+07  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.28e+03  |
| Reward Loss         | -2.71e+06 |
| Running Env Steps   | 1750000   |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 9.06      |
| Running Update Time | 350       |
-----------------------------------
--2024-08-11 22:57:32.386240 UTC---
| Itration            | 351       |
| PAGAR Loss          | -2.3e+07  |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 1755000   |
| Running Forward KL  | 15.7      |
| Running Reverse KL  | 5.21      |
| Running Update Time | 351       |
-----------------------------------
--2024-08-11 23:00:57.278175 UTC---
| Itration            | 352       |
| PAGAR Loss          | 4.25e+08  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -2.96e+06 |
| Running Env Steps   | 1760000   |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 165       |
| Running Update Time | 352       |
-----------------------------------
--2024-08-11 23:04:10.240814 UTC---
| Itration            | 353       |
| PAGAR Loss          | 3.6e+07   |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -2.02e+06 |
| Running Env Steps   | 1765000   |
| Running Forward KL  | 22.9      |
| Running Reverse KL  | 15.8      |
| Running Update Time | 353       |
-----------------------------------
--2024-08-11 23:07:22.233197 UTC---
| Itration            | 354       |
| PAGAR Loss          | -2.01e+07 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 1770000   |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 354       |
-----------------------------------
--2024-08-11 23:10:35.158974 UTC---
| Itration            | 355       |
| PAGAR Loss          | -7.69e+07 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.23e+03  |
| Reward Loss         | -2.66e+06 |
| Running Env Steps   | 1775000   |
| Running Forward KL  | 24.3      |
| Running Reverse KL  | 17        |
| Running Update Time | 355       |
-----------------------------------
--2024-08-11 23:13:45.135903 UTC---
| Itration            | 356       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -6.83e+06 |
| Running Env Steps   | 1780000   |
| Running Forward KL  | 26.2      |
| Running Reverse KL  | 906       |
| Running Update Time | 356       |
-----------------------------------
--2024-08-11 23:16:56.379755 UTC---
| Itration            | 357       |
| PAGAR Loss          | -2.52e+07 |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -2.5e+06  |
| Running Env Steps   | 1785000   |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 161       |
| Running Update Time | 357       |
-----------------------------------
--2024-08-11 23:20:05.145416 UTC---
| Itration            | 358       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -2.45e+06 |
| Running Env Steps   | 1790000   |
| Running Forward KL  | 18.7      |
| Running Reverse KL  | 8.74      |
| Running Update Time | 358       |
-----------------------------------
--2024-08-11 23:23:13.822384 UTC---
| Itration            | 359       |
| PAGAR Loss          | -4.38e+07 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -2.82e+06 |
| Running Env Steps   | 1795000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 187       |
| Running Update Time | 359       |
-----------------------------------
--2024-08-11 23:26:22.658896 UTC---
| Itration            | 360       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 1800000   |
| Running Forward KL  | 25.5      |
| Running Reverse KL  | 237       |
| Running Update Time | 360       |
-----------------------------------
--2024-08-11 23:29:31.483590 UTC---
| Itration            | 361       |
| PAGAR Loss          | -5.87e+07 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.95e+06 |
| Running Env Steps   | 1805000   |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 74.4      |
| Running Update Time | 361       |
-----------------------------------
--2024-08-11 23:32:41.012306 UTC--
| Itration            | 362      |
| PAGAR Loss          | 2.15e+10 |
| Real Det Return     | 5.29e+03 |
| Real Sto Return     | 4.75e+03 |
| Reward Loss         | -1.7e+06 |
| Running Env Steps   | 1810000  |
| Running Forward KL  | 20.2     |
| Running Reverse KL  | 8.69     |
| Running Update Time | 362      |
----------------------------------
--2024-08-11 23:35:49.361353 UTC---
| Itration            | 363       |
| PAGAR Loss          | -6.08e+07 |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -5.63e+06 |
| Running Env Steps   | 1815000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 355       |
| Running Update Time | 363       |
-----------------------------------
--2024-08-11 23:38:56.846678 UTC---
| Itration            | 364       |
| PAGAR Loss          | -1.93e+10 |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 3.72e+03  |
| Reward Loss         | -3.74e+06 |
| Running Env Steps   | 1820000   |
| Running Forward KL  | 20        |
| Running Reverse KL  | 252       |
| Running Update Time | 364       |
-----------------------------------
--2024-08-11 23:42:05.508184 UTC---
| Itration            | 365       |
| PAGAR Loss          | 4e+08     |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 1825000   |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 5.08      |
| Running Update Time | 365       |
-----------------------------------
--2024-08-11 23:45:13.546050 UTC---
| Itration            | 366       |
| PAGAR Loss          | -1.22e+06 |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 1830000   |
| Running Forward KL  | 22.6      |
| Running Reverse KL  | 11.6      |
| Running Update Time | 366       |
-----------------------------------
--2024-08-11 23:48:20.993290 UTC---
| Itration            | 367       |
| PAGAR Loss          | -1.16e+09 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.02e+03  |
| Reward Loss         | -3.71e+06 |
| Running Env Steps   | 1835000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 402       |
| Running Update Time | 367       |
-----------------------------------
--2024-08-11 23:51:29.725121 UTC--
| Itration            | 368      |
| PAGAR Loss          | 1.1e+08  |
| Real Det Return     | 5.12e+03 |
| Real Sto Return     | 4.82e+03 |
| Reward Loss         | -1.8e+06 |
| Running Env Steps   | 1840000  |
| Running Forward KL  | 20.3     |
| Running Reverse KL  | 8.22     |
| Running Update Time | 368      |
----------------------------------
--2024-08-11 23:54:38.524058 UTC---
| Itration            | 369       |
| PAGAR Loss          | -7.64e+06 |
| Real Det Return     | 4.93e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1845000   |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 17.5      |
| Running Update Time | 369       |
-----------------------------------
--2024-08-11 23:57:45.617457 UTC---
| Itration            | 370       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.55e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -8.45e+06 |
| Running Env Steps   | 1850000   |
| Running Forward KL  | 25.4      |
| Running Reverse KL  | 755       |
| Running Update Time | 370       |
-----------------------------------
--2024-08-12 00:00:48.581976 UTC---
| Itration            | 371       |
| PAGAR Loss          | -4.81e+09 |
| Real Det Return     | 3.35e+03  |
| Real Sto Return     | 3.35e+03  |
| Reward Loss         | -3.18e+06 |
| Running Env Steps   | 1855000   |
| Running Forward KL  | 26.4      |
| Running Reverse KL  | 528       |
| Running Update Time | 371       |
-----------------------------------
--2024-08-12 00:03:55.446969 UTC---
| Itration            | 372       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -6.79e+06 |
| Running Env Steps   | 1860000   |
| Running Forward KL  | 24.8      |
| Running Reverse KL  | 380       |
| Running Update Time | 372       |
-----------------------------------
--2024-08-12 00:07:03.467189 UTC--
| Itration            | 373      |
| PAGAR Loss          | nan      |
| Real Det Return     | 4.84e+03 |
| Real Sto Return     | 4.62e+03 |
| Reward Loss         | -9e+06   |
| Running Env Steps   | 1865000  |
| Running Forward KL  | 29.1     |
| Running Reverse KL  | 1.01e+03 |
| Running Update Time | 373      |
----------------------------------
--2024-08-12 00:10:12.141545 UTC---
| Itration            | 374       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -5.64e+06 |
| Running Env Steps   | 1870000   |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 612       |
| Running Update Time | 374       |
-----------------------------------
--2024-08-12 00:13:21.226761 UTC---
| Itration            | 375       |
| PAGAR Loss          | -7.2e+07  |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -2.57e+06 |
| Running Env Steps   | 1875000   |
| Running Forward KL  | 22.7      |
| Running Reverse KL  | 365       |
| Running Update Time | 375       |
-----------------------------------
--2024-08-12 00:16:29.726483 UTC---
| Itration            | 376       |
| PAGAR Loss          | -8.2e+07  |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 1880000   |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 11.7      |
| Running Update Time | 376       |
-----------------------------------
--2024-08-12 00:19:38.529584 UTC---
| Itration            | 377       |
| PAGAR Loss          | 3.21e+12  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -2.82e+06 |
| Running Env Steps   | 1885000   |
| Running Forward KL  | 23.6      |
| Running Reverse KL  | 213       |
| Running Update Time | 377       |
-----------------------------------
--2024-08-12 00:22:47.177798 UTC---
| Itration            | 378       |
| PAGAR Loss          | 1.34e+09  |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1890000   |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 11.2      |
| Running Update Time | 378       |
-----------------------------------
--2024-08-12 00:25:56.683505 UTC---
| Itration            | 379       |
| PAGAR Loss          | -6.3e+07  |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -2.15e+06 |
| Running Env Steps   | 1895000   |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 379       |
-----------------------------------
--2024-08-12 00:29:06.130068 UTC---
| Itration            | 380       |
| PAGAR Loss          | -2.61e+08 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -2.9e+06  |
| Running Env Steps   | 1900000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 150       |
| Running Update Time | 380       |
-----------------------------------
--2024-08-12 00:32:14.868845 UTC---
| Itration            | 381       |
| PAGAR Loss          | -1.04e+08 |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 1905000   |
| Running Forward KL  | 22.8      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 381       |
-----------------------------------
--2024-08-12 00:35:22.801465 UTC---
| Itration            | 382       |
| PAGAR Loss          | -5.77e+07 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.17e+03  |
| Reward Loss         | -2.94e+06 |
| Running Env Steps   | 1910000   |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 169       |
| Running Update Time | 382       |
-----------------------------------
--2024-08-12 00:38:30.013339 UTC---
| Itration            | 383       |
| PAGAR Loss          | 2.91e+07  |
| Real Det Return     | 5.04e+03  |
| Real Sto Return     | 4.29e+03  |
| Reward Loss         | -2.26e+06 |
| Running Env Steps   | 1915000   |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 212       |
| Running Update Time | 383       |
-----------------------------------
--2024-08-12 00:41:37.603270 UTC---
| Itration            | 384       |
| PAGAR Loss          | -8.34e+11 |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 1920000   |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 420       |
| Running Update Time | 384       |
-----------------------------------
--2024-08-12 00:44:46.486976 UTC---
| Itration            | 385       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 1925000   |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 8.03      |
| Running Update Time | 385       |
-----------------------------------
--2024-08-12 00:47:53.717874 UTC---
| Itration            | 386       |
| PAGAR Loss          | -8.9e+10  |
| Real Det Return     | 4.83e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -4.52e+06 |
| Running Env Steps   | 1930000   |
| Running Forward KL  | 22.5      |
| Running Reverse KL  | 552       |
| Running Update Time | 386       |
-----------------------------------
--2024-08-12 00:51:01.010956 UTC---
| Itration            | 387       |
| PAGAR Loss          | 5.99e+10  |
| Real Det Return     | 4.93e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -4.19e+06 |
| Running Env Steps   | 1935000   |
| Running Forward KL  | 23.4      |
| Running Reverse KL  | 573       |
| Running Update Time | 387       |
-----------------------------------
--2024-08-12 00:54:04.578697 UTC---
| Itration            | 388       |
| PAGAR Loss          | -9.78e+10 |
| Real Det Return     | 4.22e+03  |
| Real Sto Return     | 3.5e+03   |
| Reward Loss         | -3.4e+06  |
| Running Env Steps   | 1940000   |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 494       |
| Running Update Time | 388       |
-----------------------------------
--2024-08-12 00:57:10.029518 UTC---
| Itration            | 389       |
| PAGAR Loss          | 7.4e+08   |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -6.24e+06 |
| Running Env Steps   | 1945000   |
| Running Forward KL  | 23.4      |
| Running Reverse KL  | 636       |
| Running Update Time | 389       |
-----------------------------------
--2024-08-12 01:00:18.490102 UTC---
| Itration            | 390       |
| PAGAR Loss          | -2.2e+07  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 1950000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 9.16      |
| Running Update Time | 390       |
-----------------------------------
--2024-08-12 01:03:27.687645 UTC---
| Itration            | 391       |
| PAGAR Loss          | 1.67e+10  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.12e+06 |
| Running Env Steps   | 1955000   |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 9.68      |
| Running Update Time | 391       |
-----------------------------------
--2024-08-12 01:06:35.064827 UTC---
| Itration            | 392       |
| PAGAR Loss          | -3.2e+07  |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 1960000   |
| Running Forward KL  | 18.9      |
| Running Reverse KL  | 7.14      |
| Running Update Time | 392       |
-----------------------------------
--2024-08-12 01:09:43.107220 UTC---
| Itration            | 393       |
| PAGAR Loss          | -1.59e+08 |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 1965000   |
| Running Forward KL  | 25.5      |
| Running Reverse KL  | 204       |
| Running Update Time | 393       |
-----------------------------------
--2024-08-12 01:12:48.300071 UTC---
| Itration            | 394       |
| PAGAR Loss          | -1.84e+08 |
| Real Det Return     | 4.4e+03   |
| Real Sto Return     | 3.72e+03  |
| Reward Loss         | -2.43e+06 |
| Running Env Steps   | 1970000   |
| Running Forward KL  | 21.9      |
| Running Reverse KL  | 112       |
| Running Update Time | 394       |
-----------------------------------
--2024-08-12 01:15:54.774620 UTC---
| Itration            | 395       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.23e+03  |
| Reward Loss         | -1.93e+06 |
| Running Env Steps   | 1975000   |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 9.79      |
| Running Update Time | 395       |
-----------------------------------
--2024-08-12 01:18:59.420845 UTC---
| Itration            | 396       |
| PAGAR Loss          | 2.12e+09  |
| Real Det Return     | 4.41e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 1980000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 64.3      |
| Running Update Time | 396       |
-----------------------------------
--2024-08-12 01:22:06.863424 UTC---
| Itration            | 397       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -3.37e+06 |
| Running Env Steps   | 1985000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 249       |
| Running Update Time | 397       |
-----------------------------------
--2024-08-12 01:25:15.122594 UTC---
| Itration            | 398       |
| PAGAR Loss          | -1.2e+10  |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -3.82e+06 |
| Running Env Steps   | 1990000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 463       |
| Running Update Time | 398       |
-----------------------------------
--2024-08-12 01:28:23.503393 UTC---
| Itration            | 399       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -1.98e+06 |
| Running Env Steps   | 1995000   |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 399       |
-----------------------------------
--2024-08-12 01:31:33.049117 UTC---
| Itration            | 400       |
| PAGAR Loss          | -3.28e+08 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 2000000   |
| Running Forward KL  | 25.2      |
| Running Reverse KL  | 7.06      |
| Running Update Time | 400       |
-----------------------------------
--2024-08-12 01:34:42.008527 UTC---
| Itration            | 401       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -3.87e+06 |
| Running Env Steps   | 2005000   |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 273       |
| Running Update Time | 401       |
-----------------------------------
--2024-08-12 01:37:51.120299 UTC---
| Itration            | 402       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -1.51e+06 |
| Running Env Steps   | 2010000   |
| Running Forward KL  | 17.4      |
| Running Reverse KL  | 10.3      |
| Running Update Time | 402       |
-----------------------------------
--2024-08-12 01:40:59.538682 UTC---
| Itration            | 403       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -2.76e+06 |
| Running Env Steps   | 2015000   |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 239       |
| Running Update Time | 403       |
-----------------------------------
--2024-08-12 01:44:06.714312 UTC--
| Itration            | 404      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.01e+03 |
| Real Sto Return     | 4.65e+03 |
| Reward Loss         | -3.6e+06 |
| Running Env Steps   | 2020000  |
| Running Forward KL  | 25.4     |
| Running Reverse KL  | 429      |
| Running Update Time | 404      |
----------------------------------
--2024-08-12 01:47:10.667377 UTC---
| Itration            | 405       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.32e+03  |
| Real Sto Return     | 3.37e+03  |
| Reward Loss         | -1.19e+07 |
| Running Env Steps   | 2025000   |
| Running Forward KL  | 27.8      |
| Running Reverse KL  | 1.04e+03  |
| Running Update Time | 405       |
-----------------------------------
--2024-08-12 01:50:19.738472 UTC---
| Itration            | 406       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -3.11e+06 |
| Running Env Steps   | 2030000   |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 183       |
| Running Update Time | 406       |
-----------------------------------
--2024-08-12 01:53:25.380693 UTC---
| Itration            | 407       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.16e+03  |
| Real Sto Return     | 3.64e+03  |
| Reward Loss         | -5.28e+06 |
| Running Env Steps   | 2035000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 669       |
| Running Update Time | 407       |
-----------------------------------
--2024-08-12 01:56:33.411461 UTC---
| Itration            | 408       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -2.16e+06 |
| Running Env Steps   | 2040000   |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 230       |
| Running Update Time | 408       |
-----------------------------------
--2024-08-12 01:59:39.980506 UTC---
| Itration            | 409       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.4e+03   |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 2045000   |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 139       |
| Running Update Time | 409       |
-----------------------------------
--2024-08-12 02:02:44.770032 UTC---
| Itration            | 410       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.76e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -5.42e+06 |
| Running Env Steps   | 2050000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 645       |
| Running Update Time | 410       |
-----------------------------------
--2024-08-12 02:05:44.899844 UTC---
| Itration            | 411       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.21e+03  |
| Real Sto Return     | 2.91e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 2055000   |
| Running Forward KL  | 25.3      |
| Running Reverse KL  | 920       |
| Running Update Time | 411       |
-----------------------------------
--2024-08-12 02:08:51.408680 UTC---
| Itration            | 412       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.48e+06 |
| Running Env Steps   | 2060000   |
| Running Forward KL  | 22.2      |
| Running Reverse KL  | 286       |
| Running Update Time | 412       |
-----------------------------------
--2024-08-12 02:11:59.553003 UTC---
| Itration            | 413       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.34e+06 |
| Running Env Steps   | 2065000   |
| Running Forward KL  | 17        |
| Running Reverse KL  | 116       |
| Running Update Time | 413       |
-----------------------------------
--2024-08-12 02:15:08.116765 UTC--
| Itration            | 414      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.3e+03  |
| Real Sto Return     | 5.15e+03 |
| Reward Loss         | -2.8e+06 |
| Running Env Steps   | 2070000  |
| Running Forward KL  | 17.2     |
| Running Reverse KL  | 194      |
| Running Update Time | 414      |
----------------------------------
--2024-08-12 02:18:16.864932 UTC---
| Itration            | 415       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 2075000   |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 9.86      |
| Running Update Time | 415       |
-----------------------------------
--2024-08-12 02:21:24.561858 UTC---
| Itration            | 416       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.89e+03  |
| Real Sto Return     | 4.47e+03  |
| Reward Loss         | -2.26e+06 |
| Running Env Steps   | 2080000   |
| Running Forward KL  | 20        |
| Running Reverse KL  | 23.4      |
| Running Update Time | 416       |
-----------------------------------
--2024-08-12 02:24:32.157789 UTC---
| Itration            | 417       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -3.11e+06 |
| Running Env Steps   | 2085000   |
| Running Forward KL  | 21        |
| Running Reverse KL  | 270       |
| Running Update Time | 417       |
-----------------------------------
--2024-08-12 02:27:40.383051 UTC---
| Itration            | 418       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -2.28e+06 |
| Running Env Steps   | 2090000   |
| Running Forward KL  | 17.7      |
| Running Reverse KL  | 267       |
| Running Update Time | 418       |
-----------------------------------
--2024-08-12 02:30:49.415937 UTC---
| Itration            | 419       |
| PAGAR Loss          | 2.28e+08  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -3.32e+06 |
| Running Env Steps   | 2095000   |
| Running Forward KL  | 23.1      |
| Running Reverse KL  | 230       |
| Running Update Time | 419       |
-----------------------------------
--2024-08-12 02:33:58.452026 UTC---
| Itration            | 420       |
| PAGAR Loss          | 2.09e+08  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -1.72e+06 |
| Running Env Steps   | 2100000   |
| Running Forward KL  | 18        |
| Running Reverse KL  | 10.2      |
| Running Update Time | 420       |
-----------------------------------
--2024-08-12 02:37:05.758590 UTC---
| Itration            | 421       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -4.92e+06 |
| Running Env Steps   | 2105000   |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 282       |
| Running Update Time | 421       |
-----------------------------------
--2024-08-12 02:40:14.621388 UTC---
| Itration            | 422       |
| PAGAR Loss          | 1.45e+09  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 2110000   |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 113       |
| Running Update Time | 422       |
-----------------------------------
--2024-08-12 02:43:20.440078 UTC---
| Itration            | 423       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -7.32e+06 |
| Running Env Steps   | 2115000   |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 813       |
| Running Update Time | 423       |
-----------------------------------
--2024-08-12 02:46:29.834931 UTC---
| Itration            | 424       |
| PAGAR Loss          | 9.32e+07  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -3.46e+06 |
| Running Env Steps   | 2120000   |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 171       |
| Running Update Time | 424       |
-----------------------------------
--2024-08-12 02:49:36.229293 UTC---
| Itration            | 425       |
| PAGAR Loss          | 3.72e+07  |
| Real Det Return     | 4.8e+03   |
| Real Sto Return     | 4.08e+03  |
| Reward Loss         | -6.37e+06 |
| Running Env Steps   | 2125000   |
| Running Forward KL  | 22.6      |
| Running Reverse KL  | 418       |
| Running Update Time | 425       |
-----------------------------------
--2024-08-12 02:52:42.676487 UTC---
| Itration            | 426       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 2130000   |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 512       |
| Running Update Time | 426       |
-----------------------------------
--2024-08-12 02:55:50.747488 UTC---
| Itration            | 427       |
| PAGAR Loss          | 4.74e+07  |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 2135000   |
| Running Forward KL  | 19        |
| Running Reverse KL  | 8.91      |
| Running Update Time | 427       |
-----------------------------------
--2024-08-12 02:58:58.601559 UTC---
| Itration            | 428       |
| PAGAR Loss          | 8.25e+07  |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 2140000   |
| Running Forward KL  | 17.1      |
| Running Reverse KL  | 9.91      |
| Running Update Time | 428       |
-----------------------------------
--2024-08-12 03:02:05.329423 UTC---
| Itration            | 429       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | -6.91e+06 |
| Running Env Steps   | 2145000   |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 527       |
| Running Update Time | 429       |
-----------------------------------
--2024-08-12 03:05:12.109886 UTC---
| Itration            | 430       |
| PAGAR Loss          | 2.36e+12  |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -4.43e+06 |
| Running Env Steps   | 2150000   |
| Running Forward KL  | 25.9      |
| Running Reverse KL  | 539       |
| Running Update Time | 430       |
-----------------------------------
--2024-08-12 03:08:19.523081 UTC---
| Itration            | 431       |
| PAGAR Loss          | 4.33e+09  |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -3.75e+06 |
| Running Env Steps   | 2155000   |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 244       |
| Running Update Time | 431       |
-----------------------------------
--2024-08-12 03:11:28.264955 UTC---
| Itration            | 432       |
| PAGAR Loss          | 6.29e+07  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.28e+06 |
| Running Env Steps   | 2160000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 13.2      |
| Running Update Time | 432       |
-----------------------------------
--2024-08-12 03:14:37.454641 UTC---
| Itration            | 433       |
| PAGAR Loss          | 1.21e+10  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 2165000   |
| Running Forward KL  | 18.9      |
| Running Reverse KL  | 11.3      |
| Running Update Time | 433       |
-----------------------------------
--2024-08-12 03:17:44.621987 UTC---
| Itration            | 434       |
| PAGAR Loss          | -2.09e+07 |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -2.33e+06 |
| Running Env Steps   | 2170000   |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 144       |
| Running Update Time | 434       |
-----------------------------------
--2024-08-12 03:20:52.787933 UTC---
| Itration            | 435       |
| PAGAR Loss          | 2.45e+11  |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -5.64e+06 |
| Running Env Steps   | 2175000   |
| Running Forward KL  | 25        |
| Running Reverse KL  | 424       |
| Running Update Time | 435       |
-----------------------------------
--2024-08-12 03:24:02.447989 UTC---
| Itration            | 436       |
| PAGAR Loss          | 4.58e+08  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.33e+06 |
| Running Env Steps   | 2180000   |
| Running Forward KL  | 25.7      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 436       |
-----------------------------------
--2024-08-12 03:27:10.760855 UTC---
| Itration            | 437       |
| PAGAR Loss          | 2.09e+08  |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 4.22e+03  |
| Reward Loss         | -1.47e+06 |
| Running Env Steps   | 2185000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 15        |
| Running Update Time | 437       |
-----------------------------------
--2024-08-12 03:30:18.987091 UTC---
| Itration            | 438       |
| PAGAR Loss          | 2.47e+11  |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -3.76e+06 |
| Running Env Steps   | 2190000   |
| Running Forward KL  | 22.6      |
| Running Reverse KL  | 314       |
| Running Update Time | 438       |
-----------------------------------
--2024-08-12 03:33:26.296373 UTC---
| Itration            | 439       |
| PAGAR Loss          | 4.56e+11  |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -2.24e+06 |
| Running Env Steps   | 2195000   |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 136       |
| Running Update Time | 439       |
-----------------------------------
--2024-08-12 03:36:33.313401 UTC---
| Itration            | 440       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.39e+03  |
| Reward Loss         | -4.94e+06 |
| Running Env Steps   | 2200000   |
| Running Forward KL  | 21        |
| Running Reverse KL  | 307       |
| Running Update Time | 440       |
-----------------------------------
--2024-08-12 03:39:41.940397 UTC---
| Itration            | 441       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -3.41e+06 |
| Running Env Steps   | 2205000   |
| Running Forward KL  | 22.8      |
| Running Reverse KL  | 247       |
| Running Update Time | 441       |
-----------------------------------
--2024-08-12 03:42:50.221755 UTC---
| Itration            | 442       |
| PAGAR Loss          | 9.22e+08  |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -2.48e+06 |
| Running Env Steps   | 2210000   |
| Running Forward KL  | 23.5      |
| Running Reverse KL  | 372       |
| Running Update Time | 442       |
-----------------------------------
--2024-08-12 03:45:58.641920 UTC---
| Itration            | 443       |
| PAGAR Loss          | 3.35e+07  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 2215000   |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 91.4      |
| Running Update Time | 443       |
-----------------------------------
--2024-08-12 03:49:03.587859 UTC---
| Itration            | 444       |
| PAGAR Loss          | 2.04e+08  |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 3.6e+03   |
| Reward Loss         | -9.72e+06 |
| Running Env Steps   | 2220000   |
| Running Forward KL  | 26.4      |
| Running Reverse KL  | 1.1e+03   |
| Running Update Time | 444       |
-----------------------------------
--2024-08-12 03:52:09.399352 UTC---
| Itration            | 445       |
| PAGAR Loss          | 4.65e+07  |
| Real Det Return     | 4.35e+03  |
| Real Sto Return     | 3.63e+03  |
| Reward Loss         | -2.87e+06 |
| Running Env Steps   | 2225000   |
| Running Forward KL  | 22.2      |
| Running Reverse KL  | 598       |
| Running Update Time | 445       |
-----------------------------------
--2024-08-12 03:55:19.043243 UTC---
| Itration            | 446       |
| PAGAR Loss          | 2.99e+07  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 2230000   |
| Running Forward KL  | 17.9      |
| Running Reverse KL  | 25.2      |
| Running Update Time | 446       |
-----------------------------------
--2024-08-12 03:58:27.787822 UTC---
| Itration            | 447       |
| PAGAR Loss          | 1.31e+08  |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -2.71e+06 |
| Running Env Steps   | 2235000   |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 263       |
| Running Update Time | 447       |
-----------------------------------
--2024-08-12 04:01:34.782374 UTC---
| Itration            | 448       |
| PAGAR Loss          | 4.93e+08  |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.95e+06 |
| Running Env Steps   | 2240000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 423       |
| Running Update Time | 448       |
-----------------------------------
--2024-08-12 04:04:40.345687 UTC---
| Itration            | 449       |
| PAGAR Loss          | 7.25e+07  |
| Real Det Return     | 3.85e+03  |
| Real Sto Return     | 4.48e+03  |
| Reward Loss         | -4.46e+06 |
| Running Env Steps   | 2245000   |
| Running Forward KL  | 23.1      |
| Running Reverse KL  | 378       |
| Running Update Time | 449       |
-----------------------------------
--2024-08-12 04:07:44.517390 UTC---
| Itration            | 450       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.63e+03  |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | -5.97e+06 |
| Running Env Steps   | 2250000   |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 683       |
| Running Update Time | 450       |
-----------------------------------
--2024-08-12 04:10:45.995532 UTC---
| Itration            | 451       |
| PAGAR Loss          | 1.64e+09  |
| Real Det Return     | 3.86e+03  |
| Real Sto Return     | 2.7e+03   |
| Reward Loss         | -6.46e+06 |
| Running Env Steps   | 2255000   |
| Running Forward KL  | 31.6      |
| Running Reverse KL  | 1.5e+03   |
| Running Update Time | 451       |
-----------------------------------
--2024-08-12 04:13:53.531121 UTC---
| Itration            | 452       |
| PAGAR Loss          | 1.58e+08  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -2.49e+06 |
| Running Env Steps   | 2260000   |
| Running Forward KL  | 23.6      |
| Running Reverse KL  | 19.2      |
| Running Update Time | 452       |
-----------------------------------
--2024-08-12 04:17:01.640330 UTC---
| Itration            | 453       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -2.04e+06 |
| Running Env Steps   | 2265000   |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 33.6      |
| Running Update Time | 453       |
-----------------------------------
--2024-08-12 04:20:07.618044 UTC---
| Itration            | 454       |
| PAGAR Loss          | 6.69e+09  |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -3.36e+06 |
| Running Env Steps   | 2270000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 537       |
| Running Update Time | 454       |
-----------------------------------
--2024-08-12 04:23:14.384444 UTC---
| Itration            | 455       |
| PAGAR Loss          | -1.64e+09 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 2275000   |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 128       |
| Running Update Time | 455       |
-----------------------------------
--2024-08-12 04:26:22.750922 UTC---
| Itration            | 456       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 2280000   |
| Running Forward KL  | 21.9      |
| Running Reverse KL  | 119       |
| Running Update Time | 456       |
-----------------------------------
--2024-08-12 04:29:31.144389 UTC--
| Itration            | 457      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.28e+03 |
| Real Sto Return     | 5.05e+03 |
| Reward Loss         | -3.2e+06 |
| Running Env Steps   | 2285000  |
| Running Forward KL  | 25.2     |
| Running Reverse KL  | 494      |
| Running Update Time | 457      |
----------------------------------
--2024-08-12 04:32:39.629277 UTC---
| Itration            | 458       |
| PAGAR Loss          | -1.83e+08 |
| Real Det Return     | 5.8e+03   |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.4e+06  |
| Running Env Steps   | 2290000   |
| Running Forward KL  | 18        |
| Running Reverse KL  | 118       |
| Running Update Time | 458       |
-----------------------------------
--2024-08-12 04:35:48.644915 UTC---
| Itration            | 459       |
| PAGAR Loss          | 1.46e+08  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -2.23e+06 |
| Running Env Steps   | 2295000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 80.8      |
| Running Update Time | 459       |
-----------------------------------
--2024-08-12 04:38:57.173066 UTC---
| Itration            | 460       |
| PAGAR Loss          | -2.14e+07 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.94e+06 |
| Running Env Steps   | 2300000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 460       |
-----------------------------------
--2024-08-12 04:42:06.985525 UTC---
| Itration            | 461       |
| PAGAR Loss          | 3.22e+07  |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 2305000   |
| Running Forward KL  | 16.8      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 461       |
-----------------------------------
--2024-08-12 04:45:15.600269 UTC---
| Itration            | 462       |
| PAGAR Loss          | 6.4e+07   |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 2310000   |
| Running Forward KL  | 20        |
| Running Reverse KL  | 244       |
| Running Update Time | 462       |
-----------------------------------
--2024-08-12 04:48:43.981620 UTC---
| Itration            | 463       |
| PAGAR Loss          | 5.53e+09  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -1.71e+06 |
| Running Env Steps   | 2315000   |
| Running Forward KL  | 23.1      |
| Running Reverse KL  | 490       |
| Running Update Time | 463       |
-----------------------------------
--2024-08-12 04:52:08.999588 UTC---
| Itration            | 464       |
| PAGAR Loss          | 2.9e+09   |
| Real Det Return     | 4.68e+03  |
| Real Sto Return     | 4.28e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 2320000   |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 579       |
| Running Update Time | 464       |
-----------------------------------
--2024-08-12 04:55:34.267004 UTC---
| Itration            | 465       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -1.28e+07 |
| Running Env Steps   | 2325000   |
| Running Forward KL  | 24.8      |
| Running Reverse KL  | 961       |
| Running Update Time | 465       |
-----------------------------------
--2024-08-12 04:58:41.971099 UTC---
| Itration            | 466       |
| PAGAR Loss          | 9.03e+07  |
| Real Det Return     | 4.15e+03  |
| Real Sto Return     | 3e+03     |
| Reward Loss         | -5.59e+06 |
| Running Env Steps   | 2330000   |
| Running Forward KL  | 27.9      |
| Running Reverse KL  | 734       |
| Running Update Time | 466       |
-----------------------------------
--2024-08-12 05:02:05.652402 UTC---
| Itration            | 467       |
| PAGAR Loss          | 8.33e+07  |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.28e+03  |
| Reward Loss         | -2.82e+06 |
| Running Env Steps   | 2335000   |
| Running Forward KL  | 18        |
| Running Reverse KL  | 250       |
| Running Update Time | 467       |
-----------------------------------
--2024-08-12 05:05:27.344283 UTC---
| Itration            | 468       |
| PAGAR Loss          | 1.33e+08  |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 3.22e+03  |
| Reward Loss         | -7.02e+06 |
| Running Env Steps   | 2340000   |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 437       |
| Running Update Time | 468       |
-----------------------------------
--2024-08-12 05:08:39.495461 UTC---
| Itration            | 469       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -4.56e+06 |
| Running Env Steps   | 2345000   |
| Running Forward KL  | 22.6      |
| Running Reverse KL  | 728       |
| Running Update Time | 469       |
-----------------------------------
--2024-08-12 05:11:52.698421 UTC---
| Itration            | 470       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 2350000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 248       |
| Running Update Time | 470       |
-----------------------------------
--2024-08-12 05:15:04.626143 UTC---
| Itration            | 471       |
| PAGAR Loss          | 1.74e+15  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -1.84e+06 |
| Running Env Steps   | 2355000   |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 471       |
-----------------------------------
--2024-08-12 05:18:16.180462 UTC---
| Itration            | 472       |
| PAGAR Loss          | 3.26e+07  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.22e+03  |
| Reward Loss         | -1.73e+06 |
| Running Env Steps   | 2360000   |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 14.7      |
| Running Update Time | 472       |
-----------------------------------
--2024-08-12 05:21:28.574144 UTC---
| Itration            | 473       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -4.72e+06 |
| Running Env Steps   | 2365000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 146       |
| Running Update Time | 473       |
-----------------------------------
--2024-08-12 05:24:41.856102 UTC---
| Itration            | 474       |
| PAGAR Loss          | 6.65e+07  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.87e+06 |
| Running Env Steps   | 2370000   |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 474       |
-----------------------------------
--2024-08-12 05:28:07.772744 UTC---
| Itration            | 475       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 2.86e+03  |
| Reward Loss         | -8.77e+06 |
| Running Env Steps   | 2375000   |
| Running Forward KL  | 28.2      |
| Running Reverse KL  | 401       |
| Running Update Time | 475       |
-----------------------------------
--2024-08-12 05:31:19.445364 UTC---
| Itration            | 476       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -1.37e+07 |
| Running Env Steps   | 2380000   |
| Running Forward KL  | 28.2      |
| Running Reverse KL  | 848       |
| Running Update Time | 476       |
-----------------------------------
--2024-08-12 05:34:31.273353 UTC---
| Itration            | 477       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.88e+03  |
| Real Sto Return     | 2.23e+03  |
| Reward Loss         | -1.65e+07 |
| Running Env Steps   | 2385000   |
| Running Forward KL  | 33.3      |
| Running Reverse KL  | 846       |
| Running Update Time | 477       |
-----------------------------------
--2024-08-12 05:37:40.963921 UTC---
| Itration            | 478       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.16e+03  |
| Real Sto Return     | 3.07e+03  |
| Reward Loss         | -8.22e+06 |
| Running Env Steps   | 2390000   |
| Running Forward KL  | 31.3      |
| Running Reverse KL  | 733       |
| Running Update Time | 478       |
-----------------------------------
--2024-08-12 05:40:40.385956 UTC---
| Itration            | 479       |
| PAGAR Loss          | nan       |
| Real Det Return     | 699       |
| Real Sto Return     | 328       |
| Reward Loss         | -2.98e+07 |
| Running Env Steps   | 2395000   |
| Running Forward KL  | 53.9      |
| Running Reverse KL  | 1.65e+03  |
| Running Update Time | 479       |
-----------------------------------
--2024-08-12 05:43:43.127501 UTC--
| Itration            | 480      |
| PAGAR Loss          | nan      |
| Real Det Return     | -610     |
| Real Sto Return     | -536     |
| Reward Loss         | -3.5e+07 |
| Running Env Steps   | 2400000  |
| Running Forward KL  | 87.6     |
| Running Reverse KL  | 1.7e+03  |
| Running Update Time | 480      |
----------------------------------
--2024-08-12 05:46:50.270225 UTC---
| Itration            | 481       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.63e+03 |
| Real Sto Return     | -1.44e+03 |
| Reward Loss         | -3.64e+07 |
| Running Env Steps   | 2405000   |
| Running Forward KL  | 94.2      |
| Running Reverse KL  | 852       |
| Running Update Time | 481       |
-----------------------------------
--2024-08-12 05:49:41.856087 UTC--
| Itration            | 482      |
| PAGAR Loss          | nan      |
| Real Det Return     | -248     |
| Real Sto Return     | -281     |
| Reward Loss         | -3.2e+07 |
| Running Env Steps   | 2410000  |
| Running Forward KL  | 139      |
| Running Reverse KL  | 2.3e+03  |
| Running Update Time | 482      |
----------------------------------
--2024-08-12 05:52:34.930128 UTC---
| Itration            | 483       |
| PAGAR Loss          | nan       |
| Real Det Return     | -376      |
| Real Sto Return     | -490      |
| Reward Loss         | -3.28e+07 |
| Running Env Steps   | 2415000   |
| Running Forward KL  | 124       |
| Running Reverse KL  | 2.28e+03  |
| Running Update Time | 483       |
-----------------------------------
--2024-08-12 05:55:40.572428 UTC---
| Itration            | 484       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.91e+03 |
| Real Sto Return     | -1.91e+03 |
| Reward Loss         | -3.2e+07  |
| Running Env Steps   | 2420000   |
| Running Forward KL  | 109       |
| Running Reverse KL  | 1.54e+03  |
| Running Update Time | 484       |
-----------------------------------
--2024-08-12 05:58:48.474827 UTC---
| Itration            | 485       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.74e+03 |
| Real Sto Return     | -1.28e+03 |
| Reward Loss         | -4.14e+07 |
| Running Env Steps   | 2425000   |
| Running Forward KL  | 102       |
| Running Reverse KL  | 935       |
| Running Update Time | 485       |
-----------------------------------
--2024-08-12 06:02:00.283096 UTC---
| Itration            | 486       |
| PAGAR Loss          | nan       |
| Real Det Return     | -913      |
| Real Sto Return     | -1.19e+03 |
| Reward Loss         | -4.07e+07 |
| Running Env Steps   | 2430000   |
| Running Forward KL  | 132       |
| Running Reverse KL  | 668       |
| Running Update Time | 486       |
-----------------------------------
--2024-08-12 06:05:09.916377 UTC---
| Itration            | 487       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.73e+03 |
| Real Sto Return     | -1.68e+03 |
| Reward Loss         | -4.46e+07 |
| Running Env Steps   | 2435000   |
| Running Forward KL  | 116       |
| Running Reverse KL  | 1.04e+03  |
| Running Update Time | 487       |
-----------------------------------
--2024-08-12 06:08:19.791089 UTC---
| Itration            | 488       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.52e+03 |
| Real Sto Return     | -1.7e+03  |
| Reward Loss         | -4.01e+07 |
| Running Env Steps   | 2440000   |
| Running Forward KL  | 106       |
| Running Reverse KL  | 1.29e+03  |
| Running Update Time | 488       |
-----------------------------------
--2024-08-12 06:11:26.870006 UTC---
| Itration            | 489       |
| PAGAR Loss          | nan       |
| Real Det Return     | -1.17e+03 |
| Real Sto Return     | -1.04e+03 |
| Reward Loss         | -3.46e+07 |
| Running Env Steps   | 2445000   |
| Running Forward KL  | 75.4      |
| Running Reverse KL  | 1.34e+03  |
| Running Update Time | 489       |
-----------------------------------
--2024-08-12 06:14:33.116818 UTC---
| Itration            | 490       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.58e+03  |
| Real Sto Return     | 1.6e+03   |
| Reward Loss         | -1.93e+07 |
| Running Env Steps   | 2450000   |
| Running Forward KL  | 42.5      |
| Running Reverse KL  | 667       |
| Running Update Time | 490       |
-----------------------------------
--2024-08-12 06:17:39.999570 UTC---
| Itration            | 491       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.03e+03  |
| Real Sto Return     | 1.84e+03  |
| Reward Loss         | -2.07e+07 |
| Running Env Steps   | 2455000   |
| Running Forward KL  | 42.1      |
| Running Reverse KL  | 971       |
| Running Update Time | 491       |
-----------------------------------
--2024-08-12 06:20:47.341759 UTC---
| Itration            | 492       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.93e+03  |
| Real Sto Return     | 2.11e+03  |
| Reward Loss         | -1.29e+07 |
| Running Env Steps   | 2460000   |
| Running Forward KL  | 30.7      |
| Running Reverse KL  | 347       |
| Running Update Time | 492       |
-----------------------------------
--2024-08-12 06:23:54.398770 UTC---
| Itration            | 493       |
| PAGAR Loss          | 1.71e+07  |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -1.19e+07 |
| Running Env Steps   | 2465000   |
| Running Forward KL  | 31.6      |
| Running Reverse KL  | 449       |
| Running Update Time | 493       |
-----------------------------------
--2024-08-12 06:27:05.396224 UTC---
| Itration            | 494       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.27e+03  |
| Real Sto Return     | 3.22e+03  |
| Reward Loss         | -2.33e+07 |
| Running Env Steps   | 2470000   |
| Running Forward KL  | 37.9      |
| Running Reverse KL  | 182       |
| Running Update Time | 494       |
-----------------------------------
--2024-08-12 06:30:14.007396 UTC---
| Itration            | 495       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.04e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -9.21e+06 |
| Running Env Steps   | 2475000   |
| Running Forward KL  | 31.7      |
| Running Reverse KL  | 234       |
| Running Update Time | 495       |
-----------------------------------
--2024-08-12 06:33:21.969993 UTC--
| Itration            | 496      |
| PAGAR Loss          | nan      |
| Real Det Return     | 4.59e+03 |
| Real Sto Return     | 3.79e+03 |
| Reward Loss         | -6.9e+06 |
| Running Env Steps   | 2480000  |
| Running Forward KL  | 24.4     |
| Running Reverse KL  | 102      |
| Running Update Time | 496      |
----------------------------------
--2024-08-12 06:36:29.511802 UTC---
| Itration            | 497       |
| PAGAR Loss          | -1.32e+08 |
| Real Det Return     | 3.87e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -1.05e+07 |
| Running Env Steps   | 2485000   |
| Running Forward KL  | 34.4      |
| Running Reverse KL  | 909       |
| Running Update Time | 497       |
-----------------------------------
--2024-08-12 06:39:38.179630 UTC---
| Itration            | 498       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.49e+03  |
| Real Sto Return     | 3.03e+03  |
| Reward Loss         | -9.43e+06 |
| Running Env Steps   | 2490000   |
| Running Forward KL  | 28.3      |
| Running Reverse KL  | 604       |
| Running Update Time | 498       |
-----------------------------------
--2024-08-12 06:42:46.728286 UTC---
| Itration            | 499       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.13e+03  |
| Real Sto Return     | 2.22e+03  |
| Reward Loss         | -1.26e+07 |
| Running Env Steps   | 2495000   |
| Running Forward KL  | 31.5      |
| Running Reverse KL  | 418       |
| Running Update Time | 499       |
-----------------------------------
--2024-08-12 06:45:54.590868 UTC---
| Itration            | 500       |
| PAGAR Loss          | 3.84e+07  |
| Real Det Return     | 3.62e+03  |
| Real Sto Return     | 3.73e+03  |
| Reward Loss         | -6.51e+06 |
| Running Env Steps   | 2500000   |
| Running Forward KL  | 25        |
| Running Reverse KL  | 494       |
| Running Update Time | 500       |
-----------------------------------
--2024-08-12 06:48:53.314904 UTC---
| Itration            | 501       |
| PAGAR Loss          | -7e+08    |
| Real Det Return     | 2.03e+03  |
| Real Sto Return     | 1.64e+03  |
| Reward Loss         | -1.38e+07 |
| Running Env Steps   | 2505000   |
| Running Forward KL  | 37.5      |
| Running Reverse KL  | 1.47e+03  |
| Running Update Time | 501       |
-----------------------------------
--2024-08-12 06:51:59.420234 UTC---
| Itration            | 502       |
| PAGAR Loss          | -9.7e+07  |
| Real Det Return     | 3.94e+03  |
| Real Sto Return     | 3.53e+03  |
| Reward Loss         | -1.53e+07 |
| Running Env Steps   | 2510000   |
| Running Forward KL  | 34.1      |
| Running Reverse KL  | 1.01e+03  |
| Running Update Time | 502       |
-----------------------------------
--2024-08-12 06:55:08.019135 UTC---
| Itration            | 503       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.27e+03  |
| Real Sto Return     | 3.78e+03  |
| Reward Loss         | -1.02e+07 |
| Running Env Steps   | 2515000   |
| Running Forward KL  | 28.1      |
| Running Reverse KL  | 145       |
| Running Update Time | 503       |
-----------------------------------
--2024-08-12 06:58:16.810763 UTC---
| Itration            | 504       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 3.63e+03  |
| Reward Loss         | -1.51e+07 |
| Running Env Steps   | 2520000   |
| Running Forward KL  | 27        |
| Running Reverse KL  | 399       |
| Running Update Time | 504       |
-----------------------------------
--2024-08-12 07:01:24.878533 UTC---
| Itration            | 505       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.96e+03  |
| Real Sto Return     | 2.75e+03  |
| Reward Loss         | -1.21e+07 |
| Running Env Steps   | 2525000   |
| Running Forward KL  | 30.2      |
| Running Reverse KL  | 367       |
| Running Update Time | 505       |
-----------------------------------
--2024-08-12 07:04:32.312117 UTC---
| Itration            | 506       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.49e+03  |
| Real Sto Return     | 310       |
| Reward Loss         | -3.86e+07 |
| Running Env Steps   | 2530000   |
| Running Forward KL  | 48.1      |
| Running Reverse KL  | 655       |
| Running Update Time | 506       |
-----------------------------------
--2024-08-12 07:07:38.059521 UTC---
| Itration            | 507       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.6e+03   |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -2.07e+07 |
| Running Env Steps   | 2535000   |
| Running Forward KL  | 33.1      |
| Running Reverse KL  | 758       |
| Running Update Time | 507       |
-----------------------------------
--2024-08-12 07:10:39.967709 UTC---
| Itration            | 508       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 790       |
| Reward Loss         | -3.76e+07 |
| Running Env Steps   | 2540000   |
| Running Forward KL  | 58.9      |
| Running Reverse KL  | 1.35e+03  |
| Running Update Time | 508       |
-----------------------------------
--2024-08-12 07:13:48.052052 UTC---
| Itration            | 509       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.45e+03  |
| Real Sto Return     | 3.05e+03  |
| Reward Loss         | -1.61e+07 |
| Running Env Steps   | 2545000   |
| Running Forward KL  | 40.1      |
| Running Reverse KL  | 273       |
| Running Update Time | 509       |
-----------------------------------
--2024-08-12 07:16:53.717492 UTC---
| Itration            | 510       |
| PAGAR Loss          | -3.63e+08 |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 3.31e+03  |
| Reward Loss         | -7.25e+06 |
| Running Env Steps   | 2550000   |
| Running Forward KL  | 32.7      |
| Running Reverse KL  | 1.35e+03  |
| Running Update Time | 510       |
-----------------------------------
--2024-08-12 07:19:56.322525 UTC---
| Itration            | 511       |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.92e+03  |
| Real Sto Return     | 2.22e+03  |
| Reward Loss         | -1.17e+07 |
| Running Env Steps   | 2555000   |
| Running Forward KL  | 27.5      |
| Running Reverse KL  | 954       |
| Running Update Time | 511       |
-----------------------------------
--2024-08-12 07:23:09.375488 UTC---
| Itration            | 512       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -6.69e+06 |
| Running Env Steps   | 2560000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 251       |
| Running Update Time | 512       |
-----------------------------------
--2024-08-12 07:26:30.449243 UTC---
| Itration            | 513       |
| PAGAR Loss          | -3.22e+08 |
| Real Det Return     | 4.3e+03   |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -5.96e+06 |
| Running Env Steps   | 2565000   |
| Running Forward KL  | 24.1      |
| Running Reverse KL  | 247       |
| Running Update Time | 513       |
-----------------------------------
--2024-08-12 07:29:49.043549 UTC---
| Itration            | 514       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.4e+03   |
| Real Sto Return     | 3.12e+03  |
| Reward Loss         | -9.86e+06 |
| Running Env Steps   | 2570000   |
| Running Forward KL  | 28.5      |
| Running Reverse KL  | 492       |
| Running Update Time | 514       |
-----------------------------------
--2024-08-12 07:33:04.932745 UTC---
| Itration            | 515       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.81e+03  |
| Real Sto Return     | 4.52e+03  |
| Reward Loss         | -5.48e+06 |
| Running Env Steps   | 2575000   |
| Running Forward KL  | 22.9      |
| Running Reverse KL  | 141       |
| Running Update Time | 515       |
-----------------------------------
--2024-08-12 07:36:15.847024 UTC---
| Itration            | 516       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.4e+03   |
| Real Sto Return     | 3.86e+03  |
| Reward Loss         | -1.27e+07 |
| Running Env Steps   | 2580000   |
| Running Forward KL  | 26.9      |
| Running Reverse KL  | 524       |
| Running Update Time | 516       |
-----------------------------------
--2024-08-12 07:39:31.254142 UTC---
| Itration            | 517       |
| PAGAR Loss          | -1.92e+08 |
| Real Det Return     | 4.33e+03  |
| Real Sto Return     | 3.55e+03  |
| Reward Loss         | -5.55e+06 |
| Running Env Steps   | 2585000   |
| Running Forward KL  | 24.9      |
| Running Reverse KL  | 377       |
| Running Update Time | 517       |
-----------------------------------
--2024-08-12 07:42:40.757428 UTC---
| Itration            | 518       |
| PAGAR Loss          | -4.5e+08  |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -4.89e+06 |
| Running Env Steps   | 2590000   |
| Running Forward KL  | 22        |
| Running Reverse KL  | 78        |
| Running Update Time | 518       |
-----------------------------------
--2024-08-12 07:45:43.433375 UTC---
| Itration            | 519       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3e+03     |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -1.17e+07 |
| Running Env Steps   | 2595000   |
| Running Forward KL  | 28        |
| Running Reverse KL  | 729       |
| Running Update Time | 519       |
-----------------------------------
--2024-08-12 07:48:50.064245 UTC---
| Itration            | 520       |
| PAGAR Loss          | -4.74e+08 |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.17e+03  |
| Reward Loss         | -6.31e+06 |
| Running Env Steps   | 2600000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 141       |
| Running Update Time | 520       |
-----------------------------------
--2024-08-12 07:51:58.626307 UTC---
| Itration            | 521       |
| PAGAR Loss          | -4.19e+08 |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -4.11e+06 |
| Running Env Steps   | 2605000   |
| Running Forward KL  | 23.7      |
| Running Reverse KL  | 16.3      |
| Running Update Time | 521       |
-----------------------------------
--2024-08-12 07:55:06.757020 UTC---
| Itration            | 522       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -5.57e+06 |
| Running Env Steps   | 2610000   |
| Running Forward KL  | 26.1      |
| Running Reverse KL  | 271       |
| Running Update Time | 522       |
-----------------------------------
--2024-08-12 07:58:20.583515 UTC---
| Itration            | 523       |
| PAGAR Loss          | -2.31e+08 |
| Real Det Return     | 4.41e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 2615000   |
| Running Forward KL  | 25.8      |
| Running Reverse KL  | 22.7      |
| Running Update Time | 523       |
-----------------------------------
--2024-08-12 08:01:43.968755 UTC---
| Itration            | 524       |
| PAGAR Loss          | -1.79e+10 |
| Real Det Return     | 4.28e+03  |
| Real Sto Return     | 4.06e+03  |
| Reward Loss         | -6.98e+06 |
| Running Env Steps   | 2620000   |
| Running Forward KL  | 26.4      |
| Running Reverse KL  | 896       |
| Running Update Time | 524       |
-----------------------------------
--2024-08-12 08:05:04.076628 UTC---
| Itration            | 525       |
| PAGAR Loss          | -4.1e+08  |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 2625000   |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 68.5      |
| Running Update Time | 525       |
-----------------------------------
--2024-08-12 08:08:11.481539 UTC---
| Itration            | 526       |
| PAGAR Loss          | -2.2e+08  |
| Real Det Return     | 3.82e+03  |
| Real Sto Return     | 3.61e+03  |
| Reward Loss         | -6.07e+06 |
| Running Env Steps   | 2630000   |
| Running Forward KL  | 30.4      |
| Running Reverse KL  | 468       |
| Running Update Time | 526       |
-----------------------------------
--2024-08-12 08:11:34.596101 UTC---
| Itration            | 527       |
| PAGAR Loss          | -4.81e+08 |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 4.33e+03  |
| Reward Loss         | -4.41e+06 |
| Running Env Steps   | 2635000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 52.1      |
| Running Update Time | 527       |
-----------------------------------
--2024-08-12 08:14:59.771112 UTC---
| Itration            | 528       |
| PAGAR Loss          | -7.7e+09  |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 3.36e+03  |
| Reward Loss         | -6.66e+06 |
| Running Env Steps   | 2640000   |
| Running Forward KL  | 28.9      |
| Running Reverse KL  | 542       |
| Running Update Time | 528       |
-----------------------------------
--2024-08-12 08:18:27.948236 UTC---
| Itration            | 529       |
| PAGAR Loss          | -1.01e+09 |
| Real Det Return     | 4.67e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -2.47e+06 |
| Running Env Steps   | 2645000   |
| Running Forward KL  | 22.9      |
| Running Reverse KL  | 49.6      |
| Running Update Time | 529       |
-----------------------------------
--2024-08-12 08:21:58.134399 UTC---
| Itration            | 530       |
| PAGAR Loss          | -4.41e+08 |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 2650000   |
| Running Forward KL  | 22.7      |
| Running Reverse KL  | 206       |
| Running Update Time | 530       |
-----------------------------------
--2024-08-12 08:25:26.671009 UTC---
| Itration            | 531       |
| PAGAR Loss          | -8.91e+08 |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -4.97e+06 |
| Running Env Steps   | 2655000   |
| Running Forward KL  | 24.3      |
| Running Reverse KL  | 20.3      |
| Running Update Time | 531       |
-----------------------------------
--2024-08-12 08:28:58.246236 UTC---
| Itration            | 532       |
| PAGAR Loss          | 1.35e+08  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -2.31e+06 |
| Running Env Steps   | 2660000   |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 17.3      |
| Running Update Time | 532       |
-----------------------------------
--2024-08-12 08:32:19.355715 UTC---
| Itration            | 533       |
| PAGAR Loss          | -5.4e+08  |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -3.65e+06 |
| Running Env Steps   | 2665000   |
| Running Forward KL  | 21        |
| Running Reverse KL  | 245       |
| Running Update Time | 533       |
-----------------------------------
--2024-08-12 08:35:36.279656 UTC---
| Itration            | 534       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -3.63e+06 |
| Running Env Steps   | 2670000   |
| Running Forward KL  | 24        |
| Running Reverse KL  | 134       |
| Running Update Time | 534       |
-----------------------------------
--2024-08-12 08:38:53.355420 UTC---
| Itration            | 535       |
| PAGAR Loss          | -4.62e+09 |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 2675000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 87.9      |
| Running Update Time | 535       |
-----------------------------------
--2024-08-12 08:42:08.099173 UTC---
| Itration            | 536       |
| PAGAR Loss          | 8.34e+09  |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -3.63e+06 |
| Running Env Steps   | 2680000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 43.8      |
| Running Update Time | 536       |
-----------------------------------
--2024-08-12 08:45:19.888978 UTC---
| Itration            | 537       |
| PAGAR Loss          | -7.64e+08 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -2.98e+06 |
| Running Env Steps   | 2685000   |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 51.5      |
| Running Update Time | 537       |
-----------------------------------
--2024-08-12 08:48:26.388268 UTC---
| Itration            | 538       |
| PAGAR Loss          | -3.31e+08 |
| Real Det Return     | 4.2e+03   |
| Real Sto Return     | 3.76e+03  |
| Reward Loss         | -4.75e+06 |
| Running Env Steps   | 2690000   |
| Running Forward KL  | 22        |
| Running Reverse KL  | 343       |
| Running Update Time | 538       |
-----------------------------------
--2024-08-12 08:51:35.152593 UTC---
| Itration            | 539       |
| PAGAR Loss          | -7.48e+08 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -3.45e+06 |
| Running Env Steps   | 2695000   |
| Running Forward KL  | 24.4      |
| Running Reverse KL  | 278       |
| Running Update Time | 539       |
-----------------------------------
--2024-08-12 08:54:38.765036 UTC---
| Itration            | 540       |
| PAGAR Loss          | -3.37e+08 |
| Real Det Return     | 3.83e+03  |
| Real Sto Return     | 3.3e+03   |
| Reward Loss         | -8.43e+06 |
| Running Env Steps   | 2700000   |
| Running Forward KL  | 25.5      |
| Running Reverse KL  | 865       |
| Running Update Time | 540       |
-----------------------------------
--2024-08-12 08:57:47.844593 UTC---
| Itration            | 541       |
| PAGAR Loss          | 4.94e+07  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 2705000   |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 23.7      |
| Running Update Time | 541       |
-----------------------------------
--2024-08-12 09:00:54.696228 UTC---
| Itration            | 542       |
| PAGAR Loss          | -8.03e+07 |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 2710000   |
| Running Forward KL  | 25.2      |
| Running Reverse KL  | 413       |
| Running Update Time | 542       |
-----------------------------------
--2024-08-12 09:04:03.742588 UTC---
| Itration            | 543       |
| PAGAR Loss          | -7.69e+08 |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -4.48e+06 |
| Running Env Steps   | 2715000   |
| Running Forward KL  | 26        |
| Running Reverse KL  | 250       |
| Running Update Time | 543       |
-----------------------------------
--2024-08-12 09:07:12.074288 UTC---
| Itration            | 544       |
| PAGAR Loss          | -2.74e+08 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -3.33e+06 |
| Running Env Steps   | 2720000   |
| Running Forward KL  | 18.5      |
| Running Reverse KL  | 130       |
| Running Update Time | 544       |
-----------------------------------
--2024-08-12 09:10:20.770624 UTC---
| Itration            | 545       |
| PAGAR Loss          | -6.52e+08 |
| Real Det Return     | 4.76e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -5.14e+06 |
| Running Env Steps   | 2725000   |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 260       |
| Running Update Time | 545       |
-----------------------------------
--2024-08-12 09:13:29.482043 UTC---
| Itration            | 546       |
| PAGAR Loss          | -7.65e+07 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 2730000   |
| Running Forward KL  | 21        |
| Running Reverse KL  | 14.3      |
| Running Update Time | 546       |
-----------------------------------
--2024-08-12 09:16:34.957699 UTC---
| Itration            | 547       |
| PAGAR Loss          | -2.14e+07 |
| Real Det Return     | 4.42e+03  |
| Real Sto Return     | 3.5e+03   |
| Reward Loss         | -5.3e+06  |
| Running Env Steps   | 2735000   |
| Running Forward KL  | 25.4      |
| Running Reverse KL  | 359       |
| Running Update Time | 547       |
-----------------------------------
--2024-08-12 09:19:42.735851 UTC---
| Itration            | 548       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -7.18e+06 |
| Running Env Steps   | 2740000   |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 76.1      |
| Running Update Time | 548       |
-----------------------------------
--2024-08-12 09:22:50.859531 UTC---
| Itration            | 549       |
| PAGAR Loss          | -7.36e+09 |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -6.3e+06  |
| Running Env Steps   | 2745000   |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 230       |
| Running Update Time | 549       |
-----------------------------------
--2024-08-12 09:25:59.066556 UTC---
| Itration            | 550       |
| PAGAR Loss          | -8.03e+08 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.25e+03  |
| Reward Loss         | -2.41e+06 |
| Running Env Steps   | 2750000   |
| Running Forward KL  | 22.5      |
| Running Reverse KL  | 87.4      |
| Running Update Time | 550       |
-----------------------------------
--2024-08-12 09:29:05.107463 UTC---
| Itration            | 551       |
| PAGAR Loss          | -3.97e+11 |
| Real Det Return     | 3.56e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -5.32e+06 |
| Running Env Steps   | 2755000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 643       |
| Running Update Time | 551       |
-----------------------------------
--2024-08-12 09:32:13.647311 UTC---
| Itration            | 552       |
| PAGAR Loss          | -9.06e+08 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.48e+03  |
| Reward Loss         | -3.31e+06 |
| Running Env Steps   | 2760000   |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 18.1      |
| Running Update Time | 552       |
-----------------------------------
--2024-08-12 09:35:23.726174 UTC---
| Itration            | 553       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.95e+03  |
| Real Sto Return     | 3.47e+03  |
| Reward Loss         | -1.55e+07 |
| Running Env Steps   | 2765000   |
| Running Forward KL  | 28.3      |
| Running Reverse KL  | 119       |
| Running Update Time | 553       |
-----------------------------------
--2024-08-12 09:38:31.595978 UTC---
| Itration            | 554       |
| PAGAR Loss          | -4.13e+09 |
| Real Det Return     | 3.39e+03  |
| Real Sto Return     | 2.04e+03  |
| Reward Loss         | -1.71e+07 |
| Running Env Steps   | 2770000   |
| Running Forward KL  | 28.2      |
| Running Reverse KL  | 661       |
| Running Update Time | 554       |
-----------------------------------
--2024-08-12 09:41:40.337902 UTC---
| Itration            | 555       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -5.08e+06 |
| Running Env Steps   | 2775000   |
| Running Forward KL  | 22.8      |
| Running Reverse KL  | 46.1      |
| Running Update Time | 555       |
-----------------------------------
--2024-08-12 09:44:45.378643 UTC---
| Itration            | 556       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.41e+03  |
| Real Sto Return     | 3.37e+03  |
| Reward Loss         | -1.11e+07 |
| Running Env Steps   | 2780000   |
| Running Forward KL  | 24.4      |
| Running Reverse KL  | 830       |
| Running Update Time | 556       |
-----------------------------------
--2024-08-12 09:47:52.222821 UTC---
| Itration            | 557       |
| PAGAR Loss          | -1.52e+09 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 3.69e+03  |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 2785000   |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 190       |
| Running Update Time | 557       |
-----------------------------------
--2024-08-12 09:51:00.145693 UTC---
| Itration            | 558       |
| PAGAR Loss          | -1e+09    |
| Real Det Return     | 4.89e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -2.36e+06 |
| Running Env Steps   | 2790000   |
| Running Forward KL  | 21.8      |
| Running Reverse KL  | 17.5      |
| Running Update Time | 558       |
-----------------------------------
--2024-08-12 09:54:05.481738 UTC---
| Itration            | 559       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.5e+03   |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -3.37e+06 |
| Running Env Steps   | 2795000   |
| Running Forward KL  | 22.5      |
| Running Reverse KL  | 135       |
| Running Update Time | 559       |
-----------------------------------
--2024-08-12 09:57:12.228595 UTC---
| Itration            | 560       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -3.31e+06 |
| Running Env Steps   | 2800000   |
| Running Forward KL  | 25.7      |
| Running Reverse KL  | 48.7      |
| Running Update Time | 560       |
-----------------------------------
--2024-08-12 10:00:18.934031 UTC---
| Itration            | 561       |
| PAGAR Loss          | -1.05e+09 |
| Real Det Return     | 4.44e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -4.94e+06 |
| Running Env Steps   | 2805000   |
| Running Forward KL  | 23.2      |
| Running Reverse KL  | 327       |
| Running Update Time | 561       |
-----------------------------------
--2024-08-12 10:03:25.810438 UTC---
| Itration            | 562       |
| PAGAR Loss          | -1.41e+07 |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 4.29e+03  |
| Reward Loss         | -2.43e+06 |
| Running Env Steps   | 2810000   |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 17.2      |
| Running Update Time | 562       |
-----------------------------------
--2024-08-12 10:06:33.909035 UTC---
| Itration            | 563       |
| PAGAR Loss          | -3.21e+08 |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 2815000   |
| Running Forward KL  | 22.5      |
| Running Reverse KL  | 251       |
| Running Update Time | 563       |
-----------------------------------
--2024-08-12 10:09:42.010047 UTC--
| Itration            | 564      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.32e+03 |
| Real Sto Return     | 4.99e+03 |
| Reward Loss         | -3.9e+06 |
| Running Env Steps   | 2820000  |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 195      |
| Running Update Time | 564      |
----------------------------------
--2024-08-12 10:12:50.899831 UTC---
| Itration            | 565       |
| PAGAR Loss          | -1.87e+08 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -2.26e+06 |
| Running Env Steps   | 2825000   |
| Running Forward KL  | 23.8      |
| Running Reverse KL  | 17        |
| Running Update Time | 565       |
-----------------------------------
--2024-08-12 10:15:59.357968 UTC---
| Itration            | 566       |
| PAGAR Loss          | -9.34e+08 |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 2830000   |
| Running Forward KL  | 25.8      |
| Running Reverse KL  | 20.2      |
| Running Update Time | 566       |
-----------------------------------
--2024-08-12 10:19:07.413050 UTC---
| Itration            | 567       |
| PAGAR Loss          | -2.96e+08 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -3.85e+06 |
| Running Env Steps   | 2835000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 237       |
| Running Update Time | 567       |
-----------------------------------
--2024-08-12 10:22:14.927944 UTC---
| Itration            | 568       |
| PAGAR Loss          | -1.06e+09 |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -4e+06    |
| Running Env Steps   | 2840000   |
| Running Forward KL  | 26.6      |
| Running Reverse KL  | 606       |
| Running Update Time | 568       |
-----------------------------------
--2024-08-12 10:25:20.197598 UTC---
| Itration            | 569       |
| PAGAR Loss          | -4.55e+08 |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -5.5e+06  |
| Running Env Steps   | 2845000   |
| Running Forward KL  | 21.9      |
| Running Reverse KL  | 338       |
| Running Update Time | 569       |
-----------------------------------
--2024-08-12 10:28:28.232137 UTC---
| Itration            | 570       |
| PAGAR Loss          | -7.04e+08 |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -2.85e+06 |
| Running Env Steps   | 2850000   |
| Running Forward KL  | 22        |
| Running Reverse KL  | 19.3      |
| Running Update Time | 570       |
-----------------------------------
--2024-08-12 10:31:36.330624 UTC---
| Itration            | 571       |
| PAGAR Loss          | -9.33e+08 |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -5.62e+06 |
| Running Env Steps   | 2855000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 520       |
| Running Update Time | 571       |
-----------------------------------
--2024-08-12 10:34:42.534914 UTC---
| Itration            | 572       |
| PAGAR Loss          | -2.57e+08 |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -3.77e+06 |
| Running Env Steps   | 2860000   |
| Running Forward KL  | 27        |
| Running Reverse KL  | 224       |
| Running Update Time | 572       |
-----------------------------------
--2024-08-12 10:37:48.828700 UTC---
| Itration            | 573       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 2865000   |
| Running Forward KL  | 24.2      |
| Running Reverse KL  | 254       |
| Running Update Time | 573       |
-----------------------------------
--2024-08-12 10:40:57.022287 UTC---
| Itration            | 574       |
| PAGAR Loss          | -1.24e+08 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -4.68e+06 |
| Running Env Steps   | 2870000   |
| Running Forward KL  | 25.1      |
| Running Reverse KL  | 260       |
| Running Update Time | 574       |
-----------------------------------
--2024-08-12 10:44:03.836245 UTC---
| Itration            | 575       |
| PAGAR Loss          | -5.74e+08 |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -2.77e+06 |
| Running Env Steps   | 2875000   |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 25        |
| Running Update Time | 575       |
-----------------------------------
--2024-08-12 10:47:08.699904 UTC---
| Itration            | 576       |
| PAGAR Loss          | -1.2e+08  |
| Real Det Return     | 3.83e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -5.06e+06 |
| Running Env Steps   | 2880000   |
| Running Forward KL  | 23.9      |
| Running Reverse KL  | 383       |
| Running Update Time | 576       |
-----------------------------------
--2024-08-12 10:50:16.109440 UTC---
| Itration            | 577       |
| PAGAR Loss          | -9.75e+08 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -2.34e+06 |
| Running Env Steps   | 2885000   |
| Running Forward KL  | 23.8      |
| Running Reverse KL  | 23.1      |
| Running Update Time | 577       |
-----------------------------------
--2024-08-12 10:53:24.558157 UTC---
| Itration            | 578       |
| PAGAR Loss          | -2.68e+08 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -2.77e+06 |
| Running Env Steps   | 2890000   |
| Running Forward KL  | 22.7      |
| Running Reverse KL  | 25.5      |
| Running Update Time | 578       |
-----------------------------------
--2024-08-12 10:56:32.502822 UTC---
| Itration            | 579       |
| PAGAR Loss          | 8.81e+08  |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -2.91e+06 |
| Running Env Steps   | 2895000   |
| Running Forward KL  | 25.9      |
| Running Reverse KL  | 24.4      |
| Running Update Time | 579       |
-----------------------------------
--2024-08-12 10:59:38.356232 UTC---
| Itration            | 580       |
| PAGAR Loss          | 3.32e+08  |
| Real Det Return     | 4.18e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -1.03e+07 |
| Running Env Steps   | 2900000   |
| Running Forward KL  | 30.6      |
| Running Reverse KL  | 882       |
| Running Update Time | 580       |
-----------------------------------
--2024-08-12 11:02:46.400614 UTC---
| Itration            | 581       |
| PAGAR Loss          | 5.07e+08  |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -3.85e+06 |
| Running Env Steps   | 2905000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 83.5      |
| Running Update Time | 581       |
-----------------------------------
--2024-08-12 11:05:54.364295 UTC---
| Itration            | 582       |
| PAGAR Loss          | -5.37e+08 |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -2.72e+06 |
| Running Env Steps   | 2910000   |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 69        |
| Running Update Time | 582       |
-----------------------------------
--2024-08-12 11:09:01.411628 UTC---
| Itration            | 583       |
| PAGAR Loss          | -1.38e+09 |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -6.91e+06 |
| Running Env Steps   | 2915000   |
| Running Forward KL  | 25        |
| Running Reverse KL  | 665       |
| Running Update Time | 583       |
-----------------------------------
--2024-08-12 11:12:09.282658 UTC---
| Itration            | 584       |
| PAGAR Loss          | -4.85e+08 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -1.4e+06  |
| Running Env Steps   | 2920000   |
| Running Forward KL  | 26.5      |
| Running Reverse KL  | 18.2      |
| Running Update Time | 584       |
-----------------------------------
--2024-08-12 11:15:18.222941 UTC---
| Itration            | 585       |
| PAGAR Loss          | -8.08e+08 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.64e+06 |
| Running Env Steps   | 2925000   |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 17.3      |
| Running Update Time | 585       |
-----------------------------------
--2024-08-12 11:18:26.575654 UTC---
| Itration            | 586       |
| PAGAR Loss          | -7.53e+08 |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -4.66e+06 |
| Running Env Steps   | 2930000   |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 246       |
| Running Update Time | 586       |
-----------------------------------
--2024-08-12 11:21:34.435348 UTC---
| Itration            | 587       |
| PAGAR Loss          | -9.27e+08 |
| Real Det Return     | 4.69e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -2.05e+06 |
| Running Env Steps   | 2935000   |
| Running Forward KL  | 25.9      |
| Running Reverse KL  | 21        |
| Running Update Time | 587       |
-----------------------------------
--2024-08-12 11:24:43.180551 UTC---
| Itration            | 588       |
| PAGAR Loss          | -3.57e+07 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -3.84e+06 |
| Running Env Steps   | 2940000   |
| Running Forward KL  | 25.8      |
| Running Reverse KL  | 22.4      |
| Running Update Time | 588       |
-----------------------------------
--2024-08-12 11:27:52.264770 UTC---
| Itration            | 589       |
| PAGAR Loss          | -8.4e+08  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.78e+06 |
| Running Env Steps   | 2945000   |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 15        |
| Running Update Time | 589       |
-----------------------------------
--2024-08-12 11:31:01.526282 UTC---
| Itration            | 590       |
| PAGAR Loss          | -2.21e+08 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -2.12e+06 |
| Running Env Steps   | 2950000   |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 19.1      |
| Running Update Time | 590       |
-----------------------------------
--2024-08-12 11:34:09.493664 UTC---
| Itration            | 591       |
| PAGAR Loss          | -9.83e+10 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -3.31e+06 |
| Running Env Steps   | 2955000   |
| Running Forward KL  | 26.9      |
| Running Reverse KL  | 255       |
| Running Update Time | 591       |
-----------------------------------
--2024-08-12 11:37:15.244934 UTC---
| Itration            | 592       |
| PAGAR Loss          | -1.61e+08 |
| Real Det Return     | 4.83e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -3.43e+06 |
| Running Env Steps   | 2960000   |
| Running Forward KL  | 26        |
| Running Reverse KL  | 303       |
| Running Update Time | 592       |
-----------------------------------
--2024-08-12 11:40:22.272428 UTC---
| Itration            | 593       |
| PAGAR Loss          | -2.6e+08  |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 2965000   |
| Running Forward KL  | 25        |
| Running Reverse KL  | 299       |
| Running Update Time | 593       |
-----------------------------------
--2024-08-12 11:43:29.804817 UTC---
| Itration            | 594       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.52e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -7.19e+06 |
| Running Env Steps   | 2970000   |
| Running Forward KL  | 29.3      |
| Running Reverse KL  | 57.3      |
| Running Update Time | 594       |
-----------------------------------
--2024-08-12 11:46:35.601127 UTC---
| Itration            | 595       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.73e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -1.13e+07 |
| Running Env Steps   | 2975000   |
| Running Forward KL  | 30        |
| Running Reverse KL  | 831       |
| Running Update Time | 595       |
-----------------------------------
--2024-08-12 11:49:43.579521 UTC---
| Itration            | 596       |
| PAGAR Loss          | -1.35e+09 |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -2.1e+06  |
| Running Env Steps   | 2980000   |
| Running Forward KL  | 22.2      |
| Running Reverse KL  | 239       |
| Running Update Time | 596       |
-----------------------------------
--2024-08-12 11:52:51.776017 UTC---
| Itration            | 597       |
| PAGAR Loss          | -2.51e+07 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 2985000   |
| Running Forward KL  | 23.1      |
| Running Reverse KL  | 14.3      |
| Running Update Time | 597       |
-----------------------------------
--2024-08-12 11:55:59.179440 UTC---
| Itration            | 598       |
| PAGAR Loss          | -8.57e+08 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -6.17e+06 |
| Running Env Steps   | 2990000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 636       |
| Running Update Time | 598       |
-----------------------------------
--2024-08-12 11:59:07.817658 UTC---
| Itration            | 599       |
| PAGAR Loss          | -5.14e+08 |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -4.72e+06 |
| Running Env Steps   | 2995000   |
| Running Forward KL  | 23.2      |
| Running Reverse KL  | 197       |
| Running Update Time | 599       |
-----------------------------------
