Logging to logs/Walker2dFH-v0/exp-16/fkl/2024_08_11_05_54_32
--2024-08-11 05:55:57.769605 UTC---
| Itration            | 0         |
| Real Det Return     | -13.1     |
| Real Sto Return     | -23.6     |
| Reward Loss         | -2.55e+05 |
| Running Env Steps   | 0         |
| Running Forward KL  | 26.3      |
| Running Reverse KL  | 402       |
| Running Update Time | 0         |
-----------------------------------
--2024-08-11 05:57:22.902939 UTC--
| Itration            | 1        |
| Real Det Return     | -17      |
| Real Sto Return     | -26.9    |
| Reward Loss         | -5.6e+05 |
| Running Env Steps   | 5000     |
| Running Forward KL  | 25.8     |
| Running Reverse KL  | 393      |
| Running Update Time | 1        |
----------------------------------
--2024-08-11 05:58:36.342733 UTC--
| Itration            | 2        |
| Real Det Return     | 616      |
| Real Sto Return     | 401      |
| Reward Loss         | 2.43e+05 |
| Running Env Steps   | 10000    |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 324      |
| Running Update Time | 2        |
----------------------------------
--2024-08-11 05:59:44.804022 UTC---
| Itration            | 3         |
| Real Det Return     | 646       |
| Real Sto Return     | 420       |
| Reward Loss         | -2.51e+05 |
| Running Env Steps   | 15000     |
| Running Forward KL  | 22        |
| Running Reverse KL  | 328       |
| Running Update Time | 3         |
-----------------------------------
--2024-08-11 06:00:50.999110 UTC---
| Itration            | 4         |
| Real Det Return     | 449       |
| Real Sto Return     | 413       |
| Reward Loss         | -1.81e+05 |
| Running Env Steps   | 20000     |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 324       |
| Running Update Time | 4         |
-----------------------------------
--2024-08-11 06:02:00.778252 UTC--
| Itration            | 5        |
| Real Det Return     | 520      |
| Real Sto Return     | 496      |
| Reward Loss         | 6.09e+05 |
| Running Env Steps   | 25000    |
| Running Forward KL  | 21.3     |
| Running Reverse KL  | 312      |
| Running Update Time | 5        |
----------------------------------
--2024-08-11 06:03:10.596701 UTC--
| Itration            | 6        |
| Real Det Return     | 541      |
| Real Sto Return     | 478      |
| Reward Loss         | 7.8e+05  |
| Running Env Steps   | 30000    |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 268      |
| Running Update Time | 6        |
----------------------------------
--2024-08-11 06:04:32.245331 UTC--
| Itration            | 7        |
| Real Det Return     | 781      |
| Real Sto Return     | 883      |
| Reward Loss         | 1.08e+06 |
| Running Env Steps   | 35000    |
| Running Forward KL  | 21.2     |
| Running Reverse KL  | 162      |
| Running Update Time | 7        |
----------------------------------
--2024-08-11 06:05:50.090588 UTC---
| Itration            | 8         |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 284       |
| Reward Loss         | -4.31e+05 |
| Running Env Steps   | 40000     |
| Running Forward KL  | 23        |
| Running Reverse KL  | 308       |
| Running Update Time | 8         |
-----------------------------------
--2024-08-11 06:07:43.218993 UTC--
| Itration            | 9        |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 941      |
| Reward Loss         | 1.23e+06 |
| Running Env Steps   | 45000    |
| Running Forward KL  | 22.4     |
| Running Reverse KL  | 108      |
| Running Update Time | 9        |
----------------------------------
--2024-08-11 06:09:39.336808 UTC--
| Itration            | 10       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 979      |
| Reward Loss         | 1.43e+06 |
| Running Env Steps   | 50000    |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 18.6     |
| Running Update Time | 10       |
----------------------------------
--2024-08-11 06:11:33.314604 UTC--
| Itration            | 11       |
| Real Det Return     | 934      |
| Real Sto Return     | 954      |
| Reward Loss         | 1.23e+06 |
| Running Env Steps   | 55000    |
| Running Forward KL  | 22.1     |
| Running Reverse KL  | 63.6     |
| Running Update Time | 11       |
----------------------------------
--2024-08-11 06:13:21.489820 UTC--
| Itration            | 12       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 680      |
| Reward Loss         | 1.24e+06 |
| Running Env Steps   | 60000    |
| Running Forward KL  | 21.9     |
| Running Reverse KL  | 62.7     |
| Running Update Time | 12       |
----------------------------------
--2024-08-11 06:15:05.102514 UTC--
| Itration            | 13       |
| Real Det Return     | 757      |
| Real Sto Return     | 774      |
| Reward Loss         | 1.24e+06 |
| Running Env Steps   | 65000    |
| Running Forward KL  | 22       |
| Running Reverse KL  | 85.1     |
| Running Update Time | 13       |
----------------------------------
--2024-08-11 06:17:13.831600 UTC--
| Itration            | 14       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 967      |
| Reward Loss         | 1.44e+06 |
| Running Env Steps   | 70000    |
| Running Forward KL  | 22.2     |
| Running Reverse KL  | 15.1     |
| Running Update Time | 14       |
----------------------------------
--2024-08-11 06:19:34.858472 UTC--
| Itration            | 15       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 986      |
| Reward Loss         | 1.29e+06 |
| Running Env Steps   | 75000    |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 35.8     |
| Running Update Time | 15       |
----------------------------------
--2024-08-11 06:21:52.572533 UTC--
| Itration            | 16       |
| Real Det Return     | 976      |
| Real Sto Return     | 899      |
| Reward Loss         | 6.79e+05 |
| Running Env Steps   | 80000    |
| Running Forward KL  | 22       |
| Running Reverse KL  | 125      |
| Running Update Time | 16       |
----------------------------------
--2024-08-11 06:24:15.377016 UTC--
| Itration            | 17       |
| Real Det Return     | 1.01e+03 |
| Real Sto Return     | 945      |
| Reward Loss         | 1.08e+06 |
| Running Env Steps   | 85000    |
| Running Forward KL  | 22.5     |
| Running Reverse KL  | 60.8     |
| Running Update Time | 17       |
----------------------------------
--2024-08-11 06:26:38.118825 UTC--
| Itration            | 18       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 977      |
| Reward Loss         | 1.15e+06 |
| Running Env Steps   | 90000    |
| Running Forward KL  | 22       |
| Running Reverse KL  | 14.6     |
| Running Update Time | 18       |
----------------------------------
--2024-08-11 06:28:57.627600 UTC--
| Itration            | 19       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 984      |
| Reward Loss         | 8.57e+05 |
| Running Env Steps   | 95000    |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 41.3     |
| Running Update Time | 19       |
----------------------------------
--2024-08-11 06:31:19.750077 UTC--
| Itration            | 20       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 936      |
| Reward Loss         | 9.65e+05 |
| Running Env Steps   | 100000   |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 51       |
| Running Update Time | 20       |
----------------------------------
--2024-08-11 06:33:39.913151 UTC--
| Itration            | 21       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 990      |
| Reward Loss         | 8.57e+05 |
| Running Env Steps   | 105000   |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 13.6     |
| Running Update Time | 21       |
----------------------------------
--2024-08-11 06:36:01.224111 UTC--
| Itration            | 22       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 983      |
| Reward Loss         | 8.56e+05 |
| Running Env Steps   | 110000   |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 14       |
| Running Update Time | 22       |
----------------------------------
--2024-08-11 06:38:23.614216 UTC--
| Itration            | 23       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 977      |
| Reward Loss         | 5.71e+05 |
| Running Env Steps   | 115000   |
| Running Forward KL  | 21.3     |
| Running Reverse KL  | 47.9     |
| Running Update Time | 23       |
----------------------------------
--2024-08-11 06:40:46.418520 UTC--
| Itration            | 24       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 1.02e+03 |
| Reward Loss         | 8.02e+05 |
| Running Env Steps   | 120000   |
| Running Forward KL  | 22       |
| Running Reverse KL  | 14.3     |
| Running Update Time | 24       |
----------------------------------
--2024-08-11 06:43:05.975697 UTC--
| Itration            | 25       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 999      |
| Reward Loss         | 5.12e+05 |
| Running Env Steps   | 125000   |
| Running Forward KL  | 22.1     |
| Running Reverse KL  | 38.6     |
| Running Update Time | 25       |
----------------------------------
--2024-08-11 06:45:29.456535 UTC--
| Itration            | 26       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 936      |
| Reward Loss         | 5.11e+05 |
| Running Env Steps   | 130000   |
| Running Forward KL  | 22.2     |
| Running Reverse KL  | 45.9     |
| Running Update Time | 26       |
----------------------------------
--2024-08-11 06:47:50.582412 UTC--
| Itration            | 27       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.02e+03 |
| Reward Loss         | 4.99e+05 |
| Running Env Steps   | 135000   |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 17.4     |
| Running Update Time | 27       |
----------------------------------
--2024-08-11 06:50:10.498835 UTC--
| Itration            | 28       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.03e+03 |
| Reward Loss         | 4.1e+05  |
| Running Env Steps   | 140000   |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 13.4     |
| Running Update Time | 28       |
----------------------------------
--2024-08-11 06:52:35.609971 UTC--
| Itration            | 29       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 1.01e+03 |
| Reward Loss         | 4.57e+05 |
| Running Env Steps   | 145000   |
| Running Forward KL  | 21.9     |
| Running Reverse KL  | 14.1     |
| Running Update Time | 29       |
----------------------------------
--2024-08-11 06:54:56.396132 UTC--
| Itration            | 30       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1e+03    |
| Reward Loss         | 3.19e+05 |
| Running Env Steps   | 150000   |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 13.8     |
| Running Update Time | 30       |
----------------------------------
--2024-08-11 06:57:17.028505 UTC--
| Itration            | 31       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.04e+03 |
| Reward Loss         | 3.13e+05 |
| Running Env Steps   | 155000   |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 18.2     |
| Running Update Time | 31       |
----------------------------------
--2024-08-11 06:59:42.064850 UTC--
| Itration            | 32       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.02e+03 |
| Reward Loss         | 1.27e+05 |
| Running Env Steps   | 160000   |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 35.4     |
| Running Update Time | 32       |
----------------------------------
--2024-08-11 07:02:04.192678 UTC--
| Itration            | 33       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.04e+03 |
| Reward Loss         | 1.85e+05 |
| Running Env Steps   | 165000   |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 13.8     |
| Running Update Time | 33       |
----------------------------------
--2024-08-11 07:04:25.575427 UTC--
| Itration            | 34       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.02e+03 |
| Reward Loss         | 1.18e+05 |
| Running Env Steps   | 170000   |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 14.1     |
| Running Update Time | 34       |
----------------------------------
--2024-08-11 07:06:51.408352 UTC--
| Itration            | 35       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.04e+03 |
| Reward Loss         | 8.17e+04 |
| Running Env Steps   | 175000   |
| Running Forward KL  | 22.1     |
| Running Reverse KL  | 13.8     |
| Running Update Time | 35       |
----------------------------------
--2024-08-11 07:09:11.443663 UTC--
| Itration            | 36       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.05e+03 |
| Reward Loss         | 2.98e+04 |
| Running Env Steps   | 180000   |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 14.2     |
| Running Update Time | 36       |
----------------------------------
--2024-08-11 07:11:35.629132 UTC---
| Itration            | 37        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.03e+03  |
| Reward Loss         | -5.28e+04 |
| Running Env Steps   | 185000    |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 14.1      |
| Running Update Time | 37        |
-----------------------------------
--2024-08-11 07:14:01.561479 UTC---
| Itration            | 38        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.01e+03  |
| Reward Loss         | -1.12e+05 |
| Running Env Steps   | 190000    |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 14.8      |
| Running Update Time | 38        |
-----------------------------------
--2024-08-11 07:16:23.415507 UTC---
| Itration            | 39        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.02e+03  |
| Reward Loss         | -2.39e+05 |
| Running Env Steps   | 195000    |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 21.9      |
| Running Update Time | 39        |
-----------------------------------
--2024-08-11 07:18:47.338242 UTC---
| Itration            | 40        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -2.47e+05 |
| Running Env Steps   | 200000    |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 40        |
-----------------------------------
--2024-08-11 07:21:11.380006 UTC--
| Itration            | 41       |
| Real Det Return     | 1.02e+03 |
| Real Sto Return     | 1.01e+03 |
| Reward Loss         | -3.7e+05 |
| Running Env Steps   | 205000   |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 14.2     |
| Running Update Time | 41       |
----------------------------------
--2024-08-11 07:23:33.205994 UTC---
| Itration            | 42        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -3.58e+05 |
| Running Env Steps   | 210000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 14        |
| Running Update Time | 42        |
-----------------------------------
--2024-08-11 07:25:59.924222 UTC---
| Itration            | 43        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -4.05e+05 |
| Running Env Steps   | 215000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 43        |
-----------------------------------
--2024-08-11 07:28:24.927444 UTC---
| Itration            | 44        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -4.81e+05 |
| Running Env Steps   | 220000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 44        |
-----------------------------------
--2024-08-11 07:30:45.463063 UTC---
| Itration            | 45        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.02e+03  |
| Reward Loss         | -6.06e+05 |
| Running Env Steps   | 225000    |
| Running Forward KL  | 21.7      |
| Running Reverse KL  | 30.6      |
| Running Update Time | 45        |
-----------------------------------
--2024-08-11 07:33:12.391452 UTC---
| Itration            | 46        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -6.25e+05 |
| Running Env Steps   | 230000    |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 13.8      |
| Running Update Time | 46        |
-----------------------------------
--2024-08-11 07:35:36.885505 UTC---
| Itration            | 47        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.03e+03  |
| Reward Loss         | -7.38e+05 |
| Running Env Steps   | 235000    |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 47        |
-----------------------------------
--2024-08-11 07:37:58.807211 UTC---
| Itration            | 48        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -6.09e+05 |
| Running Env Steps   | 240000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 14.3      |
| Running Update Time | 48        |
-----------------------------------
--2024-08-11 07:40:26.045697 UTC--
| Itration            | 49       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.06e+03 |
| Reward Loss         | -8.2e+05 |
| Running Env Steps   | 245000   |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 13.5     |
| Running Update Time | 49       |
----------------------------------
--2024-08-11 07:42:49.033390 UTC---
| Itration            | 50        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -8.86e+05 |
| Running Env Steps   | 250000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 50        |
-----------------------------------
--2024-08-11 07:45:09.999225 UTC---
| Itration            | 51        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.03e+03  |
| Reward Loss         | -9.32e+05 |
| Running Env Steps   | 255000    |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 51        |
-----------------------------------
--2024-08-11 07:47:36.249613 UTC---
| Itration            | 52        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 260000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 52        |
-----------------------------------
--2024-08-11 07:49:59.820074 UTC---
| Itration            | 53        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 265000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 53        |
-----------------------------------
--2024-08-11 07:52:21.797217 UTC---
| Itration            | 54        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 270000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 54        |
-----------------------------------
--2024-08-11 07:54:48.796147 UTC---
| Itration            | 55        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 275000    |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 55        |
-----------------------------------
--2024-08-11 07:57:11.556728 UTC---
| Itration            | 56        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 280000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 56        |
-----------------------------------
--2024-08-11 07:59:32.277294 UTC---
| Itration            | 57        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 285000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 57        |
-----------------------------------
--2024-08-11 08:01:58.685744 UTC---
| Itration            | 58        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 290000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 58        |
-----------------------------------
--2024-08-11 08:04:20.992051 UTC---
| Itration            | 59        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 295000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.8      |
| Running Update Time | 59        |
-----------------------------------
--2024-08-11 08:06:42.017665 UTC---
| Itration            | 60        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -1.54e+06 |
| Running Env Steps   | 300000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 60        |
-----------------------------------
--2024-08-11 08:09:08.326972 UTC---
| Itration            | 61        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 305000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.2      |
| Running Update Time | 61        |
-----------------------------------
--2024-08-11 08:11:29.710932 UTC---
| Itration            | 62        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 310000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 62        |
-----------------------------------
--2024-08-11 08:13:53.115922 UTC---
| Itration            | 63        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -1.66e+06 |
| Running Env Steps   | 315000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 63        |
-----------------------------------
--2024-08-11 08:16:18.037819 UTC---
| Itration            | 64        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.63e+06 |
| Running Env Steps   | 320000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 64        |
-----------------------------------
--2024-08-11 08:18:37.866556 UTC---
| Itration            | 65        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 325000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 65        |
-----------------------------------
--2024-08-11 08:21:01.611704 UTC---
| Itration            | 66        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 330000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 37.9      |
| Running Update Time | 66        |
-----------------------------------
--2024-08-11 08:23:26.128680 UTC---
| Itration            | 67        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -1.96e+06 |
| Running Env Steps   | 335000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 67        |
-----------------------------------
--2024-08-11 08:25:46.087316 UTC---
| Itration            | 68        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -1.96e+06 |
| Running Env Steps   | 340000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 68        |
-----------------------------------
--2024-08-11 08:28:10.396932 UTC---
| Itration            | 69        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -1.97e+06 |
| Running Env Steps   | 345000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 16        |
| Running Update Time | 69        |
-----------------------------------
--2024-08-11 08:30:32.707663 UTC---
| Itration            | 70        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 350000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 70        |
-----------------------------------
--2024-08-11 08:32:51.428263 UTC---
| Itration            | 71        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 355000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.2      |
| Running Update Time | 71        |
-----------------------------------
--2024-08-11 08:35:17.453351 UTC---
| Itration            | 72        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -2.22e+06 |
| Running Env Steps   | 360000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 72        |
-----------------------------------
--2024-08-11 08:37:36.692486 UTC---
| Itration            | 73        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -2.19e+06 |
| Running Env Steps   | 365000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 73        |
-----------------------------------
--2024-08-11 08:39:55.449333 UTC--
| Itration            | 74       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.08e+03 |
| Reward Loss         | -2.4e+06 |
| Running Env Steps   | 370000   |
| Running Forward KL  | 20.9     |
| Running Reverse KL  | 12.5     |
| Running Update Time | 74       |
----------------------------------
--2024-08-11 08:42:18.941686 UTC---
| Itration            | 75        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -2.38e+06 |
| Running Env Steps   | 375000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 75        |
-----------------------------------
--2024-08-11 08:44:37.837580 UTC---
| Itration            | 76        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -2.42e+06 |
| Running Env Steps   | 380000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.4      |
| Running Update Time | 76        |
-----------------------------------
--2024-08-11 08:46:59.121270 UTC---
| Itration            | 77        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 385000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 77        |
-----------------------------------
--2024-08-11 08:49:22.554209 UTC---
| Itration            | 78        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -2.57e+06 |
| Running Env Steps   | 390000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 12.8      |
| Running Update Time | 78        |
-----------------------------------
--2024-08-11 08:51:41.620628 UTC---
| Itration            | 79        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -2.66e+06 |
| Running Env Steps   | 395000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13        |
| Running Update Time | 79        |
-----------------------------------
--2024-08-11 08:54:05.348057 UTC--
| Itration            | 80       |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.07e+03 |
| Reward Loss         | -2.7e+06 |
| Running Env Steps   | 400000   |
| Running Forward KL  | 20.9     |
| Running Reverse KL  | 13       |
| Running Update Time | 80       |
----------------------------------
--2024-08-11 08:56:29.014356 UTC---
| Itration            | 81        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -2.75e+06 |
| Running Env Steps   | 405000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 81        |
-----------------------------------
--2024-08-11 08:58:49.521239 UTC---
| Itration            | 82        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -2.85e+06 |
| Running Env Steps   | 410000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 82        |
-----------------------------------
--2024-08-11 09:01:14.730184 UTC---
| Itration            | 83        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -2.84e+06 |
| Running Env Steps   | 415000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.1      |
| Running Update Time | 83        |
-----------------------------------
--2024-08-11 09:03:38.406362 UTC---
| Itration            | 84        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -2.96e+06 |
| Running Env Steps   | 420000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.1      |
| Running Update Time | 84        |
-----------------------------------
--2024-08-11 09:06:00.159462 UTC---
| Itration            | 85        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -2.99e+06 |
| Running Env Steps   | 425000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 12.9      |
| Running Update Time | 85        |
-----------------------------------
--2024-08-11 09:08:28.948967 UTC---
| Itration            | 86        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -3.11e+06 |
| Running Env Steps   | 430000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 86        |
-----------------------------------
--2024-08-11 09:10:52.718121 UTC---
| Itration            | 87        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -3.09e+06 |
| Running Env Steps   | 435000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 87        |
-----------------------------------
--2024-08-11 09:13:14.416438 UTC---
| Itration            | 88        |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -3.12e+06 |
| Running Env Steps   | 440000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 13        |
| Running Update Time | 88        |
-----------------------------------
--2024-08-11 09:15:41.514769 UTC---
| Itration            | 89        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -3.23e+06 |
| Running Env Steps   | 445000    |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 89        |
-----------------------------------
--2024-08-11 09:18:04.997569 UTC---
| Itration            | 90        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -3.23e+06 |
| Running Env Steps   | 450000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 19.6      |
| Running Update Time | 90        |
-----------------------------------
--2024-08-11 09:20:26.883866 UTC---
| Itration            | 91        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -3.35e+06 |
| Running Env Steps   | 455000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 91        |
-----------------------------------
--2024-08-11 09:22:53.960382 UTC---
| Itration            | 92        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 460000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.1      |
| Running Update Time | 92        |
-----------------------------------
--2024-08-11 09:25:15.338091 UTC---
| Itration            | 93        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -3.36e+06 |
| Running Env Steps   | 465000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 93        |
-----------------------------------
--2024-08-11 09:27:37.171872 UTC---
| Itration            | 94        |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -3.39e+06 |
| Running Env Steps   | 470000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 94        |
-----------------------------------
--2024-08-11 09:30:01.103154 UTC---
| Itration            | 95        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -3.53e+06 |
| Running Env Steps   | 475000    |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 95        |
-----------------------------------
--2024-08-11 09:32:21.920082 UTC---
| Itration            | 96        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -3.57e+06 |
| Running Env Steps   | 480000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 96        |
-----------------------------------
--2024-08-11 09:34:43.696224 UTC---
| Itration            | 97        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -3.67e+06 |
| Running Env Steps   | 485000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 97        |
-----------------------------------
--2024-08-11 09:37:07.850428 UTC---
| Itration            | 98        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -3.63e+06 |
| Running Env Steps   | 490000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 25.9      |
| Running Update Time | 98        |
-----------------------------------
--2024-08-11 09:39:28.563456 UTC---
| Itration            | 99        |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -3.74e+06 |
| Running Env Steps   | 495000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 99        |
-----------------------------------
--2024-08-11 09:41:54.116147 UTC---
| Itration            | 100       |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 500000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 100       |
-----------------------------------
--2024-08-11 09:44:20.246538 UTC---
| Itration            | 101       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 505000    |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 31.8      |
| Running Update Time | 101       |
-----------------------------------
--2024-08-11 09:46:41.682684 UTC---
| Itration            | 102       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -3.88e+06 |
| Running Env Steps   | 510000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 102       |
-----------------------------------
--2024-08-11 09:49:09.544448 UTC---
| Itration            | 103       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.03e+06 |
| Running Env Steps   | 515000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 37.7      |
| Running Update Time | 103       |
-----------------------------------
--2024-08-11 09:51:35.959699 UTC---
| Itration            | 104       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.01e+06 |
| Running Env Steps   | 520000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 104       |
-----------------------------------
--2024-08-11 09:53:57.167363 UTC---
| Itration            | 105       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 525000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 105       |
-----------------------------------
--2024-08-11 09:56:22.724601 UTC---
| Itration            | 106       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 530000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 106       |
-----------------------------------
--2024-08-11 09:58:44.778505 UTC---
| Itration            | 107       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -3.83e+06 |
| Running Env Steps   | 535000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 48.3      |
| Running Update Time | 107       |
-----------------------------------
--2024-08-11 10:01:07.311027 UTC---
| Itration            | 108       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -4.18e+06 |
| Running Env Steps   | 540000    |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 108       |
-----------------------------------
--2024-08-11 10:03:32.622544 UTC---
| Itration            | 109       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.27e+06 |
| Running Env Steps   | 545000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 109       |
-----------------------------------
--2024-08-11 10:05:55.135736 UTC---
| Itration            | 110       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -4.21e+06 |
| Running Env Steps   | 550000    |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 12        |
| Running Update Time | 110       |
-----------------------------------
--2024-08-11 10:08:17.203898 UTC---
| Itration            | 111       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 555000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 111       |
-----------------------------------
--2024-08-11 10:10:42.305959 UTC---
| Itration            | 112       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -4.44e+06 |
| Running Env Steps   | 560000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 112       |
-----------------------------------
--2024-08-11 10:13:01.841517 UTC---
| Itration            | 113       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 565000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 26.1      |
| Running Update Time | 113       |
-----------------------------------
--2024-08-11 10:15:22.552806 UTC---
| Itration            | 114       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.26e+06 |
| Running Env Steps   | 570000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 29        |
| Running Update Time | 114       |
-----------------------------------
--2024-08-11 10:17:45.438129 UTC---
| Itration            | 115       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 575000    |
| Running Forward KL  | 20        |
| Running Reverse KL  | 12.2      |
| Running Update Time | 115       |
-----------------------------------
--2024-08-11 10:20:05.801374 UTC---
| Itration            | 116       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.49e+06 |
| Running Env Steps   | 580000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 49.9      |
| Running Update Time | 116       |
-----------------------------------
--2024-08-11 10:22:25.697522 UTC---
| Itration            | 117       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 585000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 66        |
| Running Update Time | 117       |
-----------------------------------
--2024-08-11 10:24:44.873448 UTC---
| Itration            | 118       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 590000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 110       |
| Running Update Time | 118       |
-----------------------------------
--2024-08-11 10:27:04.747196 UTC---
| Itration            | 119       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -4.36e+06 |
| Running Env Steps   | 595000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 14.6      |
| Running Update Time | 119       |
-----------------------------------
--2024-08-11 10:29:24.005441 UTC---
| Itration            | 120       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -4.15e+06 |
| Running Env Steps   | 600000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 67.2      |
| Running Update Time | 120       |
-----------------------------------
--2024-08-11 10:31:45.817449 UTC--
| Itration            | 121      |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.16e+03 |
| Reward Loss         | -4.5e+06 |
| Running Env Steps   | 605000   |
| Running Forward KL  | 19.6     |
| Running Reverse KL  | 67.4     |
| Running Update Time | 121      |
----------------------------------
--2024-08-11 10:34:03.312789 UTC---
| Itration            | 122       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 610000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 94.1      |
| Running Update Time | 122       |
-----------------------------------
--2024-08-11 10:36:25.247529 UTC---
| Itration            | 123       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -4.51e+06 |
| Running Env Steps   | 615000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 68.3      |
| Running Update Time | 123       |
-----------------------------------
--2024-08-11 10:38:40.518071 UTC---
| Itration            | 124       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 956       |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 620000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 106       |
| Running Update Time | 124       |
-----------------------------------
--2024-08-11 10:41:00.858798 UTC---
| Itration            | 125       |
| Real Det Return     | 1.09e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.46e+06 |
| Running Env Steps   | 625000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 52.7      |
| Running Update Time | 125       |
-----------------------------------
--2024-08-11 10:43:18.581723 UTC---
| Itration            | 126       |
| Real Det Return     | 1.1e+03   |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 630000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 126       |
-----------------------------------
--2024-08-11 10:45:39.484904 UTC---
| Itration            | 127       |
| Real Det Return     | 1.17e+03  |
| Real Sto Return     | 1.21e+03  |
| Reward Loss         | -4.18e+06 |
| Running Env Steps   | 635000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 57.8      |
| Running Update Time | 127       |
-----------------------------------
--2024-08-11 10:48:01.126867 UTC---
| Itration            | 128       |
| Real Det Return     | 1.08e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -4.41e+06 |
| Running Env Steps   | 640000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 54.6      |
| Running Update Time | 128       |
-----------------------------------
--2024-08-11 10:50:20.148043 UTC--
| Itration            | 129      |
| Real Det Return     | 1.12e+03 |
| Real Sto Return     | 1.07e+03 |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 645000   |
| Running Forward KL  | 19.2     |
| Running Reverse KL  | 96.4     |
| Running Update Time | 129      |
----------------------------------
--2024-08-11 10:52:39.538743 UTC---
| Itration            | 130       |
| Real Det Return     | 1.24e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 650000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 54.5      |
| Running Update Time | 130       |
-----------------------------------
--2024-08-11 10:54:46.031490 UTC---
| Itration            | 131       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 803       |
| Reward Loss         | -4.31e+06 |
| Running Env Steps   | 655000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 123       |
| Running Update Time | 131       |
-----------------------------------
--2024-08-11 10:57:02.999824 UTC--
| Itration            | 132      |
| Real Det Return     | 1.14e+03 |
| Real Sto Return     | 988      |
| Reward Loss         | -4.8e+06 |
| Running Env Steps   | 660000   |
| Running Forward KL  | 19.3     |
| Running Reverse KL  | 71.4     |
| Running Update Time | 132      |
----------------------------------
--2024-08-11 10:59:23.543807 UTC---
| Itration            | 133       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.44e+06 |
| Running Env Steps   | 665000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 83.2      |
| Running Update Time | 133       |
-----------------------------------
--2024-08-11 11:01:38.993008 UTC---
| Itration            | 134       |
| Real Det Return     | 1.13e+03  |
| Real Sto Return     | 948       |
| Reward Loss         | -4.56e+06 |
| Running Env Steps   | 670000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 135       |
| Running Update Time | 134       |
-----------------------------------
--2024-08-11 11:03:54.818500 UTC---
| Itration            | 135       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -4.31e+06 |
| Running Env Steps   | 675000    |
| Running Forward KL  | 18.6      |
| Running Reverse KL  | 114       |
| Running Update Time | 135       |
-----------------------------------
--2024-08-11 11:06:13.119520 UTC---
| Itration            | 136       |
| Real Det Return     | 1.1e+03   |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -4.54e+06 |
| Running Env Steps   | 680000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 125       |
| Running Update Time | 136       |
-----------------------------------
--2024-08-11 11:08:36.268469 UTC---
| Itration            | 137       |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -4.72e+06 |
| Running Env Steps   | 685000    |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 62.7      |
| Running Update Time | 137       |
-----------------------------------
--2024-08-11 11:10:57.419505 UTC---
| Itration            | 138       |
| Real Det Return     | 1.17e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -4.62e+06 |
| Running Env Steps   | 690000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 60.7      |
| Running Update Time | 138       |
-----------------------------------
--2024-08-11 11:13:14.976830 UTC---
| Itration            | 139       |
| Real Det Return     | 1.19e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 695000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 93.1      |
| Running Update Time | 139       |
-----------------------------------
--2024-08-11 11:15:30.579716 UTC---
| Itration            | 140       |
| Real Det Return     | 1.25e+03  |
| Real Sto Return     | 1.05e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 700000    |
| Running Forward KL  | 18.6      |
| Running Reverse KL  | 79.4      |
| Running Update Time | 140       |
-----------------------------------
--2024-08-11 11:17:54.124046 UTC---
| Itration            | 141       |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.33e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 705000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 141       |
-----------------------------------
--2024-08-11 11:20:11.039789 UTC---
| Itration            | 142       |
| Real Det Return     | 1.45e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 710000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 69.2      |
| Running Update Time | 142       |
-----------------------------------
--2024-08-11 11:22:34.852994 UTC---
| Itration            | 143       |
| Real Det Return     | 1.16e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 715000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 143       |
-----------------------------------
--2024-08-11 11:24:55.810410 UTC---
| Itration            | 144       |
| Real Det Return     | 1.44e+03  |
| Real Sto Return     | 1.29e+03  |
| Reward Loss         | -4.94e+06 |
| Running Env Steps   | 720000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 85.3      |
| Running Update Time | 144       |
-----------------------------------
--2024-08-11 11:27:17.796936 UTC---
| Itration            | 145       |
| Real Det Return     | 1.38e+03  |
| Real Sto Return     | 1.27e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 725000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 39.4      |
| Running Update Time | 145       |
-----------------------------------
--2024-08-11 11:29:36.688246 UTC---
| Itration            | 146       |
| Real Det Return     | 1.09e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -5.05e+06 |
| Running Env Steps   | 730000    |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 98.4      |
| Running Update Time | 146       |
-----------------------------------
--2024-08-11 11:31:57.382033 UTC---
| Itration            | 147       |
| Real Det Return     | 1.2e+03   |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -4.82e+06 |
| Running Env Steps   | 735000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 147       |
-----------------------------------
--2024-08-11 11:34:15.816816 UTC---
| Itration            | 148       |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -4.92e+06 |
| Running Env Steps   | 740000    |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 25.9      |
| Running Update Time | 148       |
-----------------------------------
--2024-08-11 11:36:38.853674 UTC---
| Itration            | 149       |
| Real Det Return     | 1.28e+03  |
| Real Sto Return     | 1.29e+03  |
| Reward Loss         | -4.87e+06 |
| Running Env Steps   | 745000    |
| Running Forward KL  | 18.3      |
| Running Reverse KL  | 70.7      |
| Running Update Time | 149       |
-----------------------------------
--2024-08-11 11:38:59.727054 UTC---
| Itration            | 150       |
| Real Det Return     | 1.51e+03  |
| Real Sto Return     | 1.41e+03  |
| Reward Loss         | -4.87e+06 |
| Running Env Steps   | 750000    |
| Running Forward KL  | 18.5      |
| Running Reverse KL  | 45.8      |
| Running Update Time | 150       |
-----------------------------------
--2024-08-11 11:41:22.612423 UTC---
| Itration            | 151       |
| Real Det Return     | 1.31e+03  |
| Real Sto Return     | 1.34e+03  |
| Reward Loss         | -5.08e+06 |
| Running Env Steps   | 755000    |
| Running Forward KL  | 18.6      |
| Running Reverse KL  | 125       |
| Running Update Time | 151       |
-----------------------------------
--2024-08-11 11:43:45.709887 UTC---
| Itration            | 152       |
| Real Det Return     | 1.21e+03  |
| Real Sto Return     | 1.26e+03  |
| Reward Loss         | -5.41e+06 |
| Running Env Steps   | 760000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 53.8      |
| Running Update Time | 152       |
-----------------------------------
--2024-08-11 11:46:06.661160 UTC---
| Itration            | 153       |
| Real Det Return     | 1.38e+03  |
| Real Sto Return     | 1.29e+03  |
| Reward Loss         | -4.68e+06 |
| Running Env Steps   | 765000    |
| Running Forward KL  | 18.3      |
| Running Reverse KL  | 83.3      |
| Running Update Time | 153       |
-----------------------------------
--2024-08-11 11:48:29.988897 UTC---
| Itration            | 154       |
| Real Det Return     | 1.4e+03   |
| Real Sto Return     | 1.28e+03  |
| Reward Loss         | -5.18e+06 |
| Running Env Steps   | 770000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 45.8      |
| Running Update Time | 154       |
-----------------------------------
--2024-08-11 11:50:49.944783 UTC---
| Itration            | 155       |
| Real Det Return     | 1.24e+03  |
| Real Sto Return     | 1.2e+03   |
| Reward Loss         | -6.11e+06 |
| Running Env Steps   | 775000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 78.1      |
| Running Update Time | 155       |
-----------------------------------
--2024-08-11 11:53:13.388515 UTC---
| Itration            | 156       |
| Real Det Return     | 1.31e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -5.48e+06 |
| Running Env Steps   | 780000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 156       |
-----------------------------------
--2024-08-11 11:55:35.201805 UTC---
| Itration            | 157       |
| Real Det Return     | 1.33e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -5.14e+06 |
| Running Env Steps   | 785000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 157       |
-----------------------------------
--2024-08-11 11:57:56.863946 UTC--
| Itration            | 158      |
| Real Det Return     | 1.4e+03  |
| Real Sto Return     | 1.32e+03 |
| Reward Loss         | -5.1e+06 |
| Running Env Steps   | 790000   |
| Running Forward KL  | 18.2     |
| Running Reverse KL  | 73       |
| Running Update Time | 158      |
----------------------------------
--2024-08-11 12:00:20.649127 UTC--
| Itration            | 159      |
| Real Det Return     | 1.37e+03 |
| Real Sto Return     | 1.3e+03  |
| Reward Loss         | -5.5e+06 |
| Running Env Steps   | 795000   |
| Running Forward KL  | 19       |
| Running Reverse KL  | 12.3     |
| Running Update Time | 159      |
----------------------------------
--2024-08-11 12:02:42.637348 UTC---
| Itration            | 160       |
| Real Det Return     | 1.31e+03  |
| Real Sto Return     | 1.33e+03  |
| Reward Loss         | -5.31e+06 |
| Running Env Steps   | 800000    |
| Running Forward KL  | 18.3      |
| Running Reverse KL  | 44.5      |
| Running Update Time | 160       |
-----------------------------------
--2024-08-11 12:05:03.776225 UTC---
| Itration            | 161       |
| Real Det Return     | 1.3e+03   |
| Real Sto Return     | 1.28e+03  |
| Reward Loss         | -6.04e+06 |
| Running Env Steps   | 805000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 78.4      |
| Running Update Time | 161       |
-----------------------------------
--2024-08-11 12:07:23.101767 UTC---
| Itration            | 162       |
| Real Det Return     | 1.29e+03  |
| Real Sto Return     | 1.29e+03  |
| Reward Loss         | -5.11e+06 |
| Running Env Steps   | 810000    |
| Running Forward KL  | 18.5      |
| Running Reverse KL  | 90.3      |
| Running Update Time | 162       |
-----------------------------------
--2024-08-11 12:09:42.765296 UTC---
| Itration            | 163       |
| Real Det Return     | 1.28e+03  |
| Real Sto Return     | 1.25e+03  |
| Reward Loss         | -5.16e+06 |
| Running Env Steps   | 815000    |
| Running Forward KL  | 18.6      |
| Running Reverse KL  | 67.7      |
| Running Update Time | 163       |
-----------------------------------
--2024-08-11 12:12:02.239744 UTC--
| Itration            | 164      |
| Real Det Return     | 1.32e+03 |
| Real Sto Return     | 1.34e+03 |
| Reward Loss         | -5.8e+06 |
| Running Env Steps   | 820000   |
| Running Forward KL  | 19.1     |
| Running Reverse KL  | 11.9     |
| Running Update Time | 164      |
----------------------------------
--2024-08-11 12:14:25.663060 UTC---
| Itration            | 165       |
| Real Det Return     | 1.46e+03  |
| Real Sto Return     | 1.37e+03  |
| Reward Loss         | -5.85e+06 |
| Running Env Steps   | 825000    |
| Running Forward KL  | 18.9      |
| Running Reverse KL  | 39.3      |
| Running Update Time | 165       |
-----------------------------------
--2024-08-11 12:16:42.217640 UTC---
| Itration            | 166       |
| Real Det Return     | 1.32e+03  |
| Real Sto Return     | 1.2e+03   |
| Reward Loss         | -5.41e+06 |
| Running Env Steps   | 830000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 65.4      |
| Running Update Time | 166       |
-----------------------------------
--2024-08-11 12:19:02.718742 UTC--
| Itration            | 167      |
| Real Det Return     | 1.34e+03 |
| Real Sto Return     | 1.36e+03 |
| Reward Loss         | -6.1e+06 |
| Running Env Steps   | 835000   |
| Running Forward KL  | 18.8     |
| Running Reverse KL  | 11.4     |
| Running Update Time | 167      |
----------------------------------
--2024-08-11 12:21:27.204372 UTC---
| Itration            | 168       |
| Real Det Return     | 1.32e+03  |
| Real Sto Return     | 1.37e+03  |
| Reward Loss         | -5.81e+06 |
| Running Env Steps   | 840000    |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 36.1      |
| Running Update Time | 168       |
-----------------------------------
--2024-08-11 12:23:36.361588 UTC---
| Itration            | 169       |
| Real Det Return     | 1.45e+03  |
| Real Sto Return     | 1.01e+03  |
| Reward Loss         | -4.88e+06 |
| Running Env Steps   | 845000    |
| Running Forward KL  | 17.8      |
| Running Reverse KL  | 127       |
| Running Update Time | 169       |
-----------------------------------
--2024-08-11 12:25:59.602591 UTC---
| Itration            | 170       |
| Real Det Return     | 1.42e+03  |
| Real Sto Return     | 1.6e+03   |
| Reward Loss         | -5.63e+06 |
| Running Env Steps   | 850000    |
| Running Forward KL  | 17.8      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 170       |
-----------------------------------
--2024-08-11 12:28:19.619563 UTC---
| Itration            | 171       |
| Real Det Return     | 1.46e+03  |
| Real Sto Return     | 1.41e+03  |
| Reward Loss         | -5.93e+06 |
| Running Env Steps   | 855000    |
| Running Forward KL  | 17.9      |
| Running Reverse KL  | 10.6      |
| Running Update Time | 171       |
-----------------------------------
--2024-08-11 12:30:40.032174 UTC---
| Itration            | 172       |
| Real Det Return     | 1.48e+03  |
| Real Sto Return     | 1.51e+03  |
| Reward Loss         | -5.96e+06 |
| Running Env Steps   | 860000    |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 11        |
| Running Update Time | 172       |
-----------------------------------
--2024-08-11 12:33:00.709657 UTC---
| Itration            | 173       |
| Real Det Return     | 1.49e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -6.12e+06 |
| Running Env Steps   | 865000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 59.7      |
| Running Update Time | 173       |
-----------------------------------
--2024-08-11 12:35:22.806188 UTC---
| Itration            | 174       |
| Real Det Return     | 1.46e+03  |
| Real Sto Return     | 1.44e+03  |
| Reward Loss         | -5.92e+06 |
| Running Env Steps   | 870000    |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 11.2      |
| Running Update Time | 174       |
-----------------------------------
--2024-08-11 12:37:36.910582 UTC---
| Itration            | 175       |
| Real Det Return     | 1.48e+03  |
| Real Sto Return     | 1.26e+03  |
| Reward Loss         | -4.93e+06 |
| Running Env Steps   | 875000    |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 104       |
| Running Update Time | 175       |
-----------------------------------
--2024-08-11 12:39:56.431205 UTC---
| Itration            | 176       |
| Real Det Return     | 1.87e+03  |
| Real Sto Return     | 1.32e+03  |
| Reward Loss         | -5.52e+06 |
| Running Env Steps   | 880000    |
| Running Forward KL  | 18.1      |
| Running Reverse KL  | 191       |
| Running Update Time | 176       |
-----------------------------------
--2024-08-11 12:42:13.940945 UTC---
| Itration            | 177       |
| Real Det Return     | 1.83e+03  |
| Real Sto Return     | 1.53e+03  |
| Reward Loss         | -5.26e+06 |
| Running Env Steps   | 885000    |
| Running Forward KL  | 17.6      |
| Running Reverse KL  | 113       |
| Running Update Time | 177       |
-----------------------------------
--2024-08-11 12:44:27.017910 UTC---
| Itration            | 178       |
| Real Det Return     | 1.82e+03  |
| Real Sto Return     | 1.46e+03  |
| Reward Loss         | -5.58e+06 |
| Running Env Steps   | 890000    |
| Running Forward KL  | 17.5      |
| Running Reverse KL  | 93.4      |
| Running Update Time | 178       |
-----------------------------------
--2024-08-11 12:46:46.869650 UTC---
| Itration            | 179       |
| Real Det Return     | 1.98e+03  |
| Real Sto Return     | 1.49e+03  |
| Reward Loss         | -5.04e+06 |
| Running Env Steps   | 895000    |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 134       |
| Running Update Time | 179       |
-----------------------------------
--2024-08-11 12:49:05.189248 UTC---
| Itration            | 180       |
| Real Det Return     | 1.81e+03  |
| Real Sto Return     | 1.63e+03  |
| Reward Loss         | -5.61e+06 |
| Running Env Steps   | 900000    |
| Running Forward KL  | 17.7      |
| Running Reverse KL  | 109       |
| Running Update Time | 180       |
-----------------------------------
--2024-08-11 12:51:24.345615 UTC---
| Itration            | 181       |
| Real Det Return     | 1.89e+03  |
| Real Sto Return     | 1.69e+03  |
| Reward Loss         | -5.79e+06 |
| Running Env Steps   | 905000    |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 52.5      |
| Running Update Time | 181       |
-----------------------------------
--2024-08-11 12:53:46.507996 UTC---
| Itration            | 182       |
| Real Det Return     | 1.76e+03  |
| Real Sto Return     | 1.67e+03  |
| Reward Loss         | -5.75e+06 |
| Running Env Steps   | 910000    |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 39.2      |
| Running Update Time | 182       |
-----------------------------------
--2024-08-11 12:56:06.306393 UTC---
| Itration            | 183       |
| Real Det Return     | 2.17e+03  |
| Real Sto Return     | 1.87e+03  |
| Reward Loss         | -5.34e+06 |
| Running Env Steps   | 915000    |
| Running Forward KL  | 16.7      |
| Running Reverse KL  | 61.1      |
| Running Update Time | 183       |
-----------------------------------
--2024-08-11 12:58:23.998654 UTC---
| Itration            | 184       |
| Real Det Return     | 2.13e+03  |
| Real Sto Return     | 1.7e+03   |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 920000    |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 174       |
| Running Update Time | 184       |
-----------------------------------
--2024-08-11 13:00:46.707072 UTC---
| Itration            | 185       |
| Real Det Return     | 2e+03     |
| Real Sto Return     | 1.83e+03  |
| Reward Loss         | -5.63e+06 |
| Running Env Steps   | 925000    |
| Running Forward KL  | 17.1      |
| Running Reverse KL  | 21.7      |
| Running Update Time | 185       |
-----------------------------------
--2024-08-11 13:03:06.752933 UTC---
| Itration            | 186       |
| Real Det Return     | 2.15e+03  |
| Real Sto Return     | 1.88e+03  |
| Reward Loss         | -6.08e+06 |
| Running Env Steps   | 930000    |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 41        |
| Running Update Time | 186       |
-----------------------------------
--2024-08-11 13:05:22.153049 UTC---
| Itration            | 187       |
| Real Det Return     | 2.14e+03  |
| Real Sto Return     | 1.39e+03  |
| Reward Loss         | -5.41e+06 |
| Running Env Steps   | 935000    |
| Running Forward KL  | 16.5      |
| Running Reverse KL  | 101       |
| Running Update Time | 187       |
-----------------------------------
--2024-08-11 13:07:40.321577 UTC---
| Itration            | 188       |
| Real Det Return     | 2.39e+03  |
| Real Sto Return     | 1.69e+03  |
| Reward Loss         | -5.45e+06 |
| Running Env Steps   | 940000    |
| Running Forward KL  | 16.7      |
| Running Reverse KL  | 61.6      |
| Running Update Time | 188       |
-----------------------------------
--2024-08-11 13:10:00.604029 UTC--
| Itration            | 189      |
| Real Det Return     | 2.34e+03 |
| Real Sto Return     | 2.1e+03  |
| Reward Loss         | -5.3e+06 |
| Running Env Steps   | 945000   |
| Running Forward KL  | 16.5     |
| Running Reverse KL  | 50.2     |
| Running Update Time | 189      |
----------------------------------
--2024-08-11 13:12:26.700203 UTC---
| Itration            | 190       |
| Real Det Return     | 2.38e+03  |
| Real Sto Return     | 2.04e+03  |
| Reward Loss         | -5.69e+06 |
| Running Env Steps   | 950000    |
| Running Forward KL  | 16.6      |
| Running Reverse KL  | 22.5      |
| Running Update Time | 190       |
-----------------------------------
--2024-08-11 13:14:47.256844 UTC---
| Itration            | 191       |
| Real Det Return     | 2.28e+03  |
| Real Sto Return     | 1.81e+03  |
| Reward Loss         | -5.78e+06 |
| Running Env Steps   | 955000    |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 45.5      |
| Running Update Time | 191       |
-----------------------------------
--2024-08-11 13:17:07.849355 UTC---
| Itration            | 192       |
| Real Det Return     | 2.39e+03  |
| Real Sto Return     | 2.2e+03   |
| Reward Loss         | -5.24e+06 |
| Running Env Steps   | 960000    |
| Running Forward KL  | 16.1      |
| Running Reverse KL  | 40        |
| Running Update Time | 192       |
-----------------------------------
--2024-08-11 13:19:34.228090 UTC---
| Itration            | 193       |
| Real Det Return     | 2.29e+03  |
| Real Sto Return     | 2.17e+03  |
| Reward Loss         | -5.39e+06 |
| Running Env Steps   | 965000    |
| Running Forward KL  | 16.3      |
| Running Reverse KL  | 27.9      |
| Running Update Time | 193       |
-----------------------------------
--2024-08-11 13:21:56.733438 UTC---
| Itration            | 194       |
| Real Det Return     | 2.07e+03  |
| Real Sto Return     | 2.03e+03  |
| Reward Loss         | -5.81e+06 |
| Running Env Steps   | 970000    |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 24.9      |
| Running Update Time | 194       |
-----------------------------------
--2024-08-11 13:24:16.115818 UTC---
| Itration            | 195       |
| Real Det Return     | 2.06e+03  |
| Real Sto Return     | 1.91e+03  |
| Reward Loss         | -5.63e+06 |
| Running Env Steps   | 975000    |
| Running Forward KL  | 17.2      |
| Running Reverse KL  | 43.9      |
| Running Update Time | 195       |
-----------------------------------
--2024-08-11 13:26:40.197333 UTC--
| Itration            | 196      |
| Real Det Return     | 2.2e+03  |
| Real Sto Return     | 1.94e+03 |
| Reward Loss         | -5.5e+06 |
| Running Env Steps   | 980000   |
| Running Forward KL  | 16.5     |
| Running Reverse KL  | 66.9     |
| Running Update Time | 196      |
----------------------------------
--2024-08-11 13:28:44.995561 UTC---
| Itration            | 197       |
| Real Det Return     | 2.2e+03   |
| Real Sto Return     | 1.9e+03   |
| Reward Loss         | -5.55e+06 |
| Running Env Steps   | 985000    |
| Running Forward KL  | 16.6      |
| Running Reverse KL  | 44        |
| Running Update Time | 197       |
-----------------------------------
--2024-08-11 13:30:40.106409 UTC---
| Itration            | 198       |
| Real Det Return     | 1.91e+03  |
| Real Sto Return     | 1.98e+03  |
| Reward Loss         | -6.15e+06 |
| Running Env Steps   | 990000    |
| Running Forward KL  | 16.6      |
| Running Reverse KL  | 8.89      |
| Running Update Time | 198       |
-----------------------------------
--2024-08-11 13:32:33.375226 UTC---
| Itration            | 199       |
| Real Det Return     | 2.35e+03  |
| Real Sto Return     | 1.65e+03  |
| Reward Loss         | -4.72e+06 |
| Running Env Steps   | 995000    |
| Running Forward KL  | 16.6      |
| Running Reverse KL  | 121       |
| Running Update Time | 199       |
-----------------------------------
--2024-08-11 13:34:27.976071 UTC---
| Itration            | 200       |
| Real Det Return     | 2.42e+03  |
| Real Sto Return     | 2.57e+03  |
| Reward Loss         | -5.02e+06 |
| Running Env Steps   | 1000000   |
| Running Forward KL  | 15.3      |
| Running Reverse KL  | 10.2      |
| Running Update Time | 200       |
-----------------------------------
--2024-08-11 13:36:24.682263 UTC---
| Itration            | 201       |
| Real Det Return     | 2.63e+03  |
| Real Sto Return     | 2.56e+03  |
| Reward Loss         | -4.95e+06 |
| Running Env Steps   | 1005000   |
| Running Forward KL  | 15.1      |
| Running Reverse KL  | 21.2      |
| Running Update Time | 201       |
-----------------------------------
--2024-08-11 13:37:57.225346 UTC---
| Itration            | 202       |
| Real Det Return     | 985       |
| Real Sto Return     | 1.5e+03   |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 1010000   |
| Running Forward KL  | 15.8      |
| Running Reverse KL  | 177       |
| Running Update Time | 202       |
-----------------------------------
--2024-08-11 13:39:48.985570 UTC---
| Itration            | 203       |
| Real Det Return     | 2.48e+03  |
| Real Sto Return     | 2.57e+03  |
| Reward Loss         | -4.78e+06 |
| Running Env Steps   | 1015000   |
| Running Forward KL  | 15.1      |
| Running Reverse KL  | 142       |
| Running Update Time | 203       |
-----------------------------------
--2024-08-11 13:41:47.186282 UTC---
| Itration            | 204       |
| Real Det Return     | 2.45e+03  |
| Real Sto Return     | 2.31e+03  |
| Reward Loss         | -5.03e+06 |
| Running Env Steps   | 1020000   |
| Running Forward KL  | 15.9      |
| Running Reverse KL  | 37.7      |
| Running Update Time | 204       |
-----------------------------------
--2024-08-11 13:43:42.228958 UTC---
| Itration            | 205       |
| Real Det Return     | 2.92e+03  |
| Real Sto Return     | 2.6e+03   |
| Reward Loss         | -5.37e+06 |
| Running Env Steps   | 1025000   |
| Running Forward KL  | 15        |
| Running Reverse KL  | 8.96      |
| Running Update Time | 205       |
-----------------------------------
--2024-08-11 13:45:34.854822 UTC---
| Itration            | 206       |
| Real Det Return     | 3.03e+03  |
| Real Sto Return     | 2.59e+03  |
| Reward Loss         | -4.76e+06 |
| Running Env Steps   | 1030000   |
| Running Forward KL  | 14.3      |
| Running Reverse KL  | 31.6      |
| Running Update Time | 206       |
-----------------------------------
--2024-08-11 13:47:32.700349 UTC---
| Itration            | 207       |
| Real Det Return     | 2.94e+03  |
| Real Sto Return     | 2.79e+03  |
| Reward Loss         | -4.18e+06 |
| Running Env Steps   | 1035000   |
| Running Forward KL  | 14.2      |
| Running Reverse KL  | 60.8      |
| Running Update Time | 207       |
-----------------------------------
--2024-08-11 13:49:27.688082 UTC---
| Itration            | 208       |
| Real Det Return     | 3.15e+03  |
| Real Sto Return     | 3.01e+03  |
| Reward Loss         | -4.28e+06 |
| Running Env Steps   | 1040000   |
| Running Forward KL  | 14.5      |
| Running Reverse KL  | 19.2      |
| Running Update Time | 208       |
-----------------------------------
--2024-08-11 13:51:24.815243 UTC---
| Itration            | 209       |
| Real Det Return     | 3.11e+03  |
| Real Sto Return     | 3.01e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 1045000   |
| Running Forward KL  | 13.8      |
| Running Reverse KL  | 22        |
| Running Update Time | 209       |
-----------------------------------
--2024-08-11 13:53:21.541174 UTC---
| Itration            | 210       |
| Real Det Return     | 2.74e+03  |
| Real Sto Return     | 2.63e+03  |
| Reward Loss         | -4.75e+06 |
| Running Env Steps   | 1050000   |
| Running Forward KL  | 14.2      |
| Running Reverse KL  | 49.7      |
| Running Update Time | 210       |
-----------------------------------
--2024-08-11 13:55:09.046925 UTC---
| Itration            | 211       |
| Real Det Return     | 2.95e+03  |
| Real Sto Return     | 2.03e+03  |
| Reward Loss         | -4.52e+06 |
| Running Env Steps   | 1055000   |
| Running Forward KL  | 14.5      |
| Running Reverse KL  | 119       |
| Running Update Time | 211       |
-----------------------------------
--2024-08-11 13:57:07.218884 UTC---
| Itration            | 212       |
| Real Det Return     | 2.99e+03  |
| Real Sto Return     | 2.97e+03  |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 1060000   |
| Running Forward KL  | 13.5      |
| Running Reverse KL  | 8.69      |
| Running Update Time | 212       |
-----------------------------------
--2024-08-11 13:59:03.945083 UTC---
| Itration            | 213       |
| Real Det Return     | 3.36e+03  |
| Real Sto Return     | 3.06e+03  |
| Reward Loss         | -4.11e+06 |
| Running Env Steps   | 1065000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 26.2      |
| Running Update Time | 213       |
-----------------------------------
--2024-08-11 14:00:37.116492 UTC--
| Itration            | 214      |
| Real Det Return     | 994      |
| Real Sto Return     | 2.27e+03 |
| Reward Loss         | -3.4e+06 |
| Running Env Steps   | 1070000  |
| Running Forward KL  | 13.7     |
| Running Reverse KL  | 114      |
| Running Update Time | 214      |
----------------------------------
--2024-08-11 14:02:01.169439 UTC---
| Itration            | 215       |
| Real Det Return     | 602       |
| Real Sto Return     | 867       |
| Reward Loss         | -3.43e+06 |
| Running Env Steps   | 1075000   |
| Running Forward KL  | 15.7      |
| Running Reverse KL  | 267       |
| Running Update Time | 215       |
-----------------------------------
--2024-08-11 14:03:52.838543 UTC---
| Itration            | 216       |
| Real Det Return     | 3.42e+03  |
| Real Sto Return     | 2.79e+03  |
| Reward Loss         | -3.92e+06 |
| Running Env Steps   | 1080000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 90.4      |
| Running Update Time | 216       |
-----------------------------------
--2024-08-11 14:05:52.488366 UTC---
| Itration            | 217       |
| Real Det Return     | 3.51e+03  |
| Real Sto Return     | 3.37e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1085000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 8.51      |
| Running Update Time | 217       |
-----------------------------------
--2024-08-11 14:07:29.443438 UTC---
| Itration            | 218       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 2.34e+03  |
| Reward Loss         | -4.92e+06 |
| Running Env Steps   | 1090000   |
| Running Forward KL  | 14.7      |
| Running Reverse KL  | 95        |
| Running Update Time | 218       |
-----------------------------------
--2024-08-11 14:09:23.044696 UTC---
| Itration            | 219       |
| Real Det Return     | 3.3e+03   |
| Real Sto Return     | 2.8e+03   |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 1095000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.22      |
| Running Update Time | 219       |
-----------------------------------
--2024-08-11 14:11:21.543499 UTC--
| Itration            | 220      |
| Real Det Return     | 2.93e+03 |
| Real Sto Return     | 3.14e+03 |
| Reward Loss         | -4.6e+06 |
| Running Env Steps   | 1100000  |
| Running Forward KL  | 13.4     |
| Running Reverse KL  | 9.06     |
| Running Update Time | 220      |
----------------------------------
--2024-08-11 14:13:14.665634 UTC---
| Itration            | 221       |
| Real Det Return     | 3.26e+03  |
| Real Sto Return     | 3.06e+03  |
| Reward Loss         | -4.05e+06 |
| Running Env Steps   | 1105000   |
| Running Forward KL  | 13.2      |
| Running Reverse KL  | 65        |
| Running Update Time | 221       |
-----------------------------------
--2024-08-11 14:15:12.073409 UTC---
| Itration            | 222       |
| Real Det Return     | 3.12e+03  |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -4.54e+06 |
| Running Env Steps   | 1110000   |
| Running Forward KL  | 13        |
| Running Reverse KL  | 8.54      |
| Running Update Time | 222       |
-----------------------------------
--2024-08-11 14:17:09.277530 UTC---
| Itration            | 223       |
| Real Det Return     | 3.33e+03  |
| Real Sto Return     | 3.15e+03  |
| Reward Loss         | -4.49e+06 |
| Running Env Steps   | 1115000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 7.54      |
| Running Update Time | 223       |
-----------------------------------
--2024-08-11 14:19:04.971453 UTC---
| Itration            | 224       |
| Real Det Return     | 3.27e+03  |
| Real Sto Return     | 3.42e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1120000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 9.26      |
| Running Update Time | 224       |
-----------------------------------
--2024-08-11 14:21:01.425592 UTC--
| Itration            | 225      |
| Real Det Return     | 3.34e+03 |
| Real Sto Return     | 3.14e+03 |
| Reward Loss         | -4.5e+06 |
| Running Env Steps   | 1125000  |
| Running Forward KL  | 13       |
| Running Reverse KL  | 30.5     |
| Running Update Time | 225      |
----------------------------------
--2024-08-11 14:22:56.652929 UTC---
| Itration            | 226       |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 2.69e+03  |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1130000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 55.4      |
| Running Update Time | 226       |
-----------------------------------
--2024-08-11 14:24:52.411655 UTC---
| Itration            | 227       |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 3.16e+03  |
| Reward Loss         | -4.64e+06 |
| Running Env Steps   | 1135000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 19.2      |
| Running Update Time | 227       |
-----------------------------------
--2024-08-11 14:26:49.964961 UTC---
| Itration            | 228       |
| Real Det Return     | 3.23e+03  |
| Real Sto Return     | 3.14e+03  |
| Reward Loss         | -4.99e+06 |
| Running Env Steps   | 1140000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 20.6      |
| Running Update Time | 228       |
-----------------------------------
--2024-08-11 14:28:38.332895 UTC---
| Itration            | 229       |
| Real Det Return     | 2.56e+03  |
| Real Sto Return     | 3.08e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 1145000   |
| Running Forward KL  | 13.1      |
| Running Reverse KL  | 137       |
| Running Update Time | 229       |
-----------------------------------
--2024-08-11 14:30:34.867323 UTC--
| Itration            | 230      |
| Real Det Return     | 3.68e+03 |
| Real Sto Return     | 3.34e+03 |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 1150000  |
| Running Forward KL  | 12.3     |
| Running Reverse KL  | 8.25     |
| Running Update Time | 230      |
----------------------------------
--2024-08-11 14:32:32.422552 UTC---
| Itration            | 231       |
| Real Det Return     | 3.34e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -4.43e+06 |
| Running Env Steps   | 1155000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 7.93      |
| Running Update Time | 231       |
-----------------------------------
--2024-08-11 14:34:29.370529 UTC---
| Itration            | 232       |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 3.22e+03  |
| Reward Loss         | -4.79e+06 |
| Running Env Steps   | 1160000   |
| Running Forward KL  | 12.4      |
| Running Reverse KL  | 7.8       |
| Running Update Time | 232       |
-----------------------------------
--2024-08-11 14:36:25.457517 UTC---
| Itration            | 233       |
| Real Det Return     | 3.39e+03  |
| Real Sto Return     | 3.01e+03  |
| Reward Loss         | -4.84e+06 |
| Running Env Steps   | 1165000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 16.3      |
| Running Update Time | 233       |
-----------------------------------
--2024-08-11 14:38:22.800690 UTC---
| Itration            | 234       |
| Real Det Return     | 3.48e+03  |
| Real Sto Return     | 3.43e+03  |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 1170000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 7.26      |
| Running Update Time | 234       |
-----------------------------------
--2024-08-11 14:40:20.149229 UTC---
| Itration            | 235       |
| Real Det Return     | 3.68e+03  |
| Real Sto Return     | 3.63e+03  |
| Reward Loss         | -3.69e+06 |
| Running Env Steps   | 1175000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 7.25      |
| Running Update Time | 235       |
-----------------------------------
--2024-08-11 14:42:17.938336 UTC--
| Itration            | 236      |
| Real Det Return     | 3.41e+03 |
| Real Sto Return     | 3.3e+03  |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 1180000  |
| Running Forward KL  | 12.3     |
| Running Reverse KL  | 8.13     |
| Running Update Time | 236      |
----------------------------------
--2024-08-11 14:44:13.842267 UTC---
| Itration            | 237       |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 3.34e+03  |
| Reward Loss         | -3.86e+06 |
| Running Env Steps   | 1185000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 41.2      |
| Running Update Time | 237       |
-----------------------------------
--2024-08-11 14:46:12.114373 UTC---
| Itration            | 238       |
| Real Det Return     | 3.45e+03  |
| Real Sto Return     | 3.32e+03  |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 1190000   |
| Running Forward KL  | 11.8      |
| Running Reverse KL  | 7.45      |
| Running Update Time | 238       |
-----------------------------------
--2024-08-11 14:48:10.511399 UTC---
| Itration            | 239       |
| Real Det Return     | 3.33e+03  |
| Real Sto Return     | 3.29e+03  |
| Reward Loss         | -4.56e+06 |
| Running Env Steps   | 1195000   |
| Running Forward KL  | 12.5      |
| Running Reverse KL  | 8.33      |
| Running Update Time | 239       |
-----------------------------------
--2024-08-11 14:49:49.731808 UTC---
| Itration            | 240       |
| Real Det Return     | 1.14e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -3.63e+06 |
| Running Env Steps   | 1200000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 60.1      |
| Running Update Time | 240       |
-----------------------------------
--2024-08-11 14:51:45.821166 UTC---
| Itration            | 241       |
| Real Det Return     | 3.25e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 1205000   |
| Running Forward KL  | 12        |
| Running Reverse KL  | 31        |
| Running Update Time | 241       |
-----------------------------------
--2024-08-11 14:53:43.036639 UTC---
| Itration            | 242       |
| Real Det Return     | 3.68e+03  |
| Real Sto Return     | 3.69e+03  |
| Reward Loss         | -3.59e+06 |
| Running Env Steps   | 1210000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 7.39      |
| Running Update Time | 242       |
-----------------------------------
--2024-08-11 14:55:39.835653 UTC---
| Itration            | 243       |
| Real Det Return     | 3.78e+03  |
| Real Sto Return     | 3.55e+03  |
| Reward Loss         | -3.61e+06 |
| Running Env Steps   | 1215000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 36.9      |
| Running Update Time | 243       |
-----------------------------------
--2024-08-11 14:57:36.740798 UTC--
| Itration            | 244      |
| Real Det Return     | 3.82e+03 |
| Real Sto Return     | 3.59e+03 |
| Reward Loss         | -3.9e+06 |
| Running Env Steps   | 1220000  |
| Running Forward KL  | 11.7     |
| Running Reverse KL  | 25.1     |
| Running Update Time | 244      |
----------------------------------
--2024-08-11 14:59:21.704199 UTC--
| Itration            | 245      |
| Real Det Return     | 1.93e+03 |
| Real Sto Return     | 3.3e+03  |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 1225000  |
| Running Forward KL  | 12       |
| Running Reverse KL  | 20.4     |
| Running Update Time | 245      |
----------------------------------
--2024-08-11 15:01:18.499497 UTC---
| Itration            | 246       |
| Real Det Return     | 3.64e+03  |
| Real Sto Return     | 3.65e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 1230000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 6.89      |
| Running Update Time | 246       |
-----------------------------------
--2024-08-11 15:03:16.289798 UTC--
| Itration            | 247      |
| Real Det Return     | 3.75e+03 |
| Real Sto Return     | 3.79e+03 |
| Reward Loss         | -3.3e+06 |
| Running Env Steps   | 1235000  |
| Running Forward KL  | 11.2     |
| Running Reverse KL  | 7.59     |
| Running Update Time | 247      |
----------------------------------
--2024-08-11 15:05:10.839408 UTC---
| Itration            | 248       |
| Real Det Return     | 3.88e+03  |
| Real Sto Return     | 3.41e+03  |
| Reward Loss         | -4.04e+06 |
| Running Env Steps   | 1240000   |
| Running Forward KL  | 11.8      |
| Running Reverse KL  | 7.64      |
| Running Update Time | 248       |
-----------------------------------
--2024-08-11 15:07:08.893953 UTC---
| Itration            | 249       |
| Real Det Return     | 3.49e+03  |
| Real Sto Return     | 3.49e+03  |
| Reward Loss         | -4.48e+06 |
| Running Env Steps   | 1245000   |
| Running Forward KL  | 11.9      |
| Running Reverse KL  | 21.5      |
| Running Update Time | 249       |
-----------------------------------
--2024-08-11 15:09:06.403407 UTC---
| Itration            | 250       |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.58e+03  |
| Reward Loss         | -4.25e+06 |
| Running Env Steps   | 1250000   |
| Running Forward KL  | 12        |
| Running Reverse KL  | 8.37      |
| Running Update Time | 250       |
-----------------------------------
--2024-08-11 15:11:01.821238 UTC---
| Itration            | 251       |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 3.96e+03  |
| Reward Loss         | -3.03e+06 |
| Running Env Steps   | 1255000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 7.35      |
| Running Update Time | 251       |
-----------------------------------
--2024-08-11 15:13:00.406357 UTC---
| Itration            | 252       |
| Real Det Return     | 4.15e+03  |
| Real Sto Return     | 3.67e+03  |
| Reward Loss         | -3.38e+06 |
| Running Env Steps   | 1260000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 6.75      |
| Running Update Time | 252       |
-----------------------------------
--2024-08-11 15:14:56.889485 UTC---
| Itration            | 253       |
| Real Det Return     | 3.72e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 1265000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 7.81      |
| Running Update Time | 253       |
-----------------------------------
--2024-08-11 15:16:52.457895 UTC---
| Itration            | 254       |
| Real Det Return     | 3.94e+03  |
| Real Sto Return     | 3.86e+03  |
| Reward Loss         | -3.49e+06 |
| Running Env Steps   | 1270000   |
| Running Forward KL  | 10.6      |
| Running Reverse KL  | 7.08      |
| Running Update Time | 254       |
-----------------------------------
--2024-08-11 15:18:46.912994 UTC---
| Itration            | 255       |
| Real Det Return     | 4.06e+03  |
| Real Sto Return     | 3.23e+03  |
| Reward Loss         | -3.67e+06 |
| Running Env Steps   | 1275000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 62.6      |
| Running Update Time | 255       |
-----------------------------------
--2024-08-11 15:20:41.957779 UTC---
| Itration            | 256       |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 3.83e+03  |
| Reward Loss         | -3.14e+06 |
| Running Env Steps   | 1280000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 7.69      |
| Running Update Time | 256       |
-----------------------------------
--2024-08-11 15:22:37.108098 UTC---
| Itration            | 257       |
| Real Det Return     | 4.05e+03  |
| Real Sto Return     | 3.42e+03  |
| Reward Loss         | -4.48e+06 |
| Running Env Steps   | 1285000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 57.6      |
| Running Update Time | 257       |
-----------------------------------
--2024-08-11 15:24:36.030659 UTC---
| Itration            | 258       |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -3.46e+06 |
| Running Env Steps   | 1290000   |
| Running Forward KL  | 11.1      |
| Running Reverse KL  | 17.2      |
| Running Update Time | 258       |
-----------------------------------
--2024-08-11 15:26:30.956249 UTC---
| Itration            | 259       |
| Real Det Return     | 4.05e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -3.57e+06 |
| Running Env Steps   | 1295000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 7.26      |
| Running Update Time | 259       |
-----------------------------------
--2024-08-11 15:28:27.698243 UTC---
| Itration            | 260       |
| Real Det Return     | 4.11e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -3.61e+06 |
| Running Env Steps   | 1300000   |
| Running Forward KL  | 10.9      |
| Running Reverse KL  | 7.32      |
| Running Update Time | 260       |
-----------------------------------
--2024-08-11 15:30:26.487918 UTC---
| Itration            | 261       |
| Real Det Return     | 3.64e+03  |
| Real Sto Return     | 3.67e+03  |
| Reward Loss         | -4.26e+06 |
| Running Env Steps   | 1305000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 7.55      |
| Running Update Time | 261       |
-----------------------------------
--2024-08-11 15:32:21.135191 UTC---
| Itration            | 262       |
| Real Det Return     | 4.03e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -3.42e+06 |
| Running Env Steps   | 1310000   |
| Running Forward KL  | 10.6      |
| Running Reverse KL  | 6.97      |
| Running Update Time | 262       |
-----------------------------------
--2024-08-11 15:34:18.288118 UTC---
| Itration            | 263       |
| Real Det Return     | 4.09e+03  |
| Real Sto Return     | 3.66e+03  |
| Reward Loss         | -3.62e+06 |
| Running Env Steps   | 1315000   |
| Running Forward KL  | 10.5      |
| Running Reverse KL  | 6.83      |
| Running Update Time | 263       |
-----------------------------------
--2024-08-11 15:36:16.613905 UTC---
| Itration            | 264       |
| Real Det Return     | 3.51e+03  |
| Real Sto Return     | 3.47e+03  |
| Reward Loss         | -4.68e+06 |
| Running Env Steps   | 1320000   |
| Running Forward KL  | 12.1      |
| Running Reverse KL  | 7.24      |
| Running Update Time | 264       |
-----------------------------------
--2024-08-11 15:38:10.676957 UTC---
| Itration            | 265       |
| Real Det Return     | 3.69e+03  |
| Real Sto Return     | 3.87e+03  |
| Reward Loss         | -3.87e+06 |
| Running Env Steps   | 1325000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 265       |
-----------------------------------
--2024-08-11 15:40:07.696339 UTC---
| Itration            | 266       |
| Real Det Return     | 4.03e+03  |
| Real Sto Return     | 3.23e+03  |
| Reward Loss         | -4.87e+06 |
| Running Env Steps   | 1330000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 266       |
-----------------------------------
--2024-08-11 15:42:04.648344 UTC--
| Itration            | 267      |
| Real Det Return     | 4.12e+03 |
| Real Sto Return     | 3.61e+03 |
| Reward Loss         | -3.9e+06 |
| Running Env Steps   | 1335000  |
| Running Forward KL  | 10.5     |
| Running Reverse KL  | 7.04     |
| Running Update Time | 267      |
----------------------------------
--2024-08-11 15:43:59.128962 UTC---
| Itration            | 268       |
| Real Det Return     | 3.52e+03  |
| Real Sto Return     | 3.6e+03   |
| Reward Loss         | -4.35e+06 |
| Running Env Steps   | 1340000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 7.11      |
| Running Update Time | 268       |
-----------------------------------
--2024-08-11 15:45:58.687206 UTC---
| Itration            | 269       |
| Real Det Return     | 4.2e+03   |
| Real Sto Return     | 3.56e+03  |
| Reward Loss         | -4.47e+06 |
| Running Env Steps   | 1345000   |
| Running Forward KL  | 10.8      |
| Running Reverse KL  | 36.9      |
| Running Update Time | 269       |
-----------------------------------
--2024-08-11 15:47:54.761532 UTC---
| Itration            | 270       |
| Real Det Return     | 3.65e+03  |
| Real Sto Return     | 3.75e+03  |
| Reward Loss         | -5.11e+06 |
| Running Env Steps   | 1350000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 29.2      |
| Running Update Time | 270       |
-----------------------------------
--2024-08-11 15:49:48.464236 UTC---
| Itration            | 271       |
| Real Det Return     | 4.01e+03  |
| Real Sto Return     | 3.48e+03  |
| Reward Loss         | -5.82e+06 |
| Running Env Steps   | 1355000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 68.4      |
| Running Update Time | 271       |
-----------------------------------
--2024-08-11 15:51:48.197305 UTC---
| Itration            | 272       |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 3.46e+03  |
| Reward Loss         | -4.65e+06 |
| Running Env Steps   | 1360000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 7.82      |
| Running Update Time | 272       |
-----------------------------------
--2024-08-11 15:53:43.509687 UTC---
| Itration            | 273       |
| Real Det Return     | 3.44e+03  |
| Real Sto Return     | 3.57e+03  |
| Reward Loss         | -4.31e+06 |
| Running Env Steps   | 1365000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 8.11      |
| Running Update Time | 273       |
-----------------------------------
--2024-08-11 15:55:37.920036 UTC--
| Itration            | 274      |
| Real Det Return     | 4.52e+03 |
| Real Sto Return     | 3.87e+03 |
| Reward Loss         | -3.9e+06 |
| Running Env Steps   | 1370000  |
| Running Forward KL  | 10.9     |
| Running Reverse KL  | 7.51     |
| Running Update Time | 274      |
----------------------------------
--2024-08-11 15:57:37.405109 UTC--
| Itration            | 275      |
| Real Det Return     | 4.23e+03 |
| Real Sto Return     | 4.2e+03  |
| Reward Loss         | -2.4e+06 |
| Running Env Steps   | 1375000  |
| Running Forward KL  | 9.74     |
| Running Reverse KL  | 5.77     |
| Running Update Time | 275      |
----------------------------------
--2024-08-11 15:59:32.931104 UTC--
| Itration            | 276      |
| Real Det Return     | 3.78e+03 |
| Real Sto Return     | 3.84e+03 |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 1380000  |
| Running Forward KL  | 10.3     |
| Running Reverse KL  | 6.8      |
| Running Update Time | 276      |
----------------------------------
--2024-08-11 16:01:27.432495 UTC---
| Itration            | 277       |
| Real Det Return     | 4.25e+03  |
| Real Sto Return     | 3.65e+03  |
| Reward Loss         | -4.02e+06 |
| Running Env Steps   | 1385000   |
| Running Forward KL  | 10.6      |
| Running Reverse KL  | 6.58      |
| Running Update Time | 277       |
-----------------------------------
--2024-08-11 16:03:26.957258 UTC---
| Itration            | 278       |
| Real Det Return     | 3.8e+03   |
| Real Sto Return     | 3.67e+03  |
| Reward Loss         | -4.26e+06 |
| Running Env Steps   | 1390000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 7.47      |
| Running Update Time | 278       |
-----------------------------------
--2024-08-11 16:05:22.559682 UTC---
| Itration            | 279       |
| Real Det Return     | 4.33e+03  |
| Real Sto Return     | 3.93e+03  |
| Reward Loss         | -2.73e+06 |
| Running Env Steps   | 1395000   |
| Running Forward KL  | 9.73      |
| Running Reverse KL  | 5.67      |
| Running Update Time | 279       |
-----------------------------------
--2024-08-11 16:07:17.919146 UTC---
| Itration            | 280       |
| Real Det Return     | 3.68e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -3.88e+06 |
| Running Env Steps   | 1400000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 7.51      |
| Running Update Time | 280       |
-----------------------------------
--2024-08-11 16:09:18.608236 UTC---
| Itration            | 281       |
| Real Det Return     | 3.65e+03  |
| Real Sto Return     | 3.88e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1405000   |
| Running Forward KL  | 10.2      |
| Running Reverse KL  | 7.74      |
| Running Update Time | 281       |
-----------------------------------
--2024-08-11 16:11:13.844883 UTC--
| Itration            | 282      |
| Real Det Return     | 3.6e+03  |
| Real Sto Return     | 3.72e+03 |
| Reward Loss         | -4e+06   |
| Running Env Steps   | 1410000  |
| Running Forward KL  | 11.4     |
| Running Reverse KL  | 7.55     |
| Running Update Time | 282      |
----------------------------------
--2024-08-11 16:13:09.619099 UTC---
| Itration            | 283       |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 3.84e+03  |
| Reward Loss         | -3.96e+06 |
| Running Env Steps   | 1415000   |
| Running Forward KL  | 11.4      |
| Running Reverse KL  | 7.36      |
| Running Update Time | 283       |
-----------------------------------
--2024-08-11 16:15:09.176066 UTC---
| Itration            | 284       |
| Real Det Return     | 3.56e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -4.02e+06 |
| Running Env Steps   | 1420000   |
| Running Forward KL  | 10.5      |
| Running Reverse KL  | 7.26      |
| Running Update Time | 284       |
-----------------------------------
--2024-08-11 16:17:03.883299 UTC---
| Itration            | 285       |
| Real Det Return     | 4.25e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -3.24e+06 |
| Running Env Steps   | 1425000   |
| Running Forward KL  | 9.58      |
| Running Reverse KL  | 6.84      |
| Running Update Time | 285       |
-----------------------------------
--2024-08-11 16:18:59.744655 UTC---
| Itration            | 286       |
| Real Det Return     | 3.48e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 1430000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 6.36      |
| Running Update Time | 286       |
-----------------------------------
--2024-08-11 16:20:58.152326 UTC---
| Itration            | 287       |
| Real Det Return     | 4.12e+03  |
| Real Sto Return     | 3.98e+03  |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 1435000   |
| Running Forward KL  | 10.8      |
| Running Reverse KL  | 7.32      |
| Running Update Time | 287       |
-----------------------------------
--2024-08-11 16:22:52.786389 UTC---
| Itration            | 288       |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -3.37e+06 |
| Running Env Steps   | 1440000   |
| Running Forward KL  | 10.2      |
| Running Reverse KL  | 6.22      |
| Running Update Time | 288       |
-----------------------------------
--2024-08-11 16:24:50.334915 UTC---
| Itration            | 289       |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 4.09e+03  |
| Reward Loss         | -3.07e+06 |
| Running Env Steps   | 1445000   |
| Running Forward KL  | 9.81      |
| Running Reverse KL  | 6.33      |
| Running Update Time | 289       |
-----------------------------------
--2024-08-11 16:26:47.024607 UTC---
| Itration            | 290       |
| Real Det Return     | 4.29e+03  |
| Real Sto Return     | 3.86e+03  |
| Reward Loss         | -2.88e+06 |
| Running Env Steps   | 1450000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 7.74      |
| Running Update Time | 290       |
-----------------------------------
--2024-08-11 16:28:41.323582 UTC---
| Itration            | 291       |
| Real Det Return     | 4.19e+03  |
| Real Sto Return     | 4.14e+03  |
| Reward Loss         | -3.14e+06 |
| Running Env Steps   | 1455000   |
| Running Forward KL  | 9.23      |
| Running Reverse KL  | 16        |
| Running Update Time | 291       |
-----------------------------------
--2024-08-11 16:30:38.167647 UTC---
| Itration            | 292       |
| Real Det Return     | 4.38e+03  |
| Real Sto Return     | 3.66e+03  |
| Reward Loss         | -3.62e+06 |
| Running Env Steps   | 1460000   |
| Running Forward KL  | 10.5      |
| Running Reverse KL  | 31.6      |
| Running Update Time | 292       |
-----------------------------------
--2024-08-11 16:32:34.118769 UTC---
| Itration            | 293       |
| Real Det Return     | 3.58e+03  |
| Real Sto Return     | 3.78e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1465000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 64.5      |
| Running Update Time | 293       |
-----------------------------------
--2024-08-11 16:34:24.382382 UTC---
| Itration            | 294       |
| Real Det Return     | 4.41e+03  |
| Real Sto Return     | 3.45e+03  |
| Reward Loss         | -3.57e+06 |
| Running Env Steps   | 1470000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 172       |
| Running Update Time | 294       |
-----------------------------------
--2024-08-11 16:36:23.410966 UTC---
| Itration            | 295       |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -2.58e+06 |
| Running Env Steps   | 1475000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 7.05      |
| Running Update Time | 295       |
-----------------------------------
--2024-08-11 16:38:18.443476 UTC---
| Itration            | 296       |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -2.89e+06 |
| Running Env Steps   | 1480000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 7.75      |
| Running Update Time | 296       |
-----------------------------------
--2024-08-11 16:40:12.088752 UTC---
| Itration            | 297       |
| Real Det Return     | 3.99e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 1485000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 8.37      |
| Running Update Time | 297       |
-----------------------------------
--2024-08-11 16:42:09.995130 UTC---
| Itration            | 298       |
| Real Det Return     | 4.21e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -2.66e+06 |
| Running Env Steps   | 1490000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 8.15      |
| Running Update Time | 298       |
-----------------------------------
--2024-08-11 16:44:04.206237 UTC--
| Itration            | 299      |
| Real Det Return     | 4.49e+03 |
| Real Sto Return     | 4.31e+03 |
| Reward Loss         | -2.3e+06 |
| Running Env Steps   | 1495000  |
| Running Forward KL  | 10.7     |
| Running Reverse KL  | 7.57     |
| Running Update Time | 299      |
----------------------------------
--2024-08-11 16:45:58.248973 UTC---
| Itration            | 300       |
| Real Det Return     | 4.21e+03  |
| Real Sto Return     | 3.78e+03  |
| Reward Loss         | -3.73e+06 |
| Running Env Steps   | 1500000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 75.8      |
| Running Update Time | 300       |
-----------------------------------
--2024-08-11 16:47:27.095374 UTC--
| Itration            | 301      |
| Real Det Return     | 590      |
| Real Sto Return     | 2.36e+03 |
| Reward Loss         | -6.6e+06 |
| Running Env Steps   | 1505000  |
| Running Forward KL  | 12.9     |
| Running Reverse KL  | 221      |
| Running Update Time | 301      |
----------------------------------
--2024-08-11 16:49:20.135538 UTC---
| Itration            | 302       |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -2.75e+06 |
| Running Env Steps   | 1510000   |
| Running Forward KL  | 10.5      |
| Running Reverse KL  | 41.2      |
| Running Update Time | 302       |
-----------------------------------
--2024-08-11 16:51:18.647509 UTC---
| Itration            | 303       |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1515000   |
| Running Forward KL  | 9.47      |
| Running Reverse KL  | 7.59      |
| Running Update Time | 303       |
-----------------------------------
--2024-08-11 16:53:13.365739 UTC--
| Itration            | 304      |
| Real Det Return     | 4.35e+03 |
| Real Sto Return     | 4.29e+03 |
| Reward Loss         | -2.4e+06 |
| Running Env Steps   | 1520000  |
| Running Forward KL  | 11.1     |
| Running Reverse KL  | 7.89     |
| Running Update Time | 304      |
----------------------------------
--2024-08-11 16:55:08.294108 UTC---
| Itration            | 305       |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 1525000   |
| Running Forward KL  | 9.36      |
| Running Reverse KL  | 7.12      |
| Running Update Time | 305       |
-----------------------------------
--2024-08-11 16:57:04.891702 UTC---
| Itration            | 306       |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -2.32e+06 |
| Running Env Steps   | 1530000   |
| Running Forward KL  | 9.85      |
| Running Reverse KL  | 6.94      |
| Running Update Time | 306       |
-----------------------------------
--2024-08-11 16:58:54.581273 UTC---
| Itration            | 307       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 3.41e+03  |
| Reward Loss         | -3.78e+06 |
| Running Env Steps   | 1535000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 103       |
| Running Update Time | 307       |
-----------------------------------
--2024-08-11 17:00:48.167698 UTC---
| Itration            | 308       |
| Real Det Return     | 4.62e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 1540000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 8.99      |
| Running Update Time | 308       |
-----------------------------------
--2024-08-11 17:02:45.142863 UTC---
| Itration            | 309       |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -1.95e+06 |
| Running Env Steps   | 1545000   |
| Running Forward KL  | 10.2      |
| Running Reverse KL  | 7.67      |
| Running Update Time | 309       |
-----------------------------------
--2024-08-11 17:04:14.421710 UTC---
| Itration            | 310       |
| Real Det Return     | 934       |
| Real Sto Return     | 2.42e+03  |
| Reward Loss         | -4.73e+06 |
| Running Env Steps   | 1550000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 176       |
| Running Update Time | 310       |
-----------------------------------
--2024-08-11 17:06:11.486213 UTC---
| Itration            | 311       |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 4.35e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 1555000   |
| Running Forward KL  | 10.5      |
| Running Reverse KL  | 7.45      |
| Running Update Time | 311       |
-----------------------------------
--2024-08-11 17:08:05.487531 UTC---
| Itration            | 312       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -1.92e+06 |
| Running Env Steps   | 1560000   |
| Running Forward KL  | 10.2      |
| Running Reverse KL  | 7.13      |
| Running Update Time | 312       |
-----------------------------------
--2024-08-11 17:10:01.588616 UTC---
| Itration            | 313       |
| Real Det Return     | 4.18e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -2.84e+06 |
| Running Env Steps   | 1565000   |
| Running Forward KL  | 11.4      |
| Running Reverse KL  | 8.53      |
| Running Update Time | 313       |
-----------------------------------
--2024-08-11 17:11:58.389741 UTC---
| Itration            | 314       |
| Real Det Return     | 4.68e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.65e+06 |
| Running Env Steps   | 1570000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 7.03      |
| Running Update Time | 314       |
-----------------------------------
--2024-08-11 17:13:53.337931 UTC---
| Itration            | 315       |
| Real Det Return     | 4.53e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 1575000   |
| Running Forward KL  | 9.29      |
| Running Reverse KL  | 6.88      |
| Running Update Time | 315       |
-----------------------------------
--2024-08-11 17:15:49.427889 UTC---
| Itration            | 316       |
| Real Det Return     | 4.65e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 1580000   |
| Running Forward KL  | 9.91      |
| Running Reverse KL  | 6.88      |
| Running Update Time | 316       |
-----------------------------------
--2024-08-11 17:17:22.952474 UTC---
| Itration            | 317       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -2.29e+06 |
| Running Env Steps   | 1585000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 7.03      |
| Running Update Time | 317       |
-----------------------------------
--2024-08-11 17:18:55.020899 UTC---
| Itration            | 318       |
| Real Det Return     | 3.81e+03  |
| Real Sto Return     | 4.33e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 1590000   |
| Running Forward KL  | 9.89      |
| Running Reverse KL  | 26.2      |
| Running Update Time | 318       |
-----------------------------------
--2024-08-11 17:20:29.258273 UTC---
| Itration            | 319       |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -2.45e+06 |
| Running Env Steps   | 1595000   |
| Running Forward KL  | 9.69      |
| Running Reverse KL  | 7.6       |
| Running Update Time | 319       |
-----------------------------------
--2024-08-11 17:22:01.007282 UTC--
| Itration            | 320      |
| Real Det Return     | 4.5e+03  |
| Real Sto Return     | 4.53e+03 |
| Reward Loss         | -1.7e+06 |
| Running Env Steps   | 1600000  |
| Running Forward KL  | 9.89     |
| Running Reverse KL  | 17.9     |
| Running Update Time | 320      |
----------------------------------
--2024-08-11 17:23:37.019762 UTC---
| Itration            | 321       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 1605000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 7.3       |
| Running Update Time | 321       |
-----------------------------------
--2024-08-11 17:25:09.399761 UTC---
| Itration            | 322       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -1.64e+06 |
| Running Env Steps   | 1610000   |
| Running Forward KL  | 9.64      |
| Running Reverse KL  | 7.66      |
| Running Update Time | 322       |
-----------------------------------
--2024-08-11 17:26:39.931116 UTC---
| Itration            | 323       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.06e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 1615000   |
| Running Forward KL  | 9.87      |
| Running Reverse KL  | 61.2      |
| Running Update Time | 323       |
-----------------------------------
--2024-08-11 17:28:17.304157 UTC---
| Itration            | 324       |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 1620000   |
| Running Forward KL  | 9.65      |
| Running Reverse KL  | 6.08      |
| Running Update Time | 324       |
-----------------------------------
--2024-08-11 17:29:50.853641 UTC---
| Itration            | 325       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.47e+03  |
| Reward Loss         | -2.03e+06 |
| Running Env Steps   | 1625000   |
| Running Forward KL  | 9.43      |
| Running Reverse KL  | 6         |
| Running Update Time | 325       |
-----------------------------------
--2024-08-11 17:31:21.329245 UTC---
| Itration            | 326       |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 2.97e+03  |
| Reward Loss         | -2.97e+06 |
| Running Env Steps   | 1630000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 40.6      |
| Running Update Time | 326       |
-----------------------------------
--2024-08-11 17:32:57.699600 UTC---
| Itration            | 327       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 1635000   |
| Running Forward KL  | 8.72      |
| Running Reverse KL  | 5.26      |
| Running Update Time | 327       |
-----------------------------------
--2024-08-11 17:34:27.801431 UTC---
| Itration            | 328       |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 1640000   |
| Running Forward KL  | 8.88      |
| Running Reverse KL  | 26.4      |
| Running Update Time | 328       |
-----------------------------------
--2024-08-11 17:36:02.304137 UTC---
| Itration            | 329       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 1645000   |
| Running Forward KL  | 8.89      |
| Running Reverse KL  | 5.59      |
| Running Update Time | 329       |
-----------------------------------
--2024-08-11 17:37:36.549499 UTC---
| Itration            | 330       |
| Real Det Return     | 4.61e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 1650000   |
| Running Forward KL  | 9.9       |
| Running Reverse KL  | 24.7      |
| Running Update Time | 330       |
-----------------------------------
--2024-08-11 17:39:08.678454 UTC---
| Itration            | 331       |
| Real Det Return     | 4.49e+03  |
| Real Sto Return     | 4.39e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 1655000   |
| Running Forward KL  | 8.71      |
| Running Reverse KL  | 16.3      |
| Running Update Time | 331       |
-----------------------------------
--2024-08-11 17:40:44.706103 UTC---
| Itration            | 332       |
| Real Det Return     | 4.62e+03  |
| Real Sto Return     | 4.44e+03  |
| Reward Loss         | -2.24e+06 |
| Running Env Steps   | 1660000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 6.74      |
| Running Update Time | 332       |
-----------------------------------
--2024-08-11 17:42:17.013467 UTC---
| Itration            | 333       |
| Real Det Return     | 4.35e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.59e+06 |
| Running Env Steps   | 1665000   |
| Running Forward KL  | 9.59      |
| Running Reverse KL  | 6.63      |
| Running Update Time | 333       |
-----------------------------------
--2024-08-11 17:43:51.221796 UTC---
| Itration            | 334       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -2.18e+06 |
| Running Env Steps   | 1670000   |
| Running Forward KL  | 9.01      |
| Running Reverse KL  | 6.02      |
| Running Update Time | 334       |
-----------------------------------
--2024-08-11 17:45:27.087876 UTC---
| Itration            | 335       |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 1675000   |
| Running Forward KL  | 8.59      |
| Running Reverse KL  | 5.59      |
| Running Update Time | 335       |
-----------------------------------
--2024-08-11 17:46:59.918298 UTC---
| Itration            | 336       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 1680000   |
| Running Forward KL  | 8.15      |
| Running Reverse KL  | 6.86      |
| Running Update Time | 336       |
-----------------------------------
--2024-08-11 17:48:32.977401 UTC---
| Itration            | 337       |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 1685000   |
| Running Forward KL  | 7.74      |
| Running Reverse KL  | 4.96      |
| Running Update Time | 337       |
-----------------------------------
--2024-08-11 17:50:10.079124 UTC---
| Itration            | 338       |
| Real Det Return     | 4.62e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 1690000   |
| Running Forward KL  | 9.6       |
| Running Reverse KL  | 6.17      |
| Running Update Time | 338       |
-----------------------------------
--2024-08-11 17:51:41.488720 UTC---
| Itration            | 339       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.98e+06 |
| Running Env Steps   | 1695000   |
| Running Forward KL  | 7.91      |
| Running Reverse KL  | 5.33      |
| Running Update Time | 339       |
-----------------------------------
--2024-08-11 17:53:14.616377 UTC---
| Itration            | 340       |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 1700000   |
| Running Forward KL  | 8.6       |
| Running Reverse KL  | 5.92      |
| Running Update Time | 340       |
-----------------------------------
--2024-08-11 17:54:50.214385 UTC---
| Itration            | 341       |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -2.43e+06 |
| Running Env Steps   | 1705000   |
| Running Forward KL  | 8.11      |
| Running Reverse KL  | 5.18      |
| Running Update Time | 341       |
-----------------------------------
--2024-08-11 17:56:21.572885 UTC---
| Itration            | 342       |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 1710000   |
| Running Forward KL  | 8.19      |
| Running Reverse KL  | 14.3      |
| Running Update Time | 342       |
-----------------------------------
--2024-08-11 17:57:57.489671 UTC---
| Itration            | 343       |
| Real Det Return     | 4.88e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 1715000   |
| Running Forward KL  | 8.23      |
| Running Reverse KL  | 5.53      |
| Running Update Time | 343       |
-----------------------------------
--2024-08-11 17:59:30.931427 UTC---
| Itration            | 344       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -2.48e+06 |
| Running Env Steps   | 1720000   |
| Running Forward KL  | 8.43      |
| Running Reverse KL  | 5.56      |
| Running Update Time | 344       |
-----------------------------------
--2024-08-11 18:01:04.791018 UTC---
| Itration            | 345       |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 4.25e+03  |
| Reward Loss         | -2.18e+06 |
| Running Env Steps   | 1725000   |
| Running Forward KL  | 7.48      |
| Running Reverse KL  | 4.61      |
| Running Update Time | 345       |
-----------------------------------
--2024-08-11 18:02:40.190610 UTC---
| Itration            | 346       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -1.85e+06 |
| Running Env Steps   | 1730000   |
| Running Forward KL  | 8.39      |
| Running Reverse KL  | 5.02      |
| Running Update Time | 346       |
-----------------------------------
--2024-08-11 18:04:13.293675 UTC---
| Itration            | 347       |
| Real Det Return     | 4.67e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.01e+06 |
| Running Env Steps   | 1735000   |
| Running Forward KL  | 8.38      |
| Running Reverse KL  | 5.63      |
| Running Update Time | 347       |
-----------------------------------
--2024-08-11 18:05:46.421953 UTC--
| Itration            | 348      |
| Real Det Return     | 4.12e+03 |
| Real Sto Return     | 4.51e+03 |
| Reward Loss         | -1.9e+06 |
| Running Env Steps   | 1740000  |
| Running Forward KL  | 7.5      |
| Running Reverse KL  | 4.59     |
| Running Update Time | 348      |
----------------------------------
--2024-08-11 18:07:22.928451 UTC---
| Itration            | 349       |
| Real Det Return     | 4.76e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1745000   |
| Running Forward KL  | 7.91      |
| Running Reverse KL  | 4.37      |
| Running Update Time | 349       |
-----------------------------------
--2024-08-11 18:08:55.956442 UTC--
| Itration            | 350      |
| Real Det Return     | 4.72e+03 |
| Real Sto Return     | 4.62e+03 |
| Reward Loss         | -2e+06   |
| Running Env Steps   | 1750000  |
| Running Forward KL  | 8.05     |
| Running Reverse KL  | 5.03     |
| Running Update Time | 350      |
----------------------------------
--2024-08-11 18:10:29.178266 UTC---
| Itration            | 351       |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 1755000   |
| Running Forward KL  | 7.84      |
| Running Reverse KL  | 4.67      |
| Running Update Time | 351       |
-----------------------------------
--2024-08-11 18:12:06.215086 UTC---
| Itration            | 352       |
| Real Det Return     | 4.54e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 1760000   |
| Running Forward KL  | 8.09      |
| Running Reverse KL  | 4.79      |
| Running Update Time | 352       |
-----------------------------------
--2024-08-11 18:13:37.980468 UTC---
| Itration            | 353       |
| Real Det Return     | 4.55e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -2.46e+06 |
| Running Env Steps   | 1765000   |
| Running Forward KL  | 8.68      |
| Running Reverse KL  | 5.36      |
| Running Update Time | 353       |
-----------------------------------
--2024-08-11 18:15:14.446905 UTC---
| Itration            | 354       |
| Real Det Return     | 4.59e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 1770000   |
| Running Forward KL  | 7.61      |
| Running Reverse KL  | 4.45      |
| Running Update Time | 354       |
-----------------------------------
--2024-08-11 18:16:47.965942 UTC---
| Itration            | 355       |
| Real Det Return     | 4.53e+03  |
| Real Sto Return     | 4.4e+03   |
| Reward Loss         | -2.12e+06 |
| Running Env Steps   | 1775000   |
| Running Forward KL  | 8.38      |
| Running Reverse KL  | 5.58      |
| Running Update Time | 355       |
-----------------------------------
--2024-08-11 18:18:20.143230 UTC---
| Itration            | 356       |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.12e+06 |
| Running Env Steps   | 1780000   |
| Running Forward KL  | 7.85      |
| Running Reverse KL  | 32.6      |
| Running Update Time | 356       |
-----------------------------------
--2024-08-11 18:19:55.878706 UTC---
| Itration            | 357       |
| Real Det Return     | 4.47e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -2.26e+06 |
| Running Env Steps   | 1785000   |
| Running Forward KL  | 7.36      |
| Running Reverse KL  | 4.09      |
| Running Update Time | 357       |
-----------------------------------
--2024-08-11 18:21:28.422482 UTC---
| Itration            | 358       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 1790000   |
| Running Forward KL  | 7.73      |
| Running Reverse KL  | 4.55      |
| Running Update Time | 358       |
-----------------------------------
--2024-08-11 18:22:59.354717 UTC---
| Itration            | 359       |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 1795000   |
| Running Forward KL  | 8.49      |
| Running Reverse KL  | 8.85      |
| Running Update Time | 359       |
-----------------------------------
--2024-08-11 18:24:36.417007 UTC--
| Itration            | 360      |
| Real Det Return     | 4.88e+03 |
| Real Sto Return     | 4.72e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 1800000  |
| Running Forward KL  | 7.15     |
| Running Reverse KL  | 4.1      |
| Running Update Time | 360      |
----------------------------------
--2024-08-11 18:26:09.790457 UTC---
| Itration            | 361       |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -2.05e+06 |
| Running Env Steps   | 1805000   |
| Running Forward KL  | 7.95      |
| Running Reverse KL  | 4.81      |
| Running Update Time | 361       |
-----------------------------------
--2024-08-11 18:27:43.121230 UTC---
| Itration            | 362       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -1.66e+06 |
| Running Env Steps   | 1810000   |
| Running Forward KL  | 7         |
| Running Reverse KL  | 3.91      |
| Running Update Time | 362       |
-----------------------------------
--2024-08-11 18:29:20.415692 UTC---
| Itration            | 363       |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -2.19e+06 |
| Running Env Steps   | 1815000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 4.43      |
| Running Update Time | 363       |
-----------------------------------
--2024-08-11 18:30:53.172476 UTC---
| Itration            | 364       |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.88e+06 |
| Running Env Steps   | 1820000   |
| Running Forward KL  | 7.14      |
| Running Reverse KL  | 4.32      |
| Running Update Time | 364       |
-----------------------------------
--2024-08-11 18:32:25.409013 UTC---
| Itration            | 365       |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -2.32e+06 |
| Running Env Steps   | 1825000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 4.25      |
| Running Update Time | 365       |
-----------------------------------
--2024-08-11 18:34:00.525622 UTC---
| Itration            | 366       |
| Real Det Return     | 4.69e+03  |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 1830000   |
| Running Forward KL  | 7.57      |
| Running Reverse KL  | 4.5       |
| Running Update Time | 366       |
-----------------------------------
--2024-08-11 18:35:32.901425 UTC---
| Itration            | 367       |
| Real Det Return     | 4.45e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -1.98e+06 |
| Running Env Steps   | 1835000   |
| Running Forward KL  | 6.86      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 367       |
-----------------------------------
--2024-08-11 18:37:08.703257 UTC---
| Itration            | 368       |
| Real Det Return     | 4.1e+03   |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -2.55e+06 |
| Running Env Steps   | 1840000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 368       |
-----------------------------------
--2024-08-11 18:38:42.394494 UTC---
| Itration            | 369       |
| Real Det Return     | 4.47e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -2.74e+06 |
| Running Env Steps   | 1845000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 369       |
-----------------------------------
--2024-08-11 18:40:15.619865 UTC---
| Itration            | 370       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.66e+06 |
| Running Env Steps   | 1850000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 370       |
-----------------------------------
--2024-08-11 18:41:52.211851 UTC---
| Itration            | 371       |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 1855000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.33      |
| Running Update Time | 371       |
-----------------------------------
--2024-08-11 18:43:24.661273 UTC---
| Itration            | 372       |
| Real Det Return     | 4.25e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -3.01e+06 |
| Running Env Steps   | 1860000   |
| Running Forward KL  | 7.36      |
| Running Reverse KL  | 15.8      |
| Running Update Time | 372       |
-----------------------------------
--2024-08-11 18:44:59.253897 UTC---
| Itration            | 373       |
| Real Det Return     | 4.47e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -2.74e+06 |
| Running Env Steps   | 1865000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 3.72      |
| Running Update Time | 373       |
-----------------------------------
--2024-08-11 18:46:35.730806 UTC---
| Itration            | 374       |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -1.55e+06 |
| Running Env Steps   | 1870000   |
| Running Forward KL  | 7.44      |
| Running Reverse KL  | 4.22      |
| Running Update Time | 374       |
-----------------------------------
--2024-08-11 18:48:08.920543 UTC---
| Itration            | 375       |
| Real Det Return     | 4.69e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -2.23e+06 |
| Running Env Steps   | 1875000   |
| Running Forward KL  | 7.36      |
| Running Reverse KL  | 9.02      |
| Running Update Time | 375       |
-----------------------------------
--2024-08-11 18:49:42.535729 UTC---
| Itration            | 376       |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -2.72e+06 |
| Running Env Steps   | 1880000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 3.89      |
| Running Update Time | 376       |
-----------------------------------
--2024-08-11 18:51:19.387488 UTC---
| Itration            | 377       |
| Real Det Return     | 4.28e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -2.22e+06 |
| Running Env Steps   | 1885000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 4.09      |
| Running Update Time | 377       |
-----------------------------------
--2024-08-11 18:52:51.067740 UTC---
| Itration            | 378       |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 1890000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 3.72      |
| Running Update Time | 378       |
-----------------------------------
--2024-08-11 18:54:26.178966 UTC---
| Itration            | 379       |
| Real Det Return     | 4.59e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 1895000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 4.17      |
| Running Update Time | 379       |
-----------------------------------
--2024-08-11 18:56:00.577835 UTC---
| Itration            | 380       |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -2.16e+06 |
| Running Env Steps   | 1900000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 380       |
-----------------------------------
--2024-08-11 18:57:32.080271 UTC---
| Itration            | 381       |
| Real Det Return     | 4.2e+03   |
| Real Sto Return     | 4.42e+03  |
| Reward Loss         | -2.23e+06 |
| Running Env Steps   | 1905000   |
| Running Forward KL  | 6.87      |
| Running Reverse KL  | 24.2      |
| Running Update Time | 381       |
-----------------------------------
--2024-08-11 18:59:06.805961 UTC---
| Itration            | 382       |
| Real Det Return     | 4.8e+03   |
| Real Sto Return     | 4.35e+03  |
| Reward Loss         | -2.33e+06 |
| Running Env Steps   | 1910000   |
| Running Forward KL  | 6.98      |
| Running Reverse KL  | 33.9      |
| Running Update Time | 382       |
-----------------------------------
--2024-08-11 19:00:41.186325 UTC---
| Itration            | 383       |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.92e+06 |
| Running Env Steps   | 1915000   |
| Running Forward KL  | 6.31      |
| Running Reverse KL  | 3.45      |
| Running Update Time | 383       |
-----------------------------------
--2024-08-11 19:02:14.407225 UTC---
| Itration            | 384       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.33e+06 |
| Running Env Steps   | 1920000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 384       |
-----------------------------------
--2024-08-11 19:03:50.240266 UTC---
| Itration            | 385       |
| Real Det Return     | 4.63e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.14e+06 |
| Running Env Steps   | 1925000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 4.26      |
| Running Update Time | 385       |
-----------------------------------
--2024-08-11 19:05:21.647924 UTC---
| Itration            | 386       |
| Real Det Return     | 4.42e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 1930000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 3.8       |
| Running Update Time | 386       |
-----------------------------------
--2024-08-11 19:06:44.792856 UTC---
| Itration            | 387       |
| Real Det Return     | 1.23e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -4.47e+06 |
| Running Env Steps   | 1935000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 45.1      |
| Running Update Time | 387       |
-----------------------------------
--2024-08-11 19:08:18.396595 UTC---
| Itration            | 388       |
| Real Det Return     | 4.47e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.42e+06 |
| Running Env Steps   | 1940000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 388       |
-----------------------------------
--2024-08-11 19:09:49.240708 UTC---
| Itration            | 389       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -3.08e+06 |
| Running Env Steps   | 1945000   |
| Running Forward KL  | 7.91      |
| Running Reverse KL  | 66.6      |
| Running Update Time | 389       |
-----------------------------------
--2024-08-11 19:11:26.426861 UTC---
| Itration            | 390       |
| Real Det Return     | 4.45e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -2.71e+06 |
| Running Env Steps   | 1950000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 4.25      |
| Running Update Time | 390       |
-----------------------------------
--2024-08-11 19:12:58.946481 UTC---
| Itration            | 391       |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -2.33e+06 |
| Running Env Steps   | 1955000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 30.4      |
| Running Update Time | 391       |
-----------------------------------
--2024-08-11 19:14:31.069744 UTC---
| Itration            | 392       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.72e+06 |
| Running Env Steps   | 1960000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 2.99      |
| Running Update Time | 392       |
-----------------------------------
--2024-08-11 19:16:08.106766 UTC---
| Itration            | 393       |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 1965000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 4.62      |
| Running Update Time | 393       |
-----------------------------------
--2024-08-11 19:17:40.242413 UTC---
| Itration            | 394       |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1970000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 3.73      |
| Running Update Time | 394       |
-----------------------------------
--2024-08-11 19:19:10.183431 UTC---
| Itration            | 395       |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 1975000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 3.6       |
| Running Update Time | 395       |
-----------------------------------
--2024-08-11 19:20:46.609096 UTC---
| Itration            | 396       |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 1980000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 3.49      |
| Running Update Time | 396       |
-----------------------------------
--2024-08-11 19:22:19.044401 UTC--
| Itration            | 397      |
| Real Det Return     | 4.67e+03 |
| Real Sto Return     | 4.6e+03  |
| Reward Loss         | -2.3e+06 |
| Running Env Steps   | 1985000  |
| Running Forward KL  | 7.06     |
| Running Reverse KL  | 13.9     |
| Running Update Time | 397      |
----------------------------------
--2024-08-11 19:23:54.059637 UTC---
| Itration            | 398       |
| Real Det Return     | 4.61e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -1.96e+06 |
| Running Env Steps   | 1990000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 3.59      |
| Running Update Time | 398       |
-----------------------------------
--2024-08-11 19:25:26.812148 UTC---
| Itration            | 399       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 1995000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 58.6      |
| Running Update Time | 399       |
-----------------------------------
--2024-08-11 19:26:59.592169 UTC---
| Itration            | 400       |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2000000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 3.28      |
| Running Update Time | 400       |
-----------------------------------
--2024-08-11 19:28:26.910538 UTC---
| Itration            | 401       |
| Real Det Return     | 2.72e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2005000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 4.05      |
| Running Update Time | 401       |
-----------------------------------
--2024-08-11 19:29:59.060519 UTC---
| Itration            | 402       |
| Real Det Return     | 4.8e+03   |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2010000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 3.49      |
| Running Update Time | 402       |
-----------------------------------
--2024-08-11 19:31:31.620000 UTC---
| Itration            | 403       |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -2.11e+06 |
| Running Env Steps   | 2015000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 403       |
-----------------------------------
--2024-08-11 19:33:07.897964 UTC---
| Itration            | 404       |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -1.57e+06 |
| Running Env Steps   | 2020000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 404       |
-----------------------------------
--2024-08-11 19:34:19.214741 UTC---
| Itration            | 405       |
| Real Det Return     | 169       |
| Real Sto Return     | 3.68e+03  |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 2025000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 42        |
| Running Update Time | 405       |
-----------------------------------
--2024-08-11 19:35:54.887875 UTC---
| Itration            | 406       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 2030000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 2.83      |
| Running Update Time | 406       |
-----------------------------------
--2024-08-11 19:37:24.469323 UTC---
| Itration            | 407       |
| Real Det Return     | 4.52e+03  |
| Real Sto Return     | 3.79e+03  |
| Reward Loss         | -1.98e+06 |
| Running Env Steps   | 2035000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 24.1      |
| Running Update Time | 407       |
-----------------------------------
--2024-08-11 19:38:57.077984 UTC---
| Itration            | 408       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 2040000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 2.9       |
| Running Update Time | 408       |
-----------------------------------
--2024-08-11 19:40:33.261971 UTC--
| Itration            | 409      |
| Real Det Return     | 4.84e+03 |
| Real Sto Return     | 4.83e+03 |
| Reward Loss         | -1.3e+06 |
| Running Env Steps   | 2045000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 3.81     |
| Running Update Time | 409      |
----------------------------------
--2024-08-11 19:42:05.652085 UTC--
| Itration            | 410      |
| Real Det Return     | 4.63e+03 |
| Real Sto Return     | 4.74e+03 |
| Reward Loss         | -1.7e+06 |
| Running Env Steps   | 2050000  |
| Running Forward KL  | 6.03     |
| Running Reverse KL  | 3.91     |
| Running Update Time | 410      |
----------------------------------
--2024-08-11 19:43:38.842576 UTC---
| Itration            | 411       |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 2055000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 2.84      |
| Running Update Time | 411       |
-----------------------------------
--2024-08-11 19:45:13.852539 UTC---
| Itration            | 412       |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 2060000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 412       |
-----------------------------------
--2024-08-11 19:46:45.685806 UTC---
| Itration            | 413       |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -2.02e+06 |
| Running Env Steps   | 2065000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 4.3       |
| Running Update Time | 413       |
-----------------------------------
--2024-08-11 19:48:19.896808 UTC---
| Itration            | 414       |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.73e+06 |
| Running Env Steps   | 2070000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 4         |
| Running Update Time | 414       |
-----------------------------------
--2024-08-11 19:49:53.492287 UTC---
| Itration            | 415       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2075000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 21.8      |
| Running Update Time | 415       |
-----------------------------------
--2024-08-11 19:51:26.686254 UTC---
| Itration            | 416       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 2080000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 416       |
-----------------------------------
--2024-08-11 19:53:02.073043 UTC---
| Itration            | 417       |
| Real Det Return     | 4.86e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -9.53e+05 |
| Running Env Steps   | 2085000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 2.73      |
| Running Update Time | 417       |
-----------------------------------
--2024-08-11 19:54:32.640623 UTC---
| Itration            | 418       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 2090000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 44.7      |
| Running Update Time | 418       |
-----------------------------------
--2024-08-11 19:56:02.794787 UTC---
| Itration            | 419       |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 2095000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 3.28      |
| Running Update Time | 419       |
-----------------------------------
--2024-08-11 19:57:39.205456 UTC---
| Itration            | 420       |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2100000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 2.42      |
| Running Update Time | 420       |
-----------------------------------
--2024-08-11 19:59:10.479699 UTC---
| Itration            | 421       |
| Real Det Return     | 4.62e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 2105000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 17.5      |
| Running Update Time | 421       |
-----------------------------------
--2024-08-11 20:00:41.186840 UTC---
| Itration            | 422       |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 2110000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 7.9       |
| Running Update Time | 422       |
-----------------------------------
--2024-08-11 20:02:14.963012 UTC---
| Itration            | 423       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.4e+03   |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 2115000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 21.1      |
| Running Update Time | 423       |
-----------------------------------
--2024-08-11 20:03:47.083434 UTC---
| Itration            | 424       |
| Real Det Return     | 4.93e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 2120000   |
| Running Forward KL  | 4.97      |
| Running Reverse KL  | 3.74      |
| Running Update Time | 424       |
-----------------------------------
--2024-08-11 20:05:15.054116 UTC---
| Itration            | 425       |
| Real Det Return     | 4.45e+03  |
| Real Sto Return     | 2.86e+03  |
| Reward Loss         | -6.97e+06 |
| Running Env Steps   | 2125000   |
| Running Forward KL  | 7.07      |
| Running Reverse KL  | 159       |
| Running Update Time | 425       |
-----------------------------------
--2024-08-11 20:06:48.451943 UTC--
| Itration            | 426      |
| Real Det Return     | 4.79e+03 |
| Real Sto Return     | 4.83e+03 |
| Reward Loss         | -1.5e+06 |
| Running Env Steps   | 2130000  |
| Running Forward KL  | 5.61     |
| Running Reverse KL  | 2.48     |
| Running Update Time | 426      |
----------------------------------
--2024-08-11 20:08:20.315053 UTC---
| Itration            | 427       |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 2135000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 14        |
| Running Update Time | 427       |
-----------------------------------
--2024-08-11 20:09:56.786846 UTC--
| Itration            | 428      |
| Real Det Return     | 4.43e+03 |
| Real Sto Return     | 4.76e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 2140000  |
| Running Forward KL  | 5.63     |
| Running Reverse KL  | 3        |
| Running Update Time | 428      |
----------------------------------
--2024-08-11 20:11:29.049788 UTC---
| Itration            | 429       |
| Real Det Return     | 4.68e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -1.84e+06 |
| Running Env Steps   | 2145000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 2.62      |
| Running Update Time | 429       |
-----------------------------------
--2024-08-11 20:13:01.823702 UTC---
| Itration            | 430       |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 2150000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 2.65      |
| Running Update Time | 430       |
-----------------------------------
--2024-08-11 20:14:39.489978 UTC---
| Itration            | 431       |
| Real Det Return     | 4.88e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 2155000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 2.89      |
| Running Update Time | 431       |
-----------------------------------
--2024-08-11 20:16:11.696097 UTC---
| Itration            | 432       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 2160000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 432       |
-----------------------------------
--2024-08-11 20:17:46.039045 UTC---
| Itration            | 433       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -8.32e+05 |
| Running Env Steps   | 2165000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 25.2      |
| Running Update Time | 433       |
-----------------------------------
--2024-08-11 20:19:21.373358 UTC---
| Itration            | 434       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2170000   |
| Running Forward KL  | 5.25      |
| Running Reverse KL  | 2.36      |
| Running Update Time | 434       |
-----------------------------------
--2024-08-11 20:20:52.299042 UTC---
| Itration            | 435       |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.32e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 2175000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 83.8      |
| Running Update Time | 435       |
-----------------------------------
--2024-08-11 20:22:28.903880 UTC---
| Itration            | 436       |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 2180000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 3         |
| Running Update Time | 436       |
-----------------------------------
--2024-08-11 20:24:02.826313 UTC---
| Itration            | 437       |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 2185000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 2.32      |
| Running Update Time | 437       |
-----------------------------------
--2024-08-11 20:25:34.736939 UTC---
| Itration            | 438       |
| Real Det Return     | 4.78e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 2190000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 30.6      |
| Running Update Time | 438       |
-----------------------------------
--2024-08-11 20:27:11.350073 UTC---
| Itration            | 439       |
| Real Det Return     | 4.81e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.33e+06 |
| Running Env Steps   | 2195000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 48.7      |
| Running Update Time | 439       |
-----------------------------------
--2024-08-11 20:28:43.792707 UTC---
| Itration            | 440       |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 2200000   |
| Running Forward KL  | 5.03      |
| Running Reverse KL  | 18.2      |
| Running Update Time | 440       |
-----------------------------------
--2024-08-11 20:30:16.817369 UTC---
| Itration            | 441       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 2205000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 16.2      |
| Running Update Time | 441       |
-----------------------------------
--2024-08-11 20:31:52.225977 UTC--
| Itration            | 442      |
| Real Det Return     | 4.54e+03 |
| Real Sto Return     | 4.41e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 2210000  |
| Running Forward KL  | 5.03     |
| Running Reverse KL  | 29.7     |
| Running Update Time | 442      |
----------------------------------
--2024-08-11 20:33:24.175001 UTC---
| Itration            | 443       |
| Real Det Return     | 4.81e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 2215000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.7       |
| Running Update Time | 443       |
-----------------------------------
--2024-08-11 20:34:56.963707 UTC---
| Itration            | 444       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.47e+03  |
| Reward Loss         | -9.51e+05 |
| Running Env Steps   | 2220000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 59.7      |
| Running Update Time | 444       |
-----------------------------------
--2024-08-11 20:36:31.903730 UTC---
| Itration            | 445       |
| Real Det Return     | 4.89e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 2225000   |
| Running Forward KL  | 5.07      |
| Running Reverse KL  | 20.3      |
| Running Update Time | 445       |
-----------------------------------
--2024-08-11 20:38:05.135747 UTC---
| Itration            | 446       |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 2230000   |
| Running Forward KL  | 5.38      |
| Running Reverse KL  | 2.41      |
| Running Update Time | 446       |
-----------------------------------
--2024-08-11 20:39:40.148021 UTC---
| Itration            | 447       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 2235000   |
| Running Forward KL  | 5.42      |
| Running Reverse KL  | 2.77      |
| Running Update Time | 447       |
-----------------------------------
--2024-08-11 20:41:08.775738 UTC---
| Itration            | 448       |
| Real Det Return     | 4.58e+03  |
| Real Sto Return     | 3.39e+03  |
| Reward Loss         | -1.88e+06 |
| Running Env Steps   | 2240000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 77.7      |
| Running Update Time | 448       |
-----------------------------------
--2024-08-11 20:42:40.805186 UTC---
| Itration            | 449       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -9.44e+05 |
| Running Env Steps   | 2245000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 9.46      |
| Running Update Time | 449       |
-----------------------------------
--2024-08-11 20:44:17.577001 UTC---
| Itration            | 450       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 2250000   |
| Running Forward KL  | 4.66      |
| Running Reverse KL  | 2.43      |
| Running Update Time | 450       |
-----------------------------------
--2024-08-11 20:45:51.511109 UTC--
| Itration            | 451      |
| Real Det Return     | 4.8e+03  |
| Real Sto Return     | 4.84e+03 |
| Reward Loss         | -9.7e+05 |
| Running Env Steps   | 2255000  |
| Running Forward KL  | 4.44     |
| Running Reverse KL  | 1.52     |
| Running Update Time | 451      |
----------------------------------
--2024-08-11 20:47:20.219439 UTC---
| Itration            | 452       |
| Real Det Return     | 4.75e+03  |
| Real Sto Return     | 4.12e+03  |
| Reward Loss         | -1.55e+06 |
| Running Env Steps   | 2260000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 59.4      |
| Running Update Time | 452       |
-----------------------------------
--2024-08-11 20:48:56.604863 UTC---
| Itration            | 453       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -9.09e+05 |
| Running Env Steps   | 2265000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 2.05      |
| Running Update Time | 453       |
-----------------------------------
--2024-08-11 20:50:28.326683 UTC--
| Itration            | 454      |
| Real Det Return     | 4.97e+03 |
| Real Sto Return     | 4.7e+03  |
| Reward Loss         | -8.6e+05 |
| Running Env Steps   | 2270000  |
| Running Forward KL  | 5.42     |
| Running Reverse KL  | 2.9      |
| Running Update Time | 454      |
----------------------------------
--2024-08-11 20:52:03.415414 UTC---
| Itration            | 455       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 2275000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 2.2       |
| Running Update Time | 455       |
-----------------------------------
--2024-08-11 20:53:37.488056 UTC--
| Itration            | 456      |
| Real Det Return     | 4.89e+03 |
| Real Sto Return     | 4.78e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 2280000  |
| Running Forward KL  | 5.96     |
| Running Reverse KL  | 18.5     |
| Running Update Time | 456      |
----------------------------------
--2024-08-11 20:55:08.008541 UTC---
| Itration            | 457       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -8.69e+05 |
| Running Env Steps   | 2285000   |
| Running Forward KL  | 5.18      |
| Running Reverse KL  | 3.24      |
| Running Update Time | 457       |
-----------------------------------
--2024-08-11 20:56:34.374422 UTC---
| Itration            | 458       |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 2.72e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 2290000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 163       |
| Running Update Time | 458       |
-----------------------------------
--2024-08-11 20:58:06.781819 UTC---
| Itration            | 459       |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 2295000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 3.07      |
| Running Update Time | 459       |
-----------------------------------
--2024-08-11 20:59:41.156484 UTC---
| Itration            | 460       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 2300000   |
| Running Forward KL  | 4.83      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 460       |
-----------------------------------
--2024-08-11 21:01:17.178827 UTC---
| Itration            | 461       |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 2305000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 74        |
| Running Update Time | 461       |
-----------------------------------
--2024-08-11 21:02:50.166150 UTC---
| Itration            | 462       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -9.12e+05 |
| Running Env Steps   | 2310000   |
| Running Forward KL  | 5.27      |
| Running Reverse KL  | 9.19      |
| Running Update Time | 462       |
-----------------------------------
--2024-08-11 21:04:21.162860 UTC---
| Itration            | 463       |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -1.86e+06 |
| Running Env Steps   | 2315000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 13.8      |
| Running Update Time | 463       |
-----------------------------------
--2024-08-11 21:05:58.055775 UTC---
| Itration            | 464       |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -1.28e+06 |
| Running Env Steps   | 2320000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 464       |
-----------------------------------
--2024-08-11 21:07:29.771252 UTC---
| Itration            | 465       |
| Real Det Return     | 4.84e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 2325000   |
| Running Forward KL  | 5.18      |
| Running Reverse KL  | 19.3      |
| Running Update Time | 465       |
-----------------------------------
--2024-08-11 21:09:02.720971 UTC---
| Itration            | 466       |
| Real Det Return     | 4.83e+03  |
| Real Sto Return     | 4.39e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 2330000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 38.6      |
| Running Update Time | 466       |
-----------------------------------
--2024-08-11 21:10:35.246088 UTC---
| Itration            | 467       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -5.48e+05 |
| Running Env Steps   | 2335000   |
| Running Forward KL  | 5.28      |
| Running Reverse KL  | 60.8      |
| Running Update Time | 467       |
-----------------------------------
--2024-08-11 21:12:09.155022 UTC---
| Itration            | 468       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 2340000   |
| Running Forward KL  | 4.97      |
| Running Reverse KL  | 2.09      |
| Running Update Time | 468       |
-----------------------------------
--2024-08-11 21:13:43.525435 UTC---
| Itration            | 469       |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -6.23e+05 |
| Running Env Steps   | 2345000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 22.9      |
| Running Update Time | 469       |
-----------------------------------
--2024-08-11 21:15:15.454142 UTC---
| Itration            | 470       |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 2350000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 27.2      |
| Running Update Time | 470       |
-----------------------------------
--2024-08-11 21:16:45.375539 UTC---
| Itration            | 471       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -5.42e+05 |
| Running Env Steps   | 2355000   |
| Running Forward KL  | 4.88      |
| Running Reverse KL  | 28.5      |
| Running Update Time | 471       |
-----------------------------------
--2024-08-11 21:18:22.151168 UTC---
| Itration            | 472       |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -6.56e+05 |
| Running Env Steps   | 2360000   |
| Running Forward KL  | 4.53      |
| Running Reverse KL  | 1.73      |
| Running Update Time | 472       |
-----------------------------------
--2024-08-11 21:19:55.020695 UTC---
| Itration            | 473       |
| Real Det Return     | 4.83e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -8.73e+05 |
| Running Env Steps   | 2365000   |
| Running Forward KL  | 4.87      |
| Running Reverse KL  | 2.22      |
| Running Update Time | 473       |
-----------------------------------
--2024-08-11 21:21:25.326462 UTC---
| Itration            | 474       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -9.58e+05 |
| Running Env Steps   | 2370000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 40        |
| Running Update Time | 474       |
-----------------------------------
--2024-08-11 21:22:59.487713 UTC---
| Itration            | 475       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -9.57e+05 |
| Running Env Steps   | 2375000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 44.9      |
| Running Update Time | 475       |
-----------------------------------
--2024-08-11 21:24:12.272721 UTC---
| Itration            | 476       |
| Real Det Return     | 4.69e+03  |
| Real Sto Return     | 763       |
| Reward Loss         | -1.78e+07 |
| Running Env Steps   | 2380000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 390       |
| Running Update Time | 476       |
-----------------------------------
--2024-08-11 21:25:47.799849 UTC---
| Itration            | 477       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 2385000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 9.56      |
| Running Update Time | 477       |
-----------------------------------
--2024-08-11 21:27:15.507990 UTC---
| Itration            | 478       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 3.39e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 2390000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 33.4      |
| Running Update Time | 478       |
-----------------------------------
--2024-08-11 21:28:47.160905 UTC---
| Itration            | 479       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 2395000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 10.4      |
| Running Update Time | 479       |
-----------------------------------
--2024-08-11 21:30:23.367189 UTC---
| Itration            | 480       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -6.76e+05 |
| Running Env Steps   | 2400000   |
| Running Forward KL  | 5.38      |
| Running Reverse KL  | 48.5      |
| Running Update Time | 480       |
-----------------------------------
--2024-08-11 21:31:53.956989 UTC---
| Itration            | 481       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -7.03e+05 |
| Running Env Steps   | 2405000   |
| Running Forward KL  | 5.28      |
| Running Reverse KL  | 16.7      |
| Running Update Time | 481       |
-----------------------------------
--2024-08-11 21:33:25.288842 UTC---
| Itration            | 482       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 2410000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 77        |
| Running Update Time | 482       |
-----------------------------------
--2024-08-11 21:34:57.318135 UTC---
| Itration            | 483       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -6.48e+05 |
| Running Env Steps   | 2415000   |
| Running Forward KL  | 4.81      |
| Running Reverse KL  | 9.3       |
| Running Update Time | 483       |
-----------------------------------
--2024-08-11 21:36:26.020738 UTC---
| Itration            | 484       |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -2.68e+06 |
| Running Env Steps   | 2420000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 169       |
| Running Update Time | 484       |
-----------------------------------
--2024-08-11 21:38:01.997418 UTC---
| Itration            | 485       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -8.78e+05 |
| Running Env Steps   | 2425000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 2.49      |
| Running Update Time | 485       |
-----------------------------------
--2024-08-11 21:39:34.853976 UTC---
| Itration            | 486       |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2430000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 13.7      |
| Running Update Time | 486       |
-----------------------------------
--2024-08-11 21:41:06.105538 UTC---
| Itration            | 487       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -2.11e+06 |
| Running Env Steps   | 2435000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 42.6      |
| Running Update Time | 487       |
-----------------------------------
--2024-08-11 21:42:42.498396 UTC---
| Itration            | 488       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 2440000   |
| Running Forward KL  | 5.04      |
| Running Reverse KL  | 2.5       |
| Running Update Time | 488       |
-----------------------------------
--2024-08-11 21:44:13.738827 UTC---
| Itration            | 489       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -7.41e+05 |
| Running Env Steps   | 2445000   |
| Running Forward KL  | 4.94      |
| Running Reverse KL  | 6.26      |
| Running Update Time | 489       |
-----------------------------------
--2024-08-11 21:45:46.291116 UTC---
| Itration            | 490       |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -1.64e+06 |
| Running Env Steps   | 2450000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 490       |
-----------------------------------
--2024-08-11 21:47:22.252233 UTC---
| Itration            | 491       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -2.53e+06 |
| Running Env Steps   | 2455000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 55.7      |
| Running Update Time | 491       |
-----------------------------------
--2024-08-11 21:48:55.003031 UTC---
| Itration            | 492       |
| Real Det Return     | 4.93e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 2460000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 2.49      |
| Running Update Time | 492       |
-----------------------------------
--2024-08-11 21:50:28.304051 UTC---
| Itration            | 493       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -6.98e+05 |
| Running Env Steps   | 2465000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 493       |
-----------------------------------
--2024-08-11 21:52:03.243420 UTC---
| Itration            | 494       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.52e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 2470000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 34.7      |
| Running Update Time | 494       |
-----------------------------------
--2024-08-11 21:53:35.887111 UTC---
| Itration            | 495       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -8.82e+05 |
| Running Env Steps   | 2475000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 1.97      |
| Running Update Time | 495       |
-----------------------------------
--2024-08-11 21:55:06.978381 UTC--
| Itration            | 496      |
| Real Det Return     | 4.45e+03 |
| Real Sto Return     | 4.3e+03  |
| Reward Loss         | -6.2e+06 |
| Running Env Steps   | 2480000  |
| Running Forward KL  | 6.18     |
| Running Reverse KL  | 127      |
| Running Update Time | 496      |
----------------------------------
--2024-08-11 21:56:42.627651 UTC---
| Itration            | 497       |
| Real Det Return     | 4.96e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -1.57e+06 |
| Running Env Steps   | 2485000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 2.32      |
| Running Update Time | 497       |
-----------------------------------
--2024-08-11 21:58:13.609725 UTC---
| Itration            | 498       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2490000   |
| Running Forward KL  | 4.74      |
| Running Reverse KL  | 2.34      |
| Running Update Time | 498       |
-----------------------------------
--2024-08-11 21:59:48.664916 UTC---
| Itration            | 499       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -6.65e+05 |
| Running Env Steps   | 2495000   |
| Running Forward KL  | 4.19      |
| Running Reverse KL  | 1.67      |
| Running Update Time | 499       |
-----------------------------------
--2024-08-11 22:01:21.442516 UTC---
| Itration            | 500       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2500000   |
| Running Forward KL  | 4.96      |
| Running Reverse KL  | 1.82      |
| Running Update Time | 500       |
-----------------------------------
--2024-08-11 22:02:53.998541 UTC---
| Itration            | 501       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -2.24e+06 |
| Running Env Steps   | 2505000   |
| Running Forward KL  | 5.42      |
| Running Reverse KL  | 33.7      |
| Running Update Time | 501       |
-----------------------------------
--2024-08-11 22:04:30.163529 UTC---
| Itration            | 502       |
| Real Det Return     | 4.85e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -1.64e+06 |
| Running Env Steps   | 2510000   |
| Running Forward KL  | 4.63      |
| Running Reverse KL  | 1.97      |
| Running Update Time | 502       |
-----------------------------------
--2024-08-11 22:06:03.160428 UTC---
| Itration            | 503       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -8.76e+05 |
| Running Env Steps   | 2515000   |
| Running Forward KL  | 4.53      |
| Running Reverse KL  | 1.77      |
| Running Update Time | 503       |
-----------------------------------
--2024-08-11 22:07:34.351656 UTC---
| Itration            | 504       |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 2520000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 2.36      |
| Running Update Time | 504       |
-----------------------------------
--2024-08-11 22:09:10.049955 UTC---
| Itration            | 505       |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -9.36e+05 |
| Running Env Steps   | 2525000   |
| Running Forward KL  | 5.35      |
| Running Reverse KL  | 31.7      |
| Running Update Time | 505       |
-----------------------------------
--2024-08-11 22:10:39.681452 UTC---
| Itration            | 506       |
| Real Det Return     | 4.57e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 2530000   |
| Running Forward KL  | 5.11      |
| Running Reverse KL  | 35.9      |
| Running Update Time | 506       |
-----------------------------------
--2024-08-11 22:12:12.717727 UTC---
| Itration            | 507       |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 2535000   |
| Running Forward KL  | 4.8       |
| Running Reverse KL  | 32.8      |
| Running Update Time | 507       |
-----------------------------------
--2024-08-11 22:13:56.971546 UTC---
| Itration            | 508       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -8.73e+05 |
| Running Env Steps   | 2540000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 1.95      |
| Running Update Time | 508       |
-----------------------------------
--2024-08-11 22:15:41.476061 UTC--
| Itration            | 509      |
| Real Det Return     | 5.24e+03 |
| Real Sto Return     | 4.18e+03 |
| Reward Loss         | -6.4e+05 |
| Running Env Steps   | 2545000  |
| Running Forward KL  | 4.79     |
| Running Reverse KL  | 35.6     |
| Running Update Time | 509      |
----------------------------------
--2024-08-11 22:17:28.323948 UTC--
| Itration            | 510      |
| Real Det Return     | 5.08e+03 |
| Real Sto Return     | 4.73e+03 |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 2550000  |
| Running Forward KL  | 5.26     |
| Running Reverse KL  | 34.1     |
| Running Update Time | 510      |
----------------------------------
--2024-08-11 22:19:15.777009 UTC---
| Itration            | 511       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 2555000   |
| Running Forward KL  | 4.21      |
| Running Reverse KL  | 1.69      |
| Running Update Time | 511       |
-----------------------------------
--2024-08-11 22:21:01.579294 UTC---
| Itration            | 512       |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -6.67e+05 |
| Running Env Steps   | 2560000   |
| Running Forward KL  | 4.5       |
| Running Reverse KL  | 2.05      |
| Running Update Time | 512       |
-----------------------------------
--2024-08-11 22:22:50.278067 UTC---
| Itration            | 513       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -9.44e+05 |
| Running Env Steps   | 2565000   |
| Running Forward KL  | 4.85      |
| Running Reverse KL  | 26.8      |
| Running Update Time | 513       |
-----------------------------------
--2024-08-11 22:24:39.311649 UTC--
| Itration            | 514      |
| Real Det Return     | 5e+03    |
| Real Sto Return     | 4.89e+03 |
| Reward Loss         | -1.4e+06 |
| Running Env Steps   | 2570000  |
| Running Forward KL  | 4.56     |
| Running Reverse KL  | 1.73     |
| Running Update Time | 514      |
----------------------------------
--2024-08-11 22:26:26.468565 UTC---
| Itration            | 515       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 2575000   |
| Running Forward KL  | 4.4       |
| Running Reverse KL  | 2.57      |
| Running Update Time | 515       |
-----------------------------------
--2024-08-11 22:28:14.844795 UTC---
| Itration            | 516       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -8.27e+05 |
| Running Env Steps   | 2580000   |
| Running Forward KL  | 4.83      |
| Running Reverse KL  | 1.93      |
| Running Update Time | 516       |
-----------------------------------
--2024-08-11 22:30:02.590310 UTC--
| Itration            | 517      |
| Real Det Return     | 4.65e+03 |
| Real Sto Return     | 4.95e+03 |
| Reward Loss         | -7.6e+05 |
| Running Env Steps   | 2585000  |
| Running Forward KL  | 4.42     |
| Running Reverse KL  | 1.92     |
| Running Update Time | 517      |
----------------------------------
--2024-08-11 22:31:51.324672 UTC---
| Itration            | 518       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 2590000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.33      |
| Running Update Time | 518       |
-----------------------------------
--2024-08-11 22:33:41.461086 UTC---
| Itration            | 519       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -9.42e+05 |
| Running Env Steps   | 2595000   |
| Running Forward KL  | 4.82      |
| Running Reverse KL  | 2.4       |
| Running Update Time | 519       |
-----------------------------------
--2024-08-11 22:35:28.047171 UTC---
| Itration            | 520       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -4.42e+05 |
| Running Env Steps   | 2600000   |
| Running Forward KL  | 4.68      |
| Running Reverse KL  | 14.4      |
| Running Update Time | 520       |
-----------------------------------
--2024-08-11 22:37:15.397297 UTC---
| Itration            | 521       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -7.77e+05 |
| Running Env Steps   | 2605000   |
| Running Forward KL  | 4.57      |
| Running Reverse KL  | 2.07      |
| Running Update Time | 521       |
-----------------------------------
--2024-08-11 22:39:07.189236 UTC---
| Itration            | 522       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -7.35e+05 |
| Running Env Steps   | 2610000   |
| Running Forward KL  | 4.8       |
| Running Reverse KL  | 2.12      |
| Running Update Time | 522       |
-----------------------------------
--2024-08-11 22:40:52.409850 UTC--
| Itration            | 523      |
| Real Det Return     | 5.21e+03 |
| Real Sto Return     | 4.06e+03 |
| Reward Loss         | -7.9e+05 |
| Running Env Steps   | 2615000  |
| Running Forward KL  | 5.07     |
| Running Reverse KL  | 12.3     |
| Running Update Time | 523      |
----------------------------------
--2024-08-11 22:42:41.895222 UTC---
| Itration            | 524       |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -5.98e+05 |
| Running Env Steps   | 2620000   |
| Running Forward KL  | 4.13      |
| Running Reverse KL  | 1.75      |
| Running Update Time | 524       |
-----------------------------------
--2024-08-11 22:44:34.284349 UTC---
| Itration            | 525       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -5.47e+05 |
| Running Env Steps   | 2625000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 2.77      |
| Running Update Time | 525       |
-----------------------------------
--2024-08-11 22:46:23.074409 UTC---
| Itration            | 526       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -5.76e+05 |
| Running Env Steps   | 2630000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 7.02      |
| Running Update Time | 526       |
-----------------------------------
--2024-08-11 22:48:13.859685 UTC---
| Itration            | 527       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2635000   |
| Running Forward KL  | 5.19      |
| Running Reverse KL  | 45.8      |
| Running Update Time | 527       |
-----------------------------------
--2024-08-11 22:50:05.382914 UTC---
| Itration            | 528       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 2640000   |
| Running Forward KL  | 5.09      |
| Running Reverse KL  | 2.76      |
| Running Update Time | 528       |
-----------------------------------
--2024-08-11 22:51:54.238292 UTC--
| Itration            | 529      |
| Real Det Return     | 5.13e+03 |
| Real Sto Return     | 4.83e+03 |
| Reward Loss         | -9.1e+05 |
| Running Env Steps   | 2645000  |
| Running Forward KL  | 4.94     |
| Running Reverse KL  | 2.17     |
| Running Update Time | 529      |
----------------------------------
--2024-08-11 22:53:47.188664 UTC---
| Itration            | 530       |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -9.71e+05 |
| Running Env Steps   | 2650000   |
| Running Forward KL  | 5.21      |
| Running Reverse KL  | 2.8       |
| Running Update Time | 530       |
-----------------------------------
--2024-08-11 22:55:36.871674 UTC---
| Itration            | 531       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 2655000   |
| Running Forward KL  | 5.48      |
| Running Reverse KL  | 8.53      |
| Running Update Time | 531       |
-----------------------------------
--2024-08-11 22:57:25.701876 UTC---
| Itration            | 532       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 2660000   |
| Running Forward KL  | 5.21      |
| Running Reverse KL  | 36.4      |
| Running Update Time | 532       |
-----------------------------------
--2024-08-11 22:59:17.374225 UTC---
| Itration            | 533       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -8.45e+05 |
| Running Env Steps   | 2665000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 533       |
-----------------------------------
--2024-08-11 23:01:05.344072 UTC---
| Itration            | 534       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -7.07e+05 |
| Running Env Steps   | 2670000   |
| Running Forward KL  | 4.64      |
| Running Reverse KL  | 2.39      |
| Running Update Time | 534       |
-----------------------------------
--2024-08-11 23:02:54.311896 UTC---
| Itration            | 535       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.93e+06 |
| Running Env Steps   | 2675000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 37.4      |
| Running Update Time | 535       |
-----------------------------------
--2024-08-11 23:04:49.009970 UTC---
| Itration            | 536       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -8.97e+05 |
| Running Env Steps   | 2680000   |
| Running Forward KL  | 4.72      |
| Running Reverse KL  | 2.46      |
| Running Update Time | 536       |
-----------------------------------
--2024-08-11 23:06:39.931574 UTC---
| Itration            | 537       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -7.36e+05 |
| Running Env Steps   | 2685000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 537       |
-----------------------------------
--2024-08-11 23:08:28.928953 UTC---
| Itration            | 538       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -5.09e+05 |
| Running Env Steps   | 2690000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 2.06      |
| Running Update Time | 538       |
-----------------------------------
--2024-08-11 23:10:20.032517 UTC--
| Itration            | 539      |
| Real Det Return     | 5.32e+03 |
| Real Sto Return     | 4.61e+03 |
| Reward Loss         | -8.5e+05 |
| Running Env Steps   | 2695000  |
| Running Forward KL  | 5.51     |
| Running Reverse KL  | 47.3     |
| Running Update Time | 539      |
----------------------------------
--2024-08-11 23:12:08.675704 UTC---
| Itration            | 540       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 2700000   |
| Running Forward KL  | 5.06      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 540       |
-----------------------------------
--2024-08-11 23:14:00.624243 UTC---
| Itration            | 541       |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -8.23e+05 |
| Running Env Steps   | 2705000   |
| Running Forward KL  | 4.36      |
| Running Reverse KL  | 2.25      |
| Running Update Time | 541       |
-----------------------------------
--2024-08-11 23:15:52.270220 UTC---
| Itration            | 542       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -8.04e+05 |
| Running Env Steps   | 2710000   |
| Running Forward KL  | 4.33      |
| Running Reverse KL  | 2.16      |
| Running Update Time | 542       |
-----------------------------------
--2024-08-11 23:17:30.217916 UTC---
| Itration            | 543       |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 2715000   |
| Running Forward KL  | 4.89      |
| Running Reverse KL  | 2.48      |
| Running Update Time | 543       |
-----------------------------------
--2024-08-11 23:19:04.895146 UTC---
| Itration            | 544       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -8.37e+05 |
| Running Env Steps   | 2720000   |
| Running Forward KL  | 4.72      |
| Running Reverse KL  | 2.11      |
| Running Update Time | 544       |
-----------------------------------
--2024-08-11 23:20:35.712463 UTC---
| Itration            | 545       |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 2725000   |
| Running Forward KL  | 4.86      |
| Running Reverse KL  | 22.5      |
| Running Update Time | 545       |
-----------------------------------
--2024-08-11 23:22:08.659938 UTC---
| Itration            | 546       |
| Real Det Return     | 4.92e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -3.89e+06 |
| Running Env Steps   | 2730000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 43.6      |
| Running Update Time | 546       |
-----------------------------------
--2024-08-11 23:23:44.799154 UTC---
| Itration            | 547       |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -8.45e+05 |
| Running Env Steps   | 2735000   |
| Running Forward KL  | 5.02      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 547       |
-----------------------------------
--2024-08-11 23:25:17.140269 UTC--
| Itration            | 548      |
| Real Det Return     | 5.26e+03 |
| Real Sto Return     | 5.07e+03 |
| Reward Loss         | -7.1e+05 |
| Running Env Steps   | 2740000  |
| Running Forward KL  | 4.56     |
| Running Reverse KL  | 1.73     |
| Running Update Time | 548      |
----------------------------------
--2024-08-11 23:26:48.779091 UTC---
| Itration            | 549       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -7.72e+05 |
| Running Env Steps   | 2745000   |
| Running Forward KL  | 5.03      |
| Running Reverse KL  | 2.65      |
| Running Update Time | 549       |
-----------------------------------
--2024-08-11 23:28:24.764937 UTC--
| Itration            | 550      |
| Real Det Return     | 5.33e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | -4.8e+05 |
| Running Env Steps   | 2750000  |
| Running Forward KL  | 5.06     |
| Running Reverse KL  | 2.99     |
| Running Update Time | 550      |
----------------------------------
--2024-08-11 23:29:56.820217 UTC---
| Itration            | 551       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -4.42e+05 |
| Running Env Steps   | 2755000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 2.85      |
| Running Update Time | 551       |
-----------------------------------
--2024-08-11 23:31:29.408259 UTC--
| Itration            | 552      |
| Real Det Return     | 5.25e+03 |
| Real Sto Return     | 5.16e+03 |
| Reward Loss         | -4.3e+05 |
| Running Env Steps   | 2760000  |
| Running Forward KL  | 4.72     |
| Running Reverse KL  | 8.57     |
| Running Update Time | 552      |
----------------------------------
--2024-08-11 23:33:03.476671 UTC---
| Itration            | 553       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -5.94e+05 |
| Running Env Steps   | 2765000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 18.8      |
| Running Update Time | 553       |
-----------------------------------
--2024-08-11 23:34:34.470405 UTC---
| Itration            | 554       |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 2770000   |
| Running Forward KL  | 4.99      |
| Running Reverse KL  | 37.7      |
| Running Update Time | 554       |
-----------------------------------
--2024-08-11 23:36:09.831534 UTC---
| Itration            | 555       |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -9.91e+05 |
| Running Env Steps   | 2775000   |
| Running Forward KL  | 5.19      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 555       |
-----------------------------------
--2024-08-11 23:37:30.683040 UTC---
| Itration            | 556       |
| Real Det Return     | 4.53e+03  |
| Real Sto Return     | 2.09e+03  |
| Reward Loss         | -2.22e+06 |
| Running Env Steps   | 2780000   |
| Running Forward KL  | 7.71      |
| Running Reverse KL  | 206       |
| Running Update Time | 556       |
-----------------------------------
--2024-08-11 23:39:02.578630 UTC---
| Itration            | 557       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -8.79e+05 |
| Running Env Steps   | 2785000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 3.75      |
| Running Update Time | 557       |
-----------------------------------
--2024-08-11 23:40:36.946709 UTC---
| Itration            | 558       |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -4.33e+05 |
| Running Env Steps   | 2790000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 2.46      |
| Running Update Time | 558       |
-----------------------------------
--2024-08-11 23:42:07.101367 UTC---
| Itration            | 559       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.53e+05 |
| Running Env Steps   | 2795000   |
| Running Forward KL  | 5.06      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 559       |
-----------------------------------
--2024-08-11 23:43:40.319127 UTC--
| Itration            | 560      |
| Real Det Return     | 5.15e+03 |
| Real Sto Return     | 4.52e+03 |
| Reward Loss         | -9.6e+05 |
| Running Env Steps   | 2800000  |
| Running Forward KL  | 5.03     |
| Running Reverse KL  | 10.7     |
| Running Update Time | 560      |
----------------------------------
--2024-08-11 23:45:18.190653 UTC---
| Itration            | 561       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -9.48e+05 |
| Running Env Steps   | 2805000   |
| Running Forward KL  | 5.13      |
| Running Reverse KL  | 2.62      |
| Running Update Time | 561       |
-----------------------------------
--2024-08-11 23:46:46.465334 UTC---
| Itration            | 562       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -2.52e+06 |
| Running Env Steps   | 2810000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 115       |
| Running Update Time | 562       |
-----------------------------------
--2024-08-11 23:48:20.485854 UTC---
| Itration            | 563       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 2815000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 563       |
-----------------------------------
--2024-08-11 23:49:53.540548 UTC---
| Itration            | 564       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -4.48e+05 |
| Running Env Steps   | 2820000   |
| Running Forward KL  | 4.99      |
| Running Reverse KL  | 2.36      |
| Running Update Time | 564       |
-----------------------------------
--2024-08-11 23:51:25.242092 UTC---
| Itration            | 565       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -7.19e+05 |
| Running Env Steps   | 2825000   |
| Running Forward KL  | 4.86      |
| Running Reverse KL  | 2.26      |
| Running Update Time | 565       |
-----------------------------------
--2024-08-11 23:53:00.949314 UTC---
| Itration            | 566       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -9.08e+05 |
| Running Env Steps   | 2830000   |
| Running Forward KL  | 4.64      |
| Running Reverse KL  | 2.16      |
| Running Update Time | 566       |
-----------------------------------
--2024-08-11 23:54:33.415013 UTC---
| Itration            | 567       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -7.44e+05 |
| Running Env Steps   | 2835000   |
| Running Forward KL  | 5.13      |
| Running Reverse KL  | 2.43      |
| Running Update Time | 567       |
-----------------------------------
--2024-08-11 23:56:06.082581 UTC---
| Itration            | 568       |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -2.15e+06 |
| Running Env Steps   | 2840000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 64.9      |
| Running Update Time | 568       |
-----------------------------------
--2024-08-11 23:57:41.804090 UTC---
| Itration            | 569       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -7.17e+05 |
| Running Env Steps   | 2845000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 2.9       |
| Running Update Time | 569       |
-----------------------------------
--2024-08-11 23:59:11.984634 UTC--
| Itration            | 570      |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.05e+03 |
| Reward Loss         | -4.2e+05 |
| Running Env Steps   | 2850000  |
| Running Forward KL  | 5.12     |
| Running Reverse KL  | 15.2     |
| Running Update Time | 570      |
----------------------------------
--2024-08-12 00:00:45.598654 UTC---
| Itration            | 571       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -3.98e+05 |
| Running Env Steps   | 2855000   |
| Running Forward KL  | 4.99      |
| Running Reverse KL  | 3.44      |
| Running Update Time | 571       |
-----------------------------------
--2024-08-12 00:02:21.472717 UTC---
| Itration            | 572       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -4.53e+05 |
| Running Env Steps   | 2860000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 36.2      |
| Running Update Time | 572       |
-----------------------------------
--2024-08-12 00:03:54.990115 UTC---
| Itration            | 573       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -4.86e+05 |
| Running Env Steps   | 2865000   |
| Running Forward KL  | 4.64      |
| Running Reverse KL  | 2.01      |
| Running Update Time | 573       |
-----------------------------------
--2024-08-12 00:05:30.024754 UTC---
| Itration            | 574       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -5.37e+05 |
| Running Env Steps   | 2870000   |
| Running Forward KL  | 4.62      |
| Running Reverse KL  | 16.7      |
| Running Update Time | 574       |
-----------------------------------
--2024-08-12 00:07:03.166768 UTC---
| Itration            | 575       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -5.14e+05 |
| Running Env Steps   | 2875000   |
| Running Forward KL  | 4.55      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 575       |
-----------------------------------
--2024-08-12 00:08:33.946180 UTC---
| Itration            | 576       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -8.47e+05 |
| Running Env Steps   | 2880000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 26.7      |
| Running Update Time | 576       |
-----------------------------------
--2024-08-12 00:10:08.411222 UTC---
| Itration            | 577       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 2885000   |
| Running Forward KL  | 5.06      |
| Running Reverse KL  | 27.6      |
| Running Update Time | 577       |
-----------------------------------
--2024-08-12 00:11:40.882841 UTC---
| Itration            | 578       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -5.22e+05 |
| Running Env Steps   | 2890000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 2.32      |
| Running Update Time | 578       |
-----------------------------------
--2024-08-12 00:13:12.574453 UTC---
| Itration            | 579       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -9.47e+05 |
| Running Env Steps   | 2895000   |
| Running Forward KL  | 4.66      |
| Running Reverse KL  | 2.08      |
| Running Update Time | 579       |
-----------------------------------
--2024-08-12 00:14:49.293697 UTC---
| Itration            | 580       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.57e+05 |
| Running Env Steps   | 2900000   |
| Running Forward KL  | 4.96      |
| Running Reverse KL  | 2.26      |
| Running Update Time | 580       |
-----------------------------------
--2024-08-12 00:16:21.552883 UTC---
| Itration            | 581       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -5.92e+05 |
| Running Env Steps   | 2905000   |
| Running Forward KL  | 5.12      |
| Running Reverse KL  | 2.84      |
| Running Update Time | 581       |
-----------------------------------
--2024-08-12 00:17:54.767626 UTC---
| Itration            | 582       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -7.09e+05 |
| Running Env Steps   | 2910000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 2.43      |
| Running Update Time | 582       |
-----------------------------------
--2024-08-12 00:19:31.349276 UTC---
| Itration            | 583       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -9.14e+05 |
| Running Env Steps   | 2915000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 583       |
-----------------------------------
--2024-08-12 00:21:03.386418 UTC---
| Itration            | 584       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2920000   |
| Running Forward KL  | 4.54      |
| Running Reverse KL  | 2.09      |
| Running Update Time | 584       |
-----------------------------------
--2024-08-12 00:22:37.312749 UTC---
| Itration            | 585       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -4.27e+05 |
| Running Env Steps   | 2925000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 9.13      |
| Running Update Time | 585       |
-----------------------------------
--2024-08-12 00:24:12.194087 UTC--
| Itration            | 586      |
| Real Det Return     | 5.13e+03 |
| Real Sto Return     | 4.83e+03 |
| Reward Loss         | -1e+06   |
| Running Env Steps   | 2930000  |
| Running Forward KL  | 4.97     |
| Running Reverse KL  | 2.25     |
| Running Update Time | 586      |
----------------------------------
--2024-08-12 00:25:42.064262 UTC---
| Itration            | 587       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 2935000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 36.6      |
| Running Update Time | 587       |
-----------------------------------
--2024-08-12 00:27:17.423324 UTC---
| Itration            | 588       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 2940000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 47.2      |
| Running Update Time | 588       |
-----------------------------------
--2024-08-12 00:28:49.635179 UTC---
| Itration            | 589       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -8.99e+05 |
| Running Env Steps   | 2945000   |
| Running Forward KL  | 5.08      |
| Running Reverse KL  | 31.1      |
| Running Update Time | 589       |
-----------------------------------
--2024-08-12 00:30:22.172110 UTC--
| Itration            | 590      |
| Real Det Return     | 5.1e+03  |
| Real Sto Return     | 5.07e+03 |
| Reward Loss         | -6.4e+05 |
| Running Env Steps   | 2950000  |
| Running Forward KL  | 5.03     |
| Running Reverse KL  | 2.54     |
| Running Update Time | 590      |
----------------------------------
--2024-08-12 00:31:58.766615 UTC---
| Itration            | 591       |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -7.92e+05 |
| Running Env Steps   | 2955000   |
| Running Forward KL  | 4.89      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 591       |
-----------------------------------
--2024-08-12 00:33:30.883164 UTC---
| Itration            | 592       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -3.52e+05 |
| Running Env Steps   | 2960000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 592       |
-----------------------------------
--2024-08-12 00:35:03.666863 UTC---
| Itration            | 593       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -8.96e+05 |
| Running Env Steps   | 2965000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 3.33      |
| Running Update Time | 593       |
-----------------------------------
--2024-08-12 00:36:37.087688 UTC--
| Itration            | 594      |
| Real Det Return     | 5.25e+03 |
| Real Sto Return     | 4.61e+03 |
| Reward Loss         | -3.7e+05 |
| Running Env Steps   | 2970000  |
| Running Forward KL  | 5.54     |
| Running Reverse KL  | 2.92     |
| Running Update Time | 594      |
----------------------------------
--2024-08-12 00:38:08.635892 UTC---
| Itration            | 595       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -8.48e+05 |
| Running Env Steps   | 2975000   |
| Running Forward KL  | 5.15      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 595       |
-----------------------------------
--2024-08-12 00:39:41.442406 UTC--
| Itration            | 596      |
| Real Det Return     | 5.26e+03 |
| Real Sto Return     | 5.21e+03 |
| Reward Loss         | -5.1e+05 |
| Running Env Steps   | 2980000  |
| Running Forward KL  | 5.4      |
| Running Reverse KL  | 2.69     |
| Running Update Time | 596      |
----------------------------------
--2024-08-12 00:41:17.687042 UTC---
| Itration            | 597       |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -9.57e+05 |
| Running Env Steps   | 2985000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 3.86      |
| Running Update Time | 597       |
-----------------------------------
--2024-08-12 00:42:45.675127 UTC---
| Itration            | 598       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -8.82e+05 |
| Running Env Steps   | 2990000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 598       |
-----------------------------------
--2024-08-12 00:44:19.770905 UTC--
| Itration            | 599      |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.36e+03 |
| Reward Loss         | 2.39e+04 |
| Running Env Steps   | 2995000  |
| Running Forward KL  | 5.31     |
| Running Reverse KL  | 2.95     |
| Running Update Time | 599      |
----------------------------------
--2024-08-12 00:45:54.072374 UTC---
| Itration            | 600       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -6.16e+05 |
| Running Env Steps   | 3000000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 600       |
-----------------------------------
--2024-08-12 00:47:26.135962 UTC---
| Itration            | 601       |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -2.56e+06 |
| Running Env Steps   | 3005000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 79.8      |
| Running Update Time | 601       |
-----------------------------------
--2024-08-12 00:48:59.221720 UTC---
| Itration            | 602       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 3010000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 32.6      |
| Running Update Time | 602       |
-----------------------------------
--2024-08-12 00:50:32.513854 UTC---
| Itration            | 603       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -2.28e+06 |
| Running Env Steps   | 3015000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 38.7      |
| Running Update Time | 603       |
-----------------------------------
--2024-08-12 00:52:04.632176 UTC---
| Itration            | 604       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -4.53e+05 |
| Running Env Steps   | 3020000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 604       |
-----------------------------------
--2024-08-12 00:53:41.408685 UTC---
| Itration            | 605       |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -9.86e+05 |
| Running Env Steps   | 3025000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 2.63      |
| Running Update Time | 605       |
-----------------------------------
--2024-08-12 00:55:13.834668 UTC---
| Itration            | 606       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 3030000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 3.13      |
| Running Update Time | 606       |
-----------------------------------
--2024-08-12 00:56:46.759052 UTC---
| Itration            | 607       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -3.47e+06 |
| Running Env Steps   | 3035000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 66.7      |
| Running Update Time | 607       |
-----------------------------------
--2024-08-12 00:58:22.781876 UTC---
| Itration            | 608       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.67e+05 |
| Running Env Steps   | 3040000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 3.05      |
| Running Update Time | 608       |
-----------------------------------
--2024-08-12 00:59:55.096848 UTC---
| Itration            | 609       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 3045000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 30.9      |
| Running Update Time | 609       |
-----------------------------------
--2024-08-12 01:01:26.122525 UTC--
| Itration            | 610      |
| Real Det Return     | 5.29e+03 |
| Real Sto Return     | 4.79e+03 |
| Reward Loss         | -4.7e+06 |
| Running Env Steps   | 3050000  |
| Running Forward KL  | 6.69     |
| Running Reverse KL  | 105      |
| Running Update Time | 610      |
----------------------------------
--2024-08-12 01:03:01.696181 UTC---
| Itration            | 611       |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.41e+05 |
| Running Env Steps   | 3055000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 611       |
-----------------------------------
--2024-08-12 01:04:32.228291 UTC---
| Itration            | 612       |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 3060000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 22.1      |
| Running Update Time | 612       |
-----------------------------------
--2024-08-12 01:06:06.164649 UTC---
| Itration            | 613       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.16e+06 |
| Running Env Steps   | 3065000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 35.2      |
| Running Update Time | 613       |
-----------------------------------
--2024-08-12 01:07:40.540181 UTC---
| Itration            | 614       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -3.73e+05 |
| Running Env Steps   | 3070000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 614       |
-----------------------------------
--2024-08-12 01:09:12.423148 UTC---
| Itration            | 615       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -8.74e+05 |
| Running Env Steps   | 3075000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 3.16      |
| Running Update Time | 615       |
-----------------------------------
--2024-08-12 01:10:46.879081 UTC---
| Itration            | 616       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 3080000   |
| Running Forward KL  | 5.59      |
| Running Reverse KL  | 31.8      |
| Running Update Time | 616       |
-----------------------------------
--2024-08-12 01:12:18.621184 UTC---
| Itration            | 617       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -8.04e+05 |
| Running Env Steps   | 3085000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.58      |
| Running Update Time | 617       |
-----------------------------------
--2024-08-12 01:13:50.028326 UTC---
| Itration            | 618       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -3.34e+06 |
| Running Env Steps   | 3090000   |
| Running Forward KL  | 6.78      |
| Running Reverse KL  | 101       |
| Running Update Time | 618       |
-----------------------------------
--2024-08-12 01:15:26.285972 UTC---
| Itration            | 619       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.32e+05 |
| Running Env Steps   | 3095000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 2.67      |
| Running Update Time | 619       |
-----------------------------------
--2024-08-12 01:16:58.327363 UTC---
| Itration            | 620       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -2.82e+05 |
| Running Env Steps   | 3100000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 3.09      |
| Running Update Time | 620       |
-----------------------------------
--2024-08-12 01:18:30.568881 UTC---
| Itration            | 621       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 3105000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 4.22      |
| Running Update Time | 621       |
-----------------------------------
--2024-08-12 01:20:05.368672 UTC---
| Itration            | 622       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -1.55e+06 |
| Running Env Steps   | 3110000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 2.96      |
| Running Update Time | 622       |
-----------------------------------
--2024-08-12 01:21:37.884257 UTC---
| Itration            | 623       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -2.07e+06 |
| Running Env Steps   | 3115000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 37.9      |
| Running Update Time | 623       |
-----------------------------------
--2024-08-12 01:23:11.101522 UTC---
| Itration            | 624       |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 3120000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 3.74      |
| Running Update Time | 624       |
-----------------------------------
--2024-08-12 01:24:46.323506 UTC---
| Itration            | 625       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -5.48e+05 |
| Running Env Steps   | 3125000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 34.1      |
| Running Update Time | 625       |
-----------------------------------
--2024-08-12 01:26:15.411135 UTC---
| Itration            | 626       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -1.45e+06 |
| Running Env Steps   | 3130000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 105       |
| Running Update Time | 626       |
-----------------------------------
--2024-08-12 01:27:49.839603 UTC---
| Itration            | 627       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 3135000   |
| Running Forward KL  | 5.21      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 627       |
-----------------------------------
--2024-08-12 01:29:23.503500 UTC---
| Itration            | 628       |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -6.22e+05 |
| Running Env Steps   | 3140000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 3.96      |
| Running Update Time | 628       |
-----------------------------------
--2024-08-12 01:30:57.219494 UTC---
| Itration            | 629       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -7.59e+05 |
| Running Env Steps   | 3145000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 3.39      |
| Running Update Time | 629       |
-----------------------------------
--2024-08-12 01:32:32.189779 UTC---
| Itration            | 630       |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 3150000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 9.54      |
| Running Update Time | 630       |
-----------------------------------
--2024-08-12 01:34:01.764246 UTC---
| Itration            | 631       |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 3.82e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 3155000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 631       |
-----------------------------------
--2024-08-12 01:35:35.338343 UTC---
| Itration            | 632       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -6.18e+05 |
| Running Env Steps   | 3160000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 632       |
-----------------------------------
--2024-08-12 01:37:11.085965 UTC---
| Itration            | 633       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -9.25e+05 |
| Running Env Steps   | 3165000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 39        |
| Running Update Time | 633       |
-----------------------------------
--2024-08-12 01:38:43.565229 UTC---
| Itration            | 634       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -6.85e+05 |
| Running Env Steps   | 3170000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.68      |
| Running Update Time | 634       |
-----------------------------------
--2024-08-12 01:40:13.299271 UTC---
| Itration            | 635       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -5.81e+05 |
| Running Env Steps   | 3175000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 635       |
-----------------------------------
--2024-08-12 01:41:45.142667 UTC---
| Itration            | 636       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 3.87e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 3180000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 59        |
| Running Update Time | 636       |
-----------------------------------
--2024-08-12 01:43:15.777773 UTC---
| Itration            | 637       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -6.69e+05 |
| Running Env Steps   | 3185000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 637       |
-----------------------------------
--2024-08-12 01:44:48.885237 UTC---
| Itration            | 638       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -1.85e+06 |
| Running Env Steps   | 3190000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.04      |
| Running Update Time | 638       |
-----------------------------------
--2024-08-12 01:46:20.332858 UTC---
| Itration            | 639       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -3.74e+05 |
| Running Env Steps   | 3195000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.98      |
| Running Update Time | 639       |
-----------------------------------
--2024-08-12 01:47:52.606336 UTC---
| Itration            | 640       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -4.78e+05 |
| Running Env Steps   | 3200000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 2.81      |
| Running Update Time | 640       |
-----------------------------------
--2024-08-12 01:49:26.586344 UTC---
| Itration            | 641       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -8.91e+05 |
| Running Env Steps   | 3205000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 3.27      |
| Running Update Time | 641       |
-----------------------------------
--2024-08-12 01:50:59.205560 UTC---
| Itration            | 642       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -3.02e+06 |
| Running Env Steps   | 3210000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 65.7      |
| Running Update Time | 642       |
-----------------------------------
--2024-08-12 01:52:31.464153 UTC--
| Itration            | 643      |
| Real Det Return     | 5.28e+03 |
| Real Sto Return     | 5e+03    |
| Reward Loss         | -6.5e+05 |
| Running Env Steps   | 3215000  |
| Running Forward KL  | 5.19     |
| Running Reverse KL  | 2.34     |
| Running Update Time | 643      |
----------------------------------
--2024-08-12 01:54:05.538493 UTC---
| Itration            | 644       |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -8.47e+05 |
| Running Env Steps   | 3220000   |
| Running Forward KL  | 5.18      |
| Running Reverse KL  | 25.4      |
| Running Update Time | 644       |
-----------------------------------
--2024-08-12 01:55:37.792990 UTC---
| Itration            | 645       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.33e+06 |
| Running Env Steps   | 3225000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 39.6      |
| Running Update Time | 645       |
-----------------------------------
--2024-08-12 01:57:10.031189 UTC---
| Itration            | 646       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.17e+03  |
| Reward Loss         | -9.05e+05 |
| Running Env Steps   | 3230000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.32      |
| Running Update Time | 646       |
-----------------------------------
--2024-08-12 01:58:44.625706 UTC---
| Itration            | 647       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -3.58e+05 |
| Running Env Steps   | 3235000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 3.2       |
| Running Update Time | 647       |
-----------------------------------
--2024-08-12 02:00:14.342231 UTC---
| Itration            | 648       |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 3240000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 45.8      |
| Running Update Time | 648       |
-----------------------------------
--2024-08-12 02:01:48.657323 UTC---
| Itration            | 649       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 3245000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 34.6      |
| Running Update Time | 649       |
-----------------------------------
--2024-08-12 02:03:22.935538 UTC---
| Itration            | 650       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -4.57e+05 |
| Running Env Steps   | 3250000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 650       |
-----------------------------------
--2024-08-12 02:04:48.921974 UTC---
| Itration            | 651       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 3.17e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 3255000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 133       |
| Running Update Time | 651       |
-----------------------------------
--2024-08-12 02:06:24.136565 UTC---
| Itration            | 652       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -8.29e+05 |
| Running Env Steps   | 3260000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 2.62      |
| Running Update Time | 652       |
-----------------------------------
--2024-08-12 02:07:55.912473 UTC---
| Itration            | 653       |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -9.91e+05 |
| Running Env Steps   | 3265000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 653       |
-----------------------------------
--2024-08-12 02:09:20.244093 UTC---
| Itration            | 654       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 2.47e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 3270000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 129       |
| Running Update Time | 654       |
-----------------------------------
--2024-08-12 02:10:56.063526 UTC---
| Itration            | 655       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 3275000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 56.9      |
| Running Update Time | 655       |
-----------------------------------
--2024-08-12 02:12:25.031330 UTC---
| Itration            | 656       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -9.34e+05 |
| Running Env Steps   | 3280000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 656       |
-----------------------------------
--2024-08-12 02:13:59.132027 UTC---
| Itration            | 657       |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -5.31e+05 |
| Running Env Steps   | 3285000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 40.3      |
| Running Update Time | 657       |
-----------------------------------
--2024-08-12 02:15:30.873128 UTC---
| Itration            | 658       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -7.42e+05 |
| Running Env Steps   | 3290000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 38.7      |
| Running Update Time | 658       |
-----------------------------------
--2024-08-12 02:17:02.405066 UTC---
| Itration            | 659       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.57e+06 |
| Running Env Steps   | 3295000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 41.6      |
| Running Update Time | 659       |
-----------------------------------
--2024-08-12 02:18:38.603861 UTC---
| Itration            | 660       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -3.64e+05 |
| Running Env Steps   | 3300000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 660       |
-----------------------------------
--2024-08-12 02:20:09.965300 UTC---
| Itration            | 661       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -8.64e+05 |
| Running Env Steps   | 3305000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 3.49      |
| Running Update Time | 661       |
-----------------------------------
--2024-08-12 02:21:42.213279 UTC---
| Itration            | 662       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -3.12e+05 |
| Running Env Steps   | 3310000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.88      |
| Running Update Time | 662       |
-----------------------------------
--2024-08-12 02:23:19.401340 UTC---
| Itration            | 663       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -8.87e+05 |
| Running Env Steps   | 3315000   |
| Running Forward KL  | 5.27      |
| Running Reverse KL  | 2.52      |
| Running Update Time | 663       |
-----------------------------------
--2024-08-12 02:24:52.224288 UTC---
| Itration            | 664       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 3320000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 44.7      |
| Running Update Time | 664       |
-----------------------------------
--2024-08-12 02:26:24.854916 UTC---
| Itration            | 665       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 3325000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 31.6      |
| Running Update Time | 665       |
-----------------------------------
--2024-08-12 02:28:00.744681 UTC---
| Itration            | 666       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.05e+05 |
| Running Env Steps   | 3330000   |
| Running Forward KL  | 6.78      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 666       |
-----------------------------------
--2024-08-12 02:29:32.334076 UTC---
| Itration            | 667       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.69e+05 |
| Running Env Steps   | 3335000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 667       |
-----------------------------------
--2024-08-12 02:31:06.699055 UTC---
| Itration            | 668       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -8.48e+04 |
| Running Env Steps   | 3340000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 3.53      |
| Running Update Time | 668       |
-----------------------------------
--2024-08-12 02:32:41.892505 UTC---
| Itration            | 669       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.86e+05 |
| Running Env Steps   | 3345000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 4.11      |
| Running Update Time | 669       |
-----------------------------------
--2024-08-12 02:34:13.149994 UTC---
| Itration            | 670       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.62e+05 |
| Running Env Steps   | 3350000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 2.89      |
| Running Update Time | 670       |
-----------------------------------
--2024-08-12 02:35:48.086276 UTC--
| Itration            | 671      |
| Real Det Return     | 5.34e+03 |
| Real Sto Return     | 5.27e+03 |
| Reward Loss         | -4.5e+05 |
| Running Env Steps   | 3355000  |
| Running Forward KL  | 6.5      |
| Running Reverse KL  | 4.07     |
| Running Update Time | 671      |
----------------------------------
--2024-08-12 02:37:20.462820 UTC---
| Itration            | 672       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -7.75e+05 |
| Running Env Steps   | 3360000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 36.6      |
| Running Update Time | 672       |
-----------------------------------
--2024-08-12 02:38:55.136101 UTC---
| Itration            | 673       |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -5.52e+05 |
| Running Env Steps   | 3365000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 3.72      |
| Running Update Time | 673       |
-----------------------------------
--2024-08-12 02:40:28.807749 UTC---
| Itration            | 674       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.16e+05 |
| Running Env Steps   | 3370000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 11.7      |
| Running Update Time | 674       |
-----------------------------------
--2024-08-12 02:41:57.303093 UTC---
| Itration            | 675       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 3.99e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 3375000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 85.6      |
| Running Update Time | 675       |
-----------------------------------
--2024-08-12 02:43:29.160876 UTC---
| Itration            | 676       |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.94e+05 |
| Running Env Steps   | 3380000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 3.38      |
| Running Update Time | 676       |
-----------------------------------
--2024-08-12 02:45:04.015366 UTC--
| Itration            | 677      |
| Real Det Return     | 5.49e+03 |
| Real Sto Return     | 5.35e+03 |
| Reward Loss         | -1.7e+05 |
| Running Env Steps   | 3385000  |
| Running Forward KL  | 6.01     |
| Running Reverse KL  | 3.22     |
| Running Update Time | 677      |
----------------------------------
--2024-08-12 02:46:34.967870 UTC--
| Itration            | 678      |
| Real Det Return     | 5.18e+03 |
| Real Sto Return     | 4.92e+03 |
| Reward Loss         | -1.2e+06 |
| Running Env Steps   | 3390000  |
| Running Forward KL  | 6.28     |
| Running Reverse KL  | 23.3     |
| Running Update Time | 678      |
----------------------------------
--2024-08-12 02:48:09.591226 UTC---
| Itration            | 679       |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 3395000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 679       |
-----------------------------------
--2024-08-12 02:49:43.866635 UTC--
| Itration            | 680      |
| Real Det Return     | 5.22e+03 |
| Real Sto Return     | 5.12e+03 |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 3400000  |
| Running Forward KL  | 5.91     |
| Running Reverse KL  | 3.46     |
| Running Update Time | 680      |
----------------------------------
--2024-08-12 02:51:13.564488 UTC---
| Itration            | 681       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -2.96e+06 |
| Running Env Steps   | 3405000   |
| Running Forward KL  | 6.78      |
| Running Reverse KL  | 37.8      |
| Running Update Time | 681       |
-----------------------------------
--2024-08-12 02:52:49.000764 UTC---
| Itration            | 682       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -9.21e+05 |
| Running Env Steps   | 3410000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 682       |
-----------------------------------
--2024-08-12 02:54:21.924910 UTC--
| Itration            | 683      |
| Real Det Return     | 5.26e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -3.8e+05 |
| Running Env Steps   | 3415000  |
| Running Forward KL  | 6.56     |
| Running Reverse KL  | 4.21     |
| Running Update Time | 683      |
----------------------------------
--2024-08-12 02:55:53.265795 UTC---
| Itration            | 684       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -9.37e+05 |
| Running Env Steps   | 3420000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.61      |
| Running Update Time | 684       |
-----------------------------------
--2024-08-12 02:57:28.378748 UTC---
| Itration            | 685       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -4.12e+05 |
| Running Env Steps   | 3425000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 685       |
-----------------------------------
--2024-08-12 02:59:01.249348 UTC---
| Itration            | 686       |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 3430000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 686       |
-----------------------------------
--2024-08-12 03:00:33.497140 UTC---
| Itration            | 687       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 3435000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 20.4      |
| Running Update Time | 687       |
-----------------------------------
--2024-08-12 03:02:09.656398 UTC---
| Itration            | 688       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.94e+05 |
| Running Env Steps   | 3440000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.57      |
| Running Update Time | 688       |
-----------------------------------
--2024-08-12 03:03:41.964804 UTC---
| Itration            | 689       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -5.62e+05 |
| Running Env Steps   | 3445000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 2.73      |
| Running Update Time | 689       |
-----------------------------------
--2024-08-12 03:05:13.169838 UTC---
| Itration            | 690       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -7.55e+05 |
| Running Env Steps   | 3450000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 690       |
-----------------------------------
--2024-08-12 03:06:47.974970 UTC---
| Itration            | 691       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -5.19e+05 |
| Running Env Steps   | 3455000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 691       |
-----------------------------------
--2024-08-12 03:08:19.872976 UTC--
| Itration            | 692      |
| Real Det Return     | 5.43e+03 |
| Real Sto Return     | 5.36e+03 |
| Reward Loss         | -1e+05   |
| Running Env Steps   | 3460000  |
| Running Forward KL  | 6.46     |
| Running Reverse KL  | 3.28     |
| Running Update Time | 692      |
----------------------------------
--2024-08-12 03:09:50.583863 UTC---
| Itration            | 693       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -7.66e+05 |
| Running Env Steps   | 3465000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 29.9      |
| Running Update Time | 693       |
-----------------------------------
--2024-08-12 03:11:24.765945 UTC---
| Itration            | 694       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -5.24e+05 |
| Running Env Steps   | 3470000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 2.18      |
| Running Update Time | 694       |
-----------------------------------
--2024-08-12 03:12:57.478738 UTC---
| Itration            | 695       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -3.47e+05 |
| Running Env Steps   | 3475000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 695       |
-----------------------------------
--2024-08-12 03:14:33.361193 UTC---
| Itration            | 696       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -6.06e+05 |
| Running Env Steps   | 3480000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 696       |
-----------------------------------
--2024-08-12 03:16:06.011100 UTC---
| Itration            | 697       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -6.15e+05 |
| Running Env Steps   | 3485000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 4.18      |
| Running Update Time | 697       |
-----------------------------------
--2024-08-12 03:17:37.323526 UTC---
| Itration            | 698       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -5.44e+05 |
| Running Env Steps   | 3490000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 698       |
-----------------------------------
--2024-08-12 03:19:11.979274 UTC---
| Itration            | 699       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -3.09e+05 |
| Running Env Steps   | 3495000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.53      |
| Running Update Time | 699       |
-----------------------------------
--2024-08-12 03:20:44.515603 UTC---
| Itration            | 700       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -5.04e+05 |
| Running Env Steps   | 3500000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.54      |
| Running Update Time | 700       |
-----------------------------------
--2024-08-12 03:22:16.006919 UTC---
| Itration            | 701       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -7.26e+05 |
| Running Env Steps   | 3505000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 701       |
-----------------------------------
--2024-08-12 03:23:52.312436 UTC--
| Itration            | 702      |
| Real Det Return     | 5.29e+03 |
| Real Sto Return     | 5.21e+03 |
| Reward Loss         | -7.8e+05 |
| Running Env Steps   | 3510000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 3.25     |
| Running Update Time | 702      |
----------------------------------
--2024-08-12 03:25:27.021278 UTC---
| Itration            | 703       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -7.32e+05 |
| Running Env Steps   | 3515000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 703       |
-----------------------------------
--2024-08-12 03:26:59.856861 UTC---
| Itration            | 704       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 3520000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 32.7      |
| Running Update Time | 704       |
-----------------------------------
--2024-08-12 03:28:36.501410 UTC---
| Itration            | 705       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.44e+05 |
| Running Env Steps   | 3525000   |
| Running Forward KL  | 5.48      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 705       |
-----------------------------------
--2024-08-12 03:30:09.742832 UTC---
| Itration            | 706       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.37e+05 |
| Running Env Steps   | 3530000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 3.62      |
| Running Update Time | 706       |
-----------------------------------
--2024-08-12 03:31:43.086443 UTC---
| Itration            | 707       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -7.55e+05 |
| Running Env Steps   | 3535000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 3.61      |
| Running Update Time | 707       |
-----------------------------------
--2024-08-12 03:33:19.171998 UTC---
| Itration            | 708       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.65e+05 |
| Running Env Steps   | 3540000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 3.5       |
| Running Update Time | 708       |
-----------------------------------
--2024-08-12 03:34:52.043360 UTC---
| Itration            | 709       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -8.52e+05 |
| Running Env Steps   | 3545000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 39.1      |
| Running Update Time | 709       |
-----------------------------------
--2024-08-12 03:36:25.858721 UTC---
| Itration            | 710       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -3.99e+05 |
| Running Env Steps   | 3550000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.58      |
| Running Update Time | 710       |
-----------------------------------
--2024-08-12 03:38:01.945248 UTC---
| Itration            | 711       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 3555000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 711       |
-----------------------------------
--2024-08-12 03:39:35.063553 UTC---
| Itration            | 712       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -4.87e+05 |
| Running Env Steps   | 3560000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 712       |
-----------------------------------
--2024-08-12 03:41:09.527298 UTC---
| Itration            | 713       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -6.26e+05 |
| Running Env Steps   | 3565000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 36.6      |
| Running Update Time | 713       |
-----------------------------------
--2024-08-12 03:42:42.847115 UTC---
| Itration            | 714       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 3570000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 29.8      |
| Running Update Time | 714       |
-----------------------------------
--2024-08-12 03:44:14.576975 UTC---
| Itration            | 715       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -2.44e+05 |
| Running Env Steps   | 3575000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 715       |
-----------------------------------
--2024-08-12 03:45:48.807685 UTC---
| Itration            | 716       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -5.93e+05 |
| Running Env Steps   | 3580000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 31        |
| Running Update Time | 716       |
-----------------------------------
--2024-08-12 03:47:20.766591 UTC--
| Itration            | 717      |
| Real Det Return     | 5.34e+03 |
| Real Sto Return     | 5.04e+03 |
| Reward Loss         | -1.9e+06 |
| Running Env Steps   | 3585000  |
| Running Forward KL  | 7.13     |
| Running Reverse KL  | 92.7     |
| Running Update Time | 717      |
----------------------------------
--2024-08-12 03:48:50.171853 UTC---
| Itration            | 718       |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -8.38e+05 |
| Running Env Steps   | 3590000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 54        |
| Running Update Time | 718       |
-----------------------------------
--2024-08-12 03:50:27.220148 UTC---
| Itration            | 719       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -4.24e+05 |
| Running Env Steps   | 3595000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 21.4      |
| Running Update Time | 719       |
-----------------------------------
--2024-08-12 03:51:57.948472 UTC---
| Itration            | 720       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -2.85e+05 |
| Running Env Steps   | 3600000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 720       |
-----------------------------------
--2024-08-12 03:53:29.760526 UTC---
| Itration            | 721       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.33e+05 |
| Running Env Steps   | 3605000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 721       |
-----------------------------------
--2024-08-12 03:55:06.203451 UTC--
| Itration            | 722      |
| Real Det Return     | 5.47e+03 |
| Real Sto Return     | 5.43e+03 |
| Reward Loss         | -1e+04   |
| Running Env Steps   | 3610000  |
| Running Forward KL  | 6.57     |
| Running Reverse KL  | 3.92     |
| Running Update Time | 722      |
----------------------------------
--2024-08-12 03:56:38.230879 UTC---
| Itration            | 723       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -4.91e+05 |
| Running Env Steps   | 3615000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 37.9      |
| Running Update Time | 723       |
-----------------------------------
--2024-08-12 03:58:10.476808 UTC---
| Itration            | 724       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -8.95e+04 |
| Running Env Steps   | 3620000   |
| Running Forward KL  | 7.55      |
| Running Reverse KL  | 5.16      |
| Running Update Time | 724       |
-----------------------------------
--2024-08-12 03:59:44.842085 UTC---
| Itration            | 725       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -6.46e+05 |
| Running Env Steps   | 3625000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 30.8      |
| Running Update Time | 725       |
-----------------------------------
--2024-08-12 04:01:16.570592 UTC---
| Itration            | 726       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -7.95e+05 |
| Running Env Steps   | 3630000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 34.1      |
| Running Update Time | 726       |
-----------------------------------
--2024-08-12 04:02:48.299210 UTC---
| Itration            | 727       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -2.82e+05 |
| Running Env Steps   | 3635000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 43.2      |
| Running Update Time | 727       |
-----------------------------------
--2024-08-12 04:04:22.060661 UTC---
| Itration            | 728       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.71e+05 |
| Running Env Steps   | 3640000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 728       |
-----------------------------------
--2024-08-12 04:05:55.971779 UTC--
| Itration            | 729      |
| Real Det Return     | 5.34e+03 |
| Real Sto Return     | 5.38e+03 |
| Reward Loss         | -3.1e+05 |
| Running Env Steps   | 3645000  |
| Running Forward KL  | 6.24     |
| Running Reverse KL  | 4.19     |
| Running Update Time | 729      |
----------------------------------
--2024-08-12 04:07:31.781577 UTC---
| Itration            | 730       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.27e+06 |
| Running Env Steps   | 3650000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 71.2      |
| Running Update Time | 730       |
-----------------------------------
--2024-08-12 04:09:02.755309 UTC---
| Itration            | 731       |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.52e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 3655000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 731       |
-----------------------------------
--2024-08-12 04:10:33.906894 UTC---
| Itration            | 732       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -4.93e+05 |
| Running Env Steps   | 3660000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 732       |
-----------------------------------
--2024-08-12 04:12:08.110687 UTC---
| Itration            | 733       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.45e+05 |
| Running Env Steps   | 3665000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 733       |
-----------------------------------
--2024-08-12 04:13:40.267794 UTC---
| Itration            | 734       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.47e+05 |
| Running Env Steps   | 3670000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 4         |
| Running Update Time | 734       |
-----------------------------------
--2024-08-12 04:15:11.857686 UTC--
| Itration            | 735      |
| Real Det Return     | 5.36e+03 |
| Real Sto Return     | 4.57e+03 |
| Reward Loss         | -8.8e+05 |
| Running Env Steps   | 3675000  |
| Running Forward KL  | 6.25     |
| Running Reverse KL  | 47.7     |
| Running Update Time | 735      |
----------------------------------
--2024-08-12 04:16:46.770931 UTC---
| Itration            | 736       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.99e+05 |
| Running Env Steps   | 3680000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 7.52      |
| Running Update Time | 736       |
-----------------------------------
--2024-08-12 04:18:18.797354 UTC---
| Itration            | 737       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -9.24e+05 |
| Running Env Steps   | 3685000   |
| Running Forward KL  | 7.7       |
| Running Reverse KL  | 38.6      |
| Running Update Time | 737       |
-----------------------------------
--2024-08-12 04:19:54.423351 UTC--
| Itration            | 738      |
| Real Det Return     | 5.47e+03 |
| Real Sto Return     | 5.24e+03 |
| Reward Loss         | 1.71e+05 |
| Running Env Steps   | 3690000  |
| Running Forward KL  | 6.27     |
| Running Reverse KL  | 3.73     |
| Running Update Time | 738      |
----------------------------------
--2024-08-12 04:21:28.453909 UTC---
| Itration            | 739       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -3.66e+05 |
| Running Env Steps   | 3695000   |
| Running Forward KL  | 7         |
| Running Reverse KL  | 30.8      |
| Running Update Time | 739       |
-----------------------------------
--2024-08-12 04:22:58.524938 UTC---
| Itration            | 740       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 3700000   |
| Running Forward KL  | 6.92      |
| Running Reverse KL  | 35.7      |
| Running Update Time | 740       |
-----------------------------------
--2024-08-12 04:24:33.768172 UTC---
| Itration            | 741       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -7.46e+05 |
| Running Env Steps   | 3705000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 20.8      |
| Running Update Time | 741       |
-----------------------------------
--2024-08-12 04:26:07.177981 UTC---
| Itration            | 742       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -2.75e+05 |
| Running Env Steps   | 3710000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 742       |
-----------------------------------
--2024-08-12 04:27:36.879640 UTC--
| Itration            | 743      |
| Real Det Return     | 5.49e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 9.47e+04 |
| Running Env Steps   | 3715000  |
| Running Forward KL  | 6.63     |
| Running Reverse KL  | 4.66     |
| Running Update Time | 743      |
----------------------------------
--2024-08-12 04:29:14.010761 UTC---
| Itration            | 744       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -3.37e+05 |
| Running Env Steps   | 3720000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 4.09      |
| Running Update Time | 744       |
-----------------------------------
--2024-08-12 04:30:44.251888 UTC---
| Itration            | 745       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -6.27e+05 |
| Running Env Steps   | 3725000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 31.3      |
| Running Update Time | 745       |
-----------------------------------
--2024-08-12 04:32:17.388028 UTC---
| Itration            | 746       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -9.76e+05 |
| Running Env Steps   | 3730000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 53.4      |
| Running Update Time | 746       |
-----------------------------------
--2024-08-12 04:33:51.675255 UTC---
| Itration            | 747       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -7.17e+05 |
| Running Env Steps   | 3735000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 3.47      |
| Running Update Time | 747       |
-----------------------------------
--2024-08-12 04:35:23.796888 UTC---
| Itration            | 748       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.11e+05 |
| Running Env Steps   | 3740000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 748       |
-----------------------------------
--2024-08-12 04:36:59.194350 UTC---
| Itration            | 749       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.26e+05 |
| Running Env Steps   | 3745000   |
| Running Forward KL  | 7.07      |
| Running Reverse KL  | 4.11      |
| Running Update Time | 749       |
-----------------------------------
--2024-08-12 04:38:30.891204 UTC---
| Itration            | 750       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.72e+05 |
| Running Env Steps   | 3750000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 3.57      |
| Running Update Time | 750       |
-----------------------------------
--2024-08-12 04:40:02.432264 UTC--
| Itration            | 751      |
| Real Det Return     | 5.45e+03 |
| Real Sto Return     | 5.34e+03 |
| Reward Loss         | 2.18e+05 |
| Running Env Steps   | 3755000  |
| Running Forward KL  | 5.53     |
| Running Reverse KL  | 7.38     |
| Running Update Time | 751      |
----------------------------------
--2024-08-12 04:41:39.519074 UTC---
| Itration            | 752       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -9.67e+05 |
| Running Env Steps   | 3760000   |
| Running Forward KL  | 6.92      |
| Running Reverse KL  | 4.88      |
| Running Update Time | 752       |
-----------------------------------
--2024-08-12 04:43:12.733512 UTC--
| Itration            | 753      |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.56e+03 |
| Reward Loss         | 1.65e+05 |
| Running Env Steps   | 3765000  |
| Running Forward KL  | 7.03     |
| Running Reverse KL  | 4.74     |
| Running Update Time | 753      |
----------------------------------
--2024-08-12 04:44:45.311553 UTC---
| Itration            | 754       |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.52e+04 |
| Running Env Steps   | 3770000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 4.23      |
| Running Update Time | 754       |
-----------------------------------
--2024-08-12 04:46:20.854204 UTC---
| Itration            | 755       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -2.04e+05 |
| Running Env Steps   | 3775000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 2.97      |
| Running Update Time | 755       |
-----------------------------------
--2024-08-12 04:48:01.741040 UTC---
| Itration            | 756       |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -9.95e+04 |
| Running Env Steps   | 3780000   |
| Running Forward KL  | 6.28      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 756       |
-----------------------------------
--2024-08-12 04:49:49.149268 UTC---
| Itration            | 757       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -3.56e+05 |
| Running Env Steps   | 3785000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 757       |
-----------------------------------
--2024-08-12 04:51:39.968214 UTC---
| Itration            | 758       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -4.68e+05 |
| Running Env Steps   | 3790000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 36.8      |
| Running Update Time | 758       |
-----------------------------------
--2024-08-12 04:53:25.160118 UTC---
| Itration            | 759       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -2.92e+05 |
| Running Env Steps   | 3795000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 2.69      |
| Running Update Time | 759       |
-----------------------------------
--2024-08-12 04:55:16.101049 UTC--
| Itration            | 760      |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | -1.4e+05 |
| Running Env Steps   | 3800000  |
| Running Forward KL  | 6.22     |
| Running Reverse KL  | 3.87     |
| Running Update Time | 760      |
----------------------------------
--2024-08-12 04:57:04.821809 UTC---
| Itration            | 761       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -6.08e+05 |
| Running Env Steps   | 3805000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 3.25      |
| Running Update Time | 761       |
-----------------------------------
--2024-08-12 04:58:50.512037 UTC---
| Itration            | 762       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.41e+05 |
| Running Env Steps   | 3810000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 762       |
-----------------------------------
--2024-08-12 05:00:40.394765 UTC---
| Itration            | 763       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -4.01e+05 |
| Running Env Steps   | 3815000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 28.4      |
| Running Update Time | 763       |
-----------------------------------
--2024-08-12 05:02:29.494473 UTC---
| Itration            | 764       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.55e+05 |
| Running Env Steps   | 3820000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 7.5       |
| Running Update Time | 764       |
-----------------------------------
--2024-08-12 05:04:16.683387 UTC---
| Itration            | 765       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -6.37e+05 |
| Running Env Steps   | 3825000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 765       |
-----------------------------------
--2024-08-12 05:06:09.090017 UTC--
| Itration            | 766      |
| Real Det Return     | 5.37e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | -4.4e+05 |
| Running Env Steps   | 3830000  |
| Running Forward KL  | 6.13     |
| Running Reverse KL  | 37.4     |
| Running Update Time | 766      |
----------------------------------
--2024-08-12 05:07:58.126055 UTC--
| Itration            | 767      |
| Real Det Return     | 5.33e+03 |
| Real Sto Return     | 5.24e+03 |
| Reward Loss         | -7.3e+05 |
| Running Env Steps   | 3835000  |
| Running Forward KL  | 5.33     |
| Running Reverse KL  | 3.26     |
| Running Update Time | 767      |
----------------------------------
--2024-08-12 05:09:44.602427 UTC---
| Itration            | 768       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -3.63e+05 |
| Running Env Steps   | 3840000   |
| Running Forward KL  | 6.55      |
| Running Reverse KL  | 92.9      |
| Running Update Time | 768       |
-----------------------------------
--2024-08-12 05:11:35.815398 UTC---
| Itration            | 769       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -1.37e+05 |
| Running Env Steps   | 3845000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.59      |
| Running Update Time | 769       |
-----------------------------------
--2024-08-12 05:13:25.218113 UTC---
| Itration            | 770       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -2.67e+05 |
| Running Env Steps   | 3850000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.38      |
| Running Update Time | 770       |
-----------------------------------
--2024-08-12 05:15:16.699243 UTC---
| Itration            | 771       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.31e+05 |
| Running Env Steps   | 3855000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 23.4      |
| Running Update Time | 771       |
-----------------------------------
--2024-08-12 05:17:06.521974 UTC---
| Itration            | 772       |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -6.26e+05 |
| Running Env Steps   | 3860000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 62.5      |
| Running Update Time | 772       |
-----------------------------------
--2024-08-12 05:18:55.612192 UTC---
| Itration            | 773       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.43e+05 |
| Running Env Steps   | 3865000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 2.83      |
| Running Update Time | 773       |
-----------------------------------
--2024-08-12 05:20:46.655184 UTC---
| Itration            | 774       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -6.43e+04 |
| Running Env Steps   | 3870000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 4.31      |
| Running Update Time | 774       |
-----------------------------------
--2024-08-12 05:22:34.663717 UTC---
| Itration            | 775       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -5.06e+05 |
| Running Env Steps   | 3875000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 15.1      |
| Running Update Time | 775       |
-----------------------------------
--2024-08-12 05:24:23.709946 UTC---
| Itration            | 776       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -4.46e+05 |
| Running Env Steps   | 3880000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 776       |
-----------------------------------
--2024-08-12 05:26:15.500632 UTC---
| Itration            | 777       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -1.98e+05 |
| Running Env Steps   | 3885000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 3.45      |
| Running Update Time | 777       |
-----------------------------------
--2024-08-12 05:28:03.806316 UTC--
| Itration            | 778      |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 5.45e+03 |
| Reward Loss         | 7.45e+04 |
| Running Env Steps   | 3890000  |
| Running Forward KL  | 5.88     |
| Running Reverse KL  | 4.09     |
| Running Update Time | 778      |
----------------------------------
--2024-08-12 05:29:53.174474 UTC--
| Itration            | 779      |
| Real Det Return     | 5.51e+03 |
| Real Sto Return     | 5.28e+03 |
| Reward Loss         | -1.4e+05 |
| Running Env Steps   | 3895000  |
| Running Forward KL  | 5.67     |
| Running Reverse KL  | 2.59     |
| Running Update Time | 779      |
----------------------------------
--2024-08-12 05:31:31.906019 UTC---
| Itration            | 780       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 1.71e+03  |
| Reward Loss         | -2.35e+06 |
| Running Env Steps   | 3900000   |
| Running Forward KL  | 8.22      |
| Running Reverse KL  | 235       |
| Running Update Time | 780       |
-----------------------------------
--2024-08-12 05:33:21.592124 UTC--
| Itration            | 781      |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.47e+03 |
| Reward Loss         | 4.41e+04 |
| Running Env Steps   | 3905000  |
| Running Forward KL  | 6.11     |
| Running Reverse KL  | 4.29     |
| Running Update Time | 781      |
----------------------------------
--2024-08-12 05:35:15.094779 UTC---
| Itration            | 782       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -3.24e+05 |
| Running Env Steps   | 3910000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 3.98      |
| Running Update Time | 782       |
-----------------------------------
--2024-08-12 05:37:06.364738 UTC---
| Itration            | 783       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.83e+05 |
| Running Env Steps   | 3915000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 21.1      |
| Running Update Time | 783       |
-----------------------------------
--2024-08-12 05:38:56.831450 UTC---
| Itration            | 784       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.78e+05 |
| Running Env Steps   | 3920000   |
| Running Forward KL  | 5.88      |
| Running Reverse KL  | 4.03      |
| Running Update Time | 784       |
-----------------------------------
--2024-08-12 05:40:49.271307 UTC---
| Itration            | 785       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 3925000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 75        |
| Running Update Time | 785       |
-----------------------------------
--2024-08-12 05:42:04.240332 UTC---
| Itration            | 786       |
| Real Det Return     | 172       |
| Real Sto Return     | 1.73e+03  |
| Reward Loss         | -3.02e+06 |
| Running Env Steps   | 3930000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 311       |
| Running Update Time | 786       |
-----------------------------------
--2024-08-12 05:43:50.473665 UTC---
| Itration            | 787       |
| Real Det Return     | 5.04e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 3935000   |
| Running Forward KL  | 6.98      |
| Running Reverse KL  | 35.9      |
| Running Update Time | 787       |
-----------------------------------
--2024-08-12 05:45:22.097505 UTC---
| Itration            | 788       |
| Real Det Return     | 4.94e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.71e+06 |
| Running Env Steps   | 3940000   |
| Running Forward KL  | 7.21      |
| Running Reverse KL  | 6.87      |
| Running Update Time | 788       |
-----------------------------------
--2024-08-12 05:46:55.806138 UTC---
| Itration            | 789       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -8.38e+05 |
| Running Env Steps   | 3945000   |
| Running Forward KL  | 6.67      |
| Running Reverse KL  | 36.2      |
| Running Update Time | 789       |
-----------------------------------
--2024-08-12 05:48:31.541882 UTC---
| Itration            | 790       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 3950000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 37.8      |
| Running Update Time | 790       |
-----------------------------------
--2024-08-12 05:50:03.084467 UTC---
| Itration            | 791       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -8.59e+05 |
| Running Env Steps   | 3955000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 4.22      |
| Running Update Time | 791       |
-----------------------------------
--2024-08-12 05:51:33.813101 UTC---
| Itration            | 792       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -5.46e+05 |
| Running Env Steps   | 3960000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 3.67      |
| Running Update Time | 792       |
-----------------------------------
--2024-08-12 05:53:09.058564 UTC---
| Itration            | 793       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -1.66e+05 |
| Running Env Steps   | 3965000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.34      |
| Running Update Time | 793       |
-----------------------------------
--2024-08-12 05:54:41.316495 UTC---
| Itration            | 794       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -4.91e+05 |
| Running Env Steps   | 3970000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.57      |
| Running Update Time | 794       |
-----------------------------------
--2024-08-12 05:56:08.072363 UTC---
| Itration            | 795       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 3.13e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 3975000   |
| Running Forward KL  | 7.28      |
| Running Reverse KL  | 144       |
| Running Update Time | 795       |
-----------------------------------
--2024-08-12 05:57:42.159176 UTC---
| Itration            | 796       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -6.05e+04 |
| Running Env Steps   | 3980000   |
| Running Forward KL  | 6.4       |
| Running Reverse KL  | 3.77      |
| Running Update Time | 796       |
-----------------------------------
--2024-08-12 05:59:15.135835 UTC---
| Itration            | 797       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.61e+05 |
| Running Env Steps   | 3985000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 4.43      |
| Running Update Time | 797       |
-----------------------------------
--2024-08-12 06:00:51.333600 UTC---
| Itration            | 798       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.66e+05 |
| Running Env Steps   | 3990000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 3.78      |
| Running Update Time | 798       |
-----------------------------------
--2024-08-12 06:02:24.364475 UTC---
| Itration            | 799       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.11e+05 |
| Running Env Steps   | 3995000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 19.1      |
| Running Update Time | 799       |
-----------------------------------
--2024-08-12 06:03:57.153225 UTC---
| Itration            | 800       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -1.77e+05 |
| Running Env Steps   | 4000000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 800       |
-----------------------------------
--2024-08-12 06:05:31.719807 UTC---
| Itration            | 801       |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -9.54e+05 |
| Running Env Steps   | 4005000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 72.8      |
| Running Update Time | 801       |
-----------------------------------
--2024-08-12 06:07:04.166719 UTC---
| Itration            | 802       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -4.34e+05 |
| Running Env Steps   | 4010000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 4.02      |
| Running Update Time | 802       |
-----------------------------------
--2024-08-12 06:08:35.414934 UTC---
| Itration            | 803       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -2.97e+05 |
| Running Env Steps   | 4015000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 8.48      |
| Running Update Time | 803       |
-----------------------------------
--2024-08-12 06:10:12.812185 UTC---
| Itration            | 804       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -2.12e+05 |
| Running Env Steps   | 4020000   |
| Running Forward KL  | 7         |
| Running Reverse KL  | 4.48      |
| Running Update Time | 804       |
-----------------------------------
--2024-08-12 06:11:42.955074 UTC---
| Itration            | 805       |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.14e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 4025000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 64.3      |
| Running Update Time | 805       |
-----------------------------------
--2024-08-12 06:13:16.761222 UTC---
| Itration            | 806       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -3.14e+05 |
| Running Env Steps   | 4030000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 4.36      |
| Running Update Time | 806       |
-----------------------------------
--2024-08-12 06:14:51.949707 UTC---
| Itration            | 807       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -4.11e+05 |
| Running Env Steps   | 4035000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 59.7      |
| Running Update Time | 807       |
-----------------------------------
--2024-08-12 06:16:24.625318 UTC---
| Itration            | 808       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -4.88e+05 |
| Running Env Steps   | 4040000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 808       |
-----------------------------------
--2024-08-12 06:17:59.363268 UTC---
| Itration            | 809       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.01e+05 |
| Running Env Steps   | 4045000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 809       |
-----------------------------------
--2024-08-12 06:19:31.831709 UTC---
| Itration            | 810       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 4050000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 38.4      |
| Running Update Time | 810       |
-----------------------------------
--2024-08-12 06:21:05.370370 UTC---
| Itration            | 811       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.78e+05 |
| Running Env Steps   | 4055000   |
| Running Forward KL  | 5.18      |
| Running Reverse KL  | 15.3      |
| Running Update Time | 811       |
-----------------------------------
--2024-08-12 06:22:40.710028 UTC---
| Itration            | 812       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -6.64e+05 |
| Running Env Steps   | 4060000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 3.78      |
| Running Update Time | 812       |
-----------------------------------
--2024-08-12 06:24:12.064236 UTC---
| Itration            | 813       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -5.51e+05 |
| Running Env Steps   | 4065000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 4.26      |
| Running Update Time | 813       |
-----------------------------------
--2024-08-12 06:25:44.162947 UTC---
| Itration            | 814       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -2.39e+06 |
| Running Env Steps   | 4070000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 814       |
-----------------------------------
--2024-08-12 06:27:19.499451 UTC---
| Itration            | 815       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.12e+05 |
| Running Env Steps   | 4075000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 4.08      |
| Running Update Time | 815       |
-----------------------------------
--2024-08-12 06:28:53.666013 UTC---
| Itration            | 816       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 4080000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 52        |
| Running Update Time | 816       |
-----------------------------------
--2024-08-12 06:30:24.809963 UTC--
| Itration            | 817      |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 4.76e+03 |
| Reward Loss         | -2.6e+05 |
| Running Env Steps   | 4085000  |
| Running Forward KL  | 5.43     |
| Running Reverse KL  | 52.2     |
| Running Update Time | 817      |
----------------------------------
--2024-08-12 06:31:59.753372 UTC---
| Itration            | 818       |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -3.02e+05 |
| Running Env Steps   | 4090000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 55.2      |
| Running Update Time | 818       |
-----------------------------------
--2024-08-12 06:33:33.446900 UTC---
| Itration            | 819       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -4.47e+05 |
| Running Env Steps   | 4095000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 819       |
-----------------------------------
--2024-08-12 06:35:07.824619 UTC---
| Itration            | 820       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -6.65e+05 |
| Running Env Steps   | 4100000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 17.8      |
| Running Update Time | 820       |
-----------------------------------
--2024-08-12 06:36:42.085547 UTC---
| Itration            | 821       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 4105000   |
| Running Forward KL  | 5.95      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 821       |
-----------------------------------
--2024-08-12 06:38:14.916498 UTC---
| Itration            | 822       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -5.96e+05 |
| Running Env Steps   | 4110000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 3.72      |
| Running Update Time | 822       |
-----------------------------------
--2024-08-12 06:39:50.318187 UTC--
| Itration            | 823      |
| Real Det Return     | 5.36e+03 |
| Real Sto Return     | 4.91e+03 |
| Reward Loss         | -2.8e+06 |
| Running Env Steps   | 4115000  |
| Running Forward KL  | 7.12     |
| Running Reverse KL  | 39.1     |
| Running Update Time | 823      |
----------------------------------
--2024-08-12 06:41:23.280654 UTC---
| Itration            | 824       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -5.05e+05 |
| Running Env Steps   | 4120000   |
| Running Forward KL  | 6.46      |
| Running Reverse KL  | 3.44      |
| Running Update Time | 824       |
-----------------------------------
--2024-08-12 06:42:55.647226 UTC---
| Itration            | 825       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.65e+05 |
| Running Env Steps   | 4125000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 3.44      |
| Running Update Time | 825       |
-----------------------------------
--2024-08-12 06:44:31.035212 UTC---
| Itration            | 826       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 4130000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 39.7      |
| Running Update Time | 826       |
-----------------------------------
--2024-08-12 06:46:04.681592 UTC---
| Itration            | 827       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -5.05e+05 |
| Running Env Steps   | 4135000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 42.3      |
| Running Update Time | 827       |
-----------------------------------
--2024-08-12 06:47:37.264111 UTC---
| Itration            | 828       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -3.28e+05 |
| Running Env Steps   | 4140000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 4.02      |
| Running Update Time | 828       |
-----------------------------------
--2024-08-12 06:49:12.976916 UTC---
| Itration            | 829       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -6.96e+05 |
| Running Env Steps   | 4145000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 829       |
-----------------------------------
--2024-08-12 06:50:44.512406 UTC---
| Itration            | 830       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -5.17e+05 |
| Running Env Steps   | 4150000   |
| Running Forward KL  | 5.59      |
| Running Reverse KL  | 40.5      |
| Running Update Time | 830       |
-----------------------------------
--2024-08-12 06:52:19.460002 UTC---
| Itration            | 831       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -2.36e+06 |
| Running Env Steps   | 4155000   |
| Running Forward KL  | 6.91      |
| Running Reverse KL  | 54.4      |
| Running Update Time | 831       |
-----------------------------------
--2024-08-12 06:53:53.126689 UTC---
| Itration            | 832       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -1.93e+05 |
| Running Env Steps   | 4160000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 26.8      |
| Running Update Time | 832       |
-----------------------------------
--2024-08-12 06:54:54.120750 UTC---
| Itration            | 833       |
| Real Det Return     | 56.5      |
| Real Sto Return     | 61.5      |
| Reward Loss         | -2.72e+07 |
| Running Env Steps   | 4165000   |
| Running Forward KL  | 27.7      |
| Running Reverse KL  | 350       |
| Running Update Time | 833       |
-----------------------------------
--2024-08-12 06:55:56.686436 UTC---
| Itration            | 834       |
| Real Det Return     | 66.9      |
| Real Sto Return     | 71.7      |
| Reward Loss         | -2.94e+07 |
| Running Env Steps   | 4170000   |
| Running Forward KL  | 28.2      |
| Running Reverse KL  | 351       |
| Running Update Time | 834       |
-----------------------------------
--2024-08-12 06:57:26.398600 UTC--
| Itration            | 835      |
| Real Det Return     | 4.72e+03 |
| Real Sto Return     | 4.34e+03 |
| Reward Loss         | -2.4e+06 |
| Running Env Steps   | 4175000  |
| Running Forward KL  | 7        |
| Running Reverse KL  | 3.7      |
| Running Update Time | 835      |
----------------------------------
--2024-08-12 06:59:01.377350 UTC---
| Itration            | 836       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -8.86e+05 |
| Running Env Steps   | 4180000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 8.57      |
| Running Update Time | 836       |
-----------------------------------
--2024-08-12 07:00:29.261175 UTC---
| Itration            | 837       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 4185000   |
| Running Forward KL  | 6.71      |
| Running Reverse KL  | 58.1      |
| Running Update Time | 837       |
-----------------------------------
--2024-08-12 07:02:02.725078 UTC---
| Itration            | 838       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -8.11e+05 |
| Running Env Steps   | 4190000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 3.47      |
| Running Update Time | 838       |
-----------------------------------
--2024-08-12 07:03:37.733171 UTC---
| Itration            | 839       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -3.92e+05 |
| Running Env Steps   | 4195000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 839       |
-----------------------------------
--2024-08-12 07:05:09.012044 UTC---
| Itration            | 840       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -2.66e+05 |
| Running Env Steps   | 4200000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 3.95      |
| Running Update Time | 840       |
-----------------------------------
--2024-08-12 07:06:43.644212 UTC--
| Itration            | 841      |
| Real Det Return     | 5.57e+03 |
| Real Sto Return     | 5.14e+03 |
| Reward Loss         | -1.9e+05 |
| Running Env Steps   | 4205000  |
| Running Forward KL  | 6.36     |
| Running Reverse KL  | 33.8     |
| Running Update Time | 841      |
----------------------------------
--2024-08-12 07:08:19.190866 UTC---
| Itration            | 842       |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.62e+05 |
| Running Env Steps   | 4210000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 18.6      |
| Running Update Time | 842       |
-----------------------------------
--2024-08-12 07:09:57.877853 UTC---
| Itration            | 843       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -2.37e+05 |
| Running Env Steps   | 4215000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 3.71      |
| Running Update Time | 843       |
-----------------------------------
--2024-08-12 07:11:30.859301 UTC---
| Itration            | 844       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -6.45e+05 |
| Running Env Steps   | 4220000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 16.2      |
| Running Update Time | 844       |
-----------------------------------
--2024-08-12 07:13:09.135871 UTC---
| Itration            | 845       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -5.66e+05 |
| Running Env Steps   | 4225000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 845       |
-----------------------------------
--2024-08-12 07:14:41.217014 UTC--
| Itration            | 846      |
| Real Det Return     | 5.47e+03 |
| Real Sto Return     | 5.38e+03 |
| Reward Loss         | -2.6e+05 |
| Running Env Steps   | 4230000  |
| Running Forward KL  | 6.55     |
| Running Reverse KL  | 4.82     |
| Running Update Time | 846      |
----------------------------------
--2024-08-12 07:16:15.792260 UTC---
| Itration            | 847       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.51e+05 |
| Running Env Steps   | 4235000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 847       |
-----------------------------------
--2024-08-12 07:17:53.527826 UTC---
| Itration            | 848       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -8.11e+05 |
| Running Env Steps   | 4240000   |
| Running Forward KL  | 6.78      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 848       |
-----------------------------------
--2024-08-12 07:19:26.746282 UTC---
| Itration            | 849       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -6.18e+04 |
| Running Env Steps   | 4245000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 3.54      |
| Running Update Time | 849       |
-----------------------------------
--2024-08-12 07:21:10.711908 UTC--
| Itration            | 850      |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 5.28e+03 |
| Reward Loss         | 1.57e+05 |
| Running Env Steps   | 4250000  |
| Running Forward KL  | 6.24     |
| Running Reverse KL  | 4.49     |
| Running Update Time | 850      |
----------------------------------
--2024-08-12 07:23:05.532679 UTC---
| Itration            | 851       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -5.35e+05 |
| Running Env Steps   | 4255000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 5.09      |
| Running Update Time | 851       |
-----------------------------------
--2024-08-12 07:24:55.827278 UTC---
| Itration            | 852       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -7.96e+05 |
| Running Env Steps   | 4260000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.49      |
| Running Update Time | 852       |
-----------------------------------
--2024-08-12 07:26:51.478602 UTC---
| Itration            | 853       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -5.14e+05 |
| Running Env Steps   | 4265000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 4.01      |
| Running Update Time | 853       |
-----------------------------------
--2024-08-12 07:28:43.094756 UTC---
| Itration            | 854       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -4.14e+05 |
| Running Env Steps   | 4270000   |
| Running Forward KL  | 6.67      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 854       |
-----------------------------------
--2024-08-12 07:30:32.112313 UTC---
| Itration            | 855       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -9.27e+05 |
| Running Env Steps   | 4275000   |
| Running Forward KL  | 6.28      |
| Running Reverse KL  | 4.23      |
| Running Update Time | 855       |
-----------------------------------
--2024-08-12 07:32:28.938572 UTC---
| Itration            | 856       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -5.69e+05 |
| Running Env Steps   | 4280000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 4.65      |
| Running Update Time | 856       |
-----------------------------------
--2024-08-12 07:34:18.612653 UTC---
| Itration            | 857       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -3.17e+05 |
| Running Env Steps   | 4285000   |
| Running Forward KL  | 7.05      |
| Running Reverse KL  | 14.4      |
| Running Update Time | 857       |
-----------------------------------
--2024-08-12 07:36:04.884694 UTC---
| Itration            | 858       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.44e+05 |
| Running Env Steps   | 4290000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 4.25      |
| Running Update Time | 858       |
-----------------------------------
--2024-08-12 07:38:05.610363 UTC---
| Itration            | 859       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -4.51e+05 |
| Running Env Steps   | 4295000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 859       |
-----------------------------------
--2024-08-12 07:39:52.639855 UTC---
| Itration            | 860       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -8.44e+05 |
| Running Env Steps   | 4300000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 4.44      |
| Running Update Time | 860       |
-----------------------------------
--2024-08-12 07:41:37.330062 UTC---
| Itration            | 861       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -7.77e+05 |
| Running Env Steps   | 4305000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 861       |
-----------------------------------
--2024-08-12 07:43:20.044433 UTC--
| Itration            | 862      |
| Real Det Return     | 5.65e+03 |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | 1.03e+05 |
| Running Env Steps   | 4310000  |
| Running Forward KL  | 6.07     |
| Running Reverse KL  | 4.56     |
| Running Update Time | 862      |
----------------------------------
--2024-08-12 07:44:51.719122 UTC---
| Itration            | 863       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.34e+05 |
| Running Env Steps   | 4315000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 35.2      |
| Running Update Time | 863       |
-----------------------------------
--2024-08-12 07:46:25.234378 UTC--
| Itration            | 864      |
| Real Det Return     | 5.61e+03 |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | 2.8e+04  |
| Running Env Steps   | 4320000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 3.96     |
| Running Update Time | 864      |
----------------------------------
--2024-08-12 07:48:00.011394 UTC---
| Itration            | 865       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.77e+05 |
| Running Env Steps   | 4325000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 4.69      |
| Running Update Time | 865       |
-----------------------------------
--2024-08-12 07:49:32.511472 UTC--
| Itration            | 866      |
| Real Det Return     | 5.6e+03  |
| Real Sto Return     | 5.6e+03  |
| Reward Loss         | 1.54e+05 |
| Running Env Steps   | 4330000  |
| Running Forward KL  | 6.77     |
| Running Reverse KL  | 5.05     |
| Running Update Time | 866      |
----------------------------------
--2024-08-12 07:51:07.415753 UTC---
| Itration            | 867       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -2.02e+05 |
| Running Env Steps   | 4335000   |
| Running Forward KL  | 6.46      |
| Running Reverse KL  | 4.03      |
| Running Update Time | 867       |
-----------------------------------
--2024-08-12 07:52:39.102701 UTC---
| Itration            | 868       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -9.66e+05 |
| Running Env Steps   | 4340000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.31      |
| Running Update Time | 868       |
-----------------------------------
--2024-08-12 07:54:11.921179 UTC---
| Itration            | 869       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -2.28e+05 |
| Running Env Steps   | 4345000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.7       |
| Running Update Time | 869       |
-----------------------------------
--2024-08-12 07:55:47.184471 UTC---
| Itration            | 870       |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -4.38e+05 |
| Running Env Steps   | 4350000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 870       |
-----------------------------------
--2024-08-12 07:57:22.394988 UTC---
| Itration            | 871       |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -1.39e+05 |
| Running Env Steps   | 4355000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 4.31      |
| Running Update Time | 871       |
-----------------------------------
--2024-08-12 07:59:20.684856 UTC---
| Itration            | 872       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -1.96e+05 |
| Running Env Steps   | 4360000   |
| Running Forward KL  | 7.75      |
| Running Reverse KL  | 6.54      |
| Running Update Time | 872       |
-----------------------------------
--2024-08-12 08:01:17.985635 UTC---
| Itration            | 873       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -3.78e+05 |
| Running Env Steps   | 4365000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 873       |
-----------------------------------
--2024-08-12 08:03:00.026562 UTC---
| Itration            | 874       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -5.71e+05 |
| Running Env Steps   | 4370000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 874       |
-----------------------------------
--2024-08-12 08:04:44.370072 UTC--
| Itration            | 875      |
| Real Det Return     | 5.59e+03 |
| Real Sto Return     | 5.27e+03 |
| Reward Loss         | -1.6e+05 |
| Running Env Steps   | 4375000  |
| Running Forward KL  | 6.89     |
| Running Reverse KL  | 30.1     |
| Running Update Time | 875      |
----------------------------------
--2024-08-12 08:06:21.646038 UTC---
| Itration            | 876       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -5.41e+05 |
| Running Env Steps   | 4380000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 4.03      |
| Running Update Time | 876       |
-----------------------------------
--2024-08-12 08:08:00.784559 UTC--
| Itration            | 877      |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.4e+03  |
| Reward Loss         | 2.26e+05 |
| Running Env Steps   | 4385000  |
| Running Forward KL  | 6.38     |
| Running Reverse KL  | 4.14     |
| Running Update Time | 877      |
----------------------------------
--2024-08-12 08:09:56.774762 UTC--
| Itration            | 878      |
| Real Det Return     | 5.52e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -2.6e+05 |
| Running Env Steps   | 4390000  |
| Running Forward KL  | 6.4      |
| Running Reverse KL  | 33       |
| Running Update Time | 878      |
----------------------------------
--2024-08-12 08:11:52.576408 UTC---
| Itration            | 879       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -4.58e+06 |
| Running Env Steps   | 4395000   |
| Running Forward KL  | 7.62      |
| Running Reverse KL  | 129       |
| Running Update Time | 879       |
-----------------------------------
--2024-08-12 08:13:53.276546 UTC---
| Itration            | 880       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -2.16e+06 |
| Running Env Steps   | 4400000   |
| Running Forward KL  | 6.86      |
| Running Reverse KL  | 34.7      |
| Running Update Time | 880       |
-----------------------------------
--2024-08-12 08:15:55.471107 UTC---
| Itration            | 881       |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 4405000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 38        |
| Running Update Time | 881       |
-----------------------------------
--2024-08-12 08:17:52.431762 UTC---
| Itration            | 882       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -1.75e+05 |
| Running Env Steps   | 4410000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 4.5       |
| Running Update Time | 882       |
-----------------------------------
--2024-08-12 08:19:52.155496 UTC---
| Itration            | 883       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -3.11e+05 |
| Running Env Steps   | 4415000   |
| Running Forward KL  | 6.7       |
| Running Reverse KL  | 4.55      |
| Running Update Time | 883       |
-----------------------------------
--2024-08-12 08:21:54.924612 UTC---
| Itration            | 884       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -7.68e+04 |
| Running Env Steps   | 4420000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 884       |
-----------------------------------
--2024-08-12 08:23:53.683417 UTC---
| Itration            | 885       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -1.47e+05 |
| Running Env Steps   | 4425000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 885       |
-----------------------------------
--2024-08-12 08:25:57.908560 UTC---
| Itration            | 886       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -1.56e+05 |
| Running Env Steps   | 4430000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 4.81      |
| Running Update Time | 886       |
-----------------------------------
--2024-08-12 08:28:06.221767 UTC---
| Itration            | 887       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -6.56e+05 |
| Running Env Steps   | 4435000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 4.79      |
| Running Update Time | 887       |
-----------------------------------
--2024-08-12 08:30:05.230788 UTC--
| Itration            | 888      |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.1e+03  |
| Reward Loss         | 3.21e+04 |
| Running Env Steps   | 4440000  |
| Running Forward KL  | 6.25     |
| Running Reverse KL  | 4.28     |
| Running Update Time | 888      |
----------------------------------
--2024-08-12 08:32:06.665732 UTC--
| Itration            | 889      |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 5.16e+03 |
| Reward Loss         | 1.67e+05 |
| Running Env Steps   | 4445000  |
| Running Forward KL  | 5.78     |
| Running Reverse KL  | 4.31     |
| Running Update Time | 889      |
----------------------------------
--2024-08-12 08:33:55.984140 UTC---
| Itration            | 890       |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -4.97e+05 |
| Running Env Steps   | 4450000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 4.2       |
| Running Update Time | 890       |
-----------------------------------
--2024-08-12 08:35:38.110828 UTC---
| Itration            | 891       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -5.27e+05 |
| Running Env Steps   | 4455000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 891       |
-----------------------------------
--2024-08-12 08:37:22.948500 UTC---
| Itration            | 892       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -3.02e+05 |
| Running Env Steps   | 4460000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 24.2      |
| Running Update Time | 892       |
-----------------------------------
--2024-08-12 08:39:05.493779 UTC---
| Itration            | 893       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.45e+05 |
| Running Env Steps   | 4465000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 34.3      |
| Running Update Time | 893       |
-----------------------------------
--2024-08-12 08:40:49.820607 UTC---
| Itration            | 894       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -5.42e+05 |
| Running Env Steps   | 4470000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 3.95      |
| Running Update Time | 894       |
-----------------------------------
--2024-08-12 08:42:31.278699 UTC---
| Itration            | 895       |
| Real Det Return     | 4.89e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -5.85e+05 |
| Running Env Steps   | 4475000   |
| Running Forward KL  | 6.28      |
| Running Reverse KL  | 37.7      |
| Running Update Time | 895       |
-----------------------------------
--2024-08-12 08:44:17.058031 UTC---
| Itration            | 896       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -3.52e+05 |
| Running Env Steps   | 4480000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 4.02      |
| Running Update Time | 896       |
-----------------------------------
--2024-08-12 08:45:49.792501 UTC---
| Itration            | 897       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -9.22e+05 |
| Running Env Steps   | 4485000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 30.7      |
| Running Update Time | 897       |
-----------------------------------
--2024-08-12 08:47:25.856777 UTC---
| Itration            | 898       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -4.32e+05 |
| Running Env Steps   | 4490000   |
| Running Forward KL  | 6.91      |
| Running Reverse KL  | 4.23      |
| Running Update Time | 898       |
-----------------------------------
--2024-08-12 08:48:59.956842 UTC---
| Itration            | 899       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 4495000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 899       |
-----------------------------------
--2024-08-12 08:50:31.586946 UTC---
| Itration            | 900       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -5.25e+05 |
| Running Env Steps   | 4500000   |
| Running Forward KL  | 6.85      |
| Running Reverse KL  | 13        |
| Running Update Time | 900       |
-----------------------------------
--2024-08-12 08:52:05.415208 UTC---
| Itration            | 901       |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 4505000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 40.5      |
| Running Update Time | 901       |
-----------------------------------
--2024-08-12 08:53:16.249295 UTC---
| Itration            | 902       |
| Real Det Return     | 301       |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -1.26e+07 |
| Running Env Steps   | 4510000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 264       |
| Running Update Time | 902       |
-----------------------------------
--2024-08-12 08:54:48.443458 UTC---
| Itration            | 903       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.69e+06 |
| Running Env Steps   | 4515000   |
| Running Forward KL  | 7.15      |
| Running Reverse KL  | 65.6      |
| Running Update Time | 903       |
-----------------------------------
--2024-08-12 08:56:20.550200 UTC--
| Itration            | 904      |
| Real Det Return     | 5.57e+03 |
| Real Sto Return     | 4.96e+03 |
| Reward Loss         | -2.2e+05 |
| Running Env Steps   | 4520000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 5.1      |
| Running Update Time | 904      |
----------------------------------
--2024-08-12 08:57:53.769397 UTC--
| Itration            | 905      |
| Real Det Return     | 5.46e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -2e+06   |
| Running Env Steps   | 4525000  |
| Running Forward KL  | 6.62     |
| Running Reverse KL  | 37.4     |
| Running Update Time | 905      |
----------------------------------
--2024-08-12 08:59:28.746560 UTC---
| Itration            | 906       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -6.05e+05 |
| Running Env Steps   | 4530000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 22.4      |
| Running Update Time | 906       |
-----------------------------------
--2024-08-12 09:01:02.086326 UTC---
| Itration            | 907       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -8.74e+05 |
| Running Env Steps   | 4535000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 35.5      |
| Running Update Time | 907       |
-----------------------------------
--2024-08-12 09:02:32.656453 UTC--
| Itration            | 908      |
| Real Det Return     | 5.52e+03 |
| Real Sto Return     | 5.02e+03 |
| Reward Loss         | -2.5e+05 |
| Running Env Steps   | 4540000  |
| Running Forward KL  | 6.11     |
| Running Reverse KL  | 3.98     |
| Running Update Time | 908      |
----------------------------------
--2024-08-12 09:04:08.013679 UTC---
| Itration            | 909       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -2.95e+05 |
| Running Env Steps   | 4545000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 14.7      |
| Running Update Time | 909       |
-----------------------------------
--2024-08-12 09:05:38.798713 UTC--
| Itration            | 910      |
| Real Det Return     | 5.77e+03 |
| Real Sto Return     | 5.38e+03 |
| Reward Loss         | 1.24e+05 |
| Running Env Steps   | 4550000  |
| Running Forward KL  | 6.91     |
| Running Reverse KL  | 4.75     |
| Running Update Time | 910      |
----------------------------------
--2024-08-12 09:07:13.151580 UTC--
| Itration            | 911      |
| Real Det Return     | 5.86e+03 |
| Real Sto Return     | 5.68e+03 |
| Reward Loss         | 3.89e+05 |
| Running Env Steps   | 4555000  |
| Running Forward KL  | 6.82     |
| Running Reverse KL  | 4.49     |
| Running Update Time | 911      |
----------------------------------
--2024-08-12 09:08:47.465554 UTC---
| Itration            | 912       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -4.22e+05 |
| Running Env Steps   | 4560000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 3.21      |
| Running Update Time | 912       |
-----------------------------------
--2024-08-12 09:10:17.194489 UTC---
| Itration            | 913       |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -2.18e+06 |
| Running Env Steps   | 4565000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 55.9      |
| Running Update Time | 913       |
-----------------------------------
--2024-08-12 09:11:53.605602 UTC--
| Itration            | 914      |
| Real Det Return     | 5.77e+03 |
| Real Sto Return     | 5.41e+03 |
| Reward Loss         | 3.13e+05 |
| Running Env Steps   | 4570000  |
| Running Forward KL  | 6.93     |
| Running Reverse KL  | 5.19     |
| Running Update Time | 914      |
----------------------------------
--2024-08-12 09:13:27.613255 UTC---
| Itration            | 915       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -4.98e+05 |
| Running Env Steps   | 4575000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 4.01      |
| Running Update Time | 915       |
-----------------------------------
--2024-08-12 09:14:57.926234 UTC---
| Itration            | 916       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -3.59e+05 |
| Running Env Steps   | 4580000   |
| Running Forward KL  | 7.4       |
| Running Reverse KL  | 92.6      |
| Running Update Time | 916       |
-----------------------------------
--2024-08-12 09:16:34.632753 UTC---
| Itration            | 917       |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -4.24e+05 |
| Running Env Steps   | 4585000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 3.91      |
| Running Update Time | 917       |
-----------------------------------
--2024-08-12 09:18:06.364508 UTC---
| Itration            | 918       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4590000   |
| Running Forward KL  | 5.58      |
| Running Reverse KL  | 3.08      |
| Running Update Time | 918       |
-----------------------------------
--2024-08-12 09:19:37.310924 UTC--
| Itration            | 919      |
| Real Det Return     | 5.65e+03 |
| Real Sto Return     | 4.56e+03 |
| Reward Loss         | 1.13e+05 |
| Running Env Steps   | 4595000  |
| Running Forward KL  | 7.44     |
| Running Reverse KL  | 56.6     |
| Running Update Time | 919      |
----------------------------------
--2024-08-12 09:21:11.276631 UTC---
| Itration            | 920       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -3.69e+05 |
| Running Env Steps   | 4600000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 17.4      |
| Running Update Time | 920       |
-----------------------------------
--2024-08-12 09:22:43.056032 UTC---
| Itration            | 921       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -8.94e+05 |
| Running Env Steps   | 4605000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 3.83      |
| Running Update Time | 921       |
-----------------------------------
--2024-08-12 09:24:18.678556 UTC---
| Itration            | 922       |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -1.01e+05 |
| Running Env Steps   | 4610000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 4.94      |
| Running Update Time | 922       |
-----------------------------------
--2024-08-12 09:25:53.225102 UTC---
| Itration            | 923       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -7.69e+05 |
| Running Env Steps   | 4615000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 3.48      |
| Running Update Time | 923       |
-----------------------------------
--2024-08-12 09:27:25.098970 UTC---
| Itration            | 924       |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -3.64e+05 |
| Running Env Steps   | 4620000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 4.2       |
| Running Update Time | 924       |
-----------------------------------
--2024-08-12 09:29:01.860107 UTC---
| Itration            | 925       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -8.84e+05 |
| Running Env Steps   | 4625000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 3.31      |
| Running Update Time | 925       |
-----------------------------------
--2024-08-12 09:30:35.400725 UTC--
| Itration            | 926      |
| Real Det Return     | 5.93e+03 |
| Real Sto Return     | 5.64e+03 |
| Reward Loss         | 3.17e+05 |
| Running Env Steps   | 4630000  |
| Running Forward KL  | 7.29     |
| Running Reverse KL  | 25.2     |
| Running Update Time | 926      |
----------------------------------
--2024-08-12 09:32:07.919397 UTC---
| Itration            | 927       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -7.57e+05 |
| Running Env Steps   | 4635000   |
| Running Forward KL  | 5.86      |
| Running Reverse KL  | 3.62      |
| Running Update Time | 927       |
-----------------------------------
--2024-08-12 09:33:41.708964 UTC---
| Itration            | 928       |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -3.52e+05 |
| Running Env Steps   | 4640000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 19.1      |
| Running Update Time | 928       |
-----------------------------------
--2024-08-12 09:35:14.153858 UTC---
| Itration            | 929       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.84e+05 |
| Running Env Steps   | 4645000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 929       |
-----------------------------------
--2024-08-12 09:36:46.374127 UTC---
| Itration            | 930       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -1.34e+05 |
| Running Env Steps   | 4650000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 23.6      |
| Running Update Time | 930       |
-----------------------------------
--2024-08-12 09:38:21.883186 UTC---
| Itration            | 931       |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -3.86e+05 |
| Running Env Steps   | 4655000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 22.2      |
| Running Update Time | 931       |
-----------------------------------
--2024-08-12 09:39:54.235580 UTC---
| Itration            | 932       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -3.86e+05 |
| Running Env Steps   | 4660000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.13      |
| Running Update Time | 932       |
-----------------------------------
--2024-08-12 09:41:27.161600 UTC---
| Itration            | 933       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -5.17e+05 |
| Running Env Steps   | 4665000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 28.2      |
| Running Update Time | 933       |
-----------------------------------
--2024-08-12 09:42:59.897668 UTC---
| Itration            | 934       |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 4670000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 39.5      |
| Running Update Time | 934       |
-----------------------------------
--2024-08-12 09:44:31.423017 UTC---
| Itration            | 935       |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -3.12e+04 |
| Running Env Steps   | 4675000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 4.55      |
| Running Update Time | 935       |
-----------------------------------
--2024-08-12 09:46:07.130819 UTC---
| Itration            | 936       |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -4.16e+05 |
| Running Env Steps   | 4680000   |
| Running Forward KL  | 5.95      |
| Running Reverse KL  | 16.1      |
| Running Update Time | 936       |
-----------------------------------
--2024-08-12 09:47:39.646712 UTC---
| Itration            | 937       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -5.22e+05 |
| Running Env Steps   | 4685000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 19.4      |
| Running Update Time | 937       |
-----------------------------------
--2024-08-12 09:49:10.659075 UTC---
| Itration            | 938       |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.79e+05 |
| Running Env Steps   | 4690000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.27      |
| Running Update Time | 938       |
-----------------------------------
--2024-08-12 09:50:46.732170 UTC---
| Itration            | 939       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.25e+05 |
| Running Env Steps   | 4695000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 4.44      |
| Running Update Time | 939       |
-----------------------------------
--2024-08-12 09:52:18.850282 UTC---
| Itration            | 940       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -8.67e+05 |
| Running Env Steps   | 4700000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 4.03      |
| Running Update Time | 940       |
-----------------------------------
--2024-08-12 09:53:47.602273 UTC---
| Itration            | 941       |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 3.96e+03  |
| Reward Loss         | -5.82e+05 |
| Running Env Steps   | 4705000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 65.6      |
| Running Update Time | 941       |
-----------------------------------
--2024-08-12 09:55:20.761128 UTC---
| Itration            | 942       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 4710000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 4.59      |
| Running Update Time | 942       |
-----------------------------------
--2024-08-12 09:56:51.994701 UTC---
| Itration            | 943       |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -4.51e+05 |
| Running Env Steps   | 4715000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 4.18      |
| Running Update Time | 943       |
-----------------------------------
--2024-08-12 09:58:23.725789 UTC---
| Itration            | 944       |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 3.96e+03  |
| Reward Loss         | -7.15e+05 |
| Running Env Steps   | 4720000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 5.07      |
| Running Update Time | 944       |
-----------------------------------
--2024-08-12 09:59:56.120727 UTC---
| Itration            | 945       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 4725000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 3.57      |
| Running Update Time | 945       |
-----------------------------------
--2024-08-12 10:01:25.254688 UTC---
| Itration            | 946       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -2.02e+06 |
| Running Env Steps   | 4730000   |
| Running Forward KL  | 7.27      |
| Running Reverse KL  | 56.9      |
| Running Update Time | 946       |
-----------------------------------
--2024-08-12 10:03:00.686799 UTC---
| Itration            | 947       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 4735000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 25.4      |
| Running Update Time | 947       |
-----------------------------------
--2024-08-12 10:04:31.619529 UTC---
| Itration            | 948       |
| Real Det Return     | 5.77e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.93e+05 |
| Running Env Steps   | 4740000   |
| Running Forward KL  | 7.14      |
| Running Reverse KL  | 30.3      |
| Running Update Time | 948       |
-----------------------------------
--2024-08-12 10:06:05.329800 UTC---
| Itration            | 949       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.87e+05 |
| Running Env Steps   | 4745000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 949       |
-----------------------------------
--2024-08-12 10:07:40.812698 UTC---
| Itration            | 950       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -6.83e+05 |
| Running Env Steps   | 4750000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 4.16      |
| Running Update Time | 950       |
-----------------------------------
--2024-08-12 10:09:14.044052 UTC---
| Itration            | 951       |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -7.16e+05 |
| Running Env Steps   | 4755000   |
| Running Forward KL  | 7.06      |
| Running Reverse KL  | 26.7      |
| Running Update Time | 951       |
-----------------------------------
--2024-08-12 10:10:48.476749 UTC---
| Itration            | 952       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -7.41e+05 |
| Running Env Steps   | 4760000   |
| Running Forward KL  | 6.83      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 952       |
-----------------------------------
--2024-08-12 10:12:22.493788 UTC--
| Itration            | 953      |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 5.04e+03 |
| Reward Loss         | -8.1e+05 |
| Running Env Steps   | 4765000  |
| Running Forward KL  | 6.06     |
| Running Reverse KL  | 3.53     |
| Running Update Time | 953      |
----------------------------------
--2024-08-12 10:13:54.490474 UTC---
| Itration            | 954       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -7.83e+05 |
| Running Env Steps   | 4770000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 3.71      |
| Running Update Time | 954       |
-----------------------------------
--2024-08-12 10:15:30.725653 UTC---
| Itration            | 955       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -5.54e+05 |
| Running Env Steps   | 4775000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 4.63      |
| Running Update Time | 955       |
-----------------------------------
--2024-08-12 10:17:04.327727 UTC---
| Itration            | 956       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -8.93e+05 |
| Running Env Steps   | 4780000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 956       |
-----------------------------------
--2024-08-12 10:18:34.817460 UTC---
| Itration            | 957       |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -7.54e+05 |
| Running Env Steps   | 4785000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 31.8      |
| Running Update Time | 957       |
-----------------------------------
--2024-08-12 10:20:10.274674 UTC---
| Itration            | 958       |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -3.68e+05 |
| Running Env Steps   | 4790000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 23        |
| Running Update Time | 958       |
-----------------------------------
--2024-08-12 10:21:40.753853 UTC---
| Itration            | 959       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -7.15e+05 |
| Running Env Steps   | 4795000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 4.23      |
| Running Update Time | 959       |
-----------------------------------
--2024-08-12 10:23:15.538529 UTC---
| Itration            | 960       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -4.17e+05 |
| Running Env Steps   | 4800000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 960       |
-----------------------------------
--2024-08-12 10:24:49.237132 UTC---
| Itration            | 961       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -8.96e+05 |
| Running Env Steps   | 4805000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 3.71      |
| Running Update Time | 961       |
-----------------------------------
--2024-08-12 10:26:19.913062 UTC---
| Itration            | 962       |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -3.08e+05 |
| Running Env Steps   | 4810000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 3.41      |
| Running Update Time | 962       |
-----------------------------------
--2024-08-12 10:27:54.595142 UTC---
| Itration            | 963       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -7.18e+05 |
| Running Env Steps   | 4815000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 11.1      |
| Running Update Time | 963       |
-----------------------------------
--2024-08-12 10:29:28.173416 UTC---
| Itration            | 964       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -5.99e+05 |
| Running Env Steps   | 4820000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 964       |
-----------------------------------
--2024-08-12 10:30:59.549990 UTC---
| Itration            | 965       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -4.27e+05 |
| Running Env Steps   | 4825000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 965       |
-----------------------------------
--2024-08-12 10:32:35.939371 UTC---
| Itration            | 966       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 4830000   |
| Running Forward KL  | 5.53      |
| Running Reverse KL  | 3.09      |
| Running Update Time | 966       |
-----------------------------------
--2024-08-12 10:34:08.641192 UTC---
| Itration            | 967       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -5.31e+05 |
| Running Env Steps   | 4835000   |
| Running Forward KL  | 6.4       |
| Running Reverse KL  | 20        |
| Running Update Time | 967       |
-----------------------------------
--2024-08-12 10:35:41.247385 UTC--
| Itration            | 968      |
| Real Det Return     | 5.44e+03 |
| Real Sto Return     | 5.33e+03 |
| Reward Loss         | -8.4e+05 |
| Running Env Steps   | 4840000  |
| Running Forward KL  | 6.36     |
| Running Reverse KL  | 4        |
| Running Update Time | 968      |
----------------------------------
--2024-08-12 10:37:16.066899 UTC--
| Itration            | 969      |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 4.83e+03 |
| Reward Loss         | 6.91e+04 |
| Running Env Steps   | 4845000  |
| Running Forward KL  | 6.99     |
| Running Reverse KL  | 50.8     |
| Running Update Time | 969      |
----------------------------------
--2024-08-12 10:38:48.100929 UTC---
| Itration            | 970       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -7.43e+05 |
| Running Env Steps   | 4850000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 4.39      |
| Running Update Time | 970       |
-----------------------------------
--2024-08-12 10:40:22.738540 UTC---
| Itration            | 971       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 4855000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 3.51      |
| Running Update Time | 971       |
-----------------------------------
--2024-08-12 10:41:54.900733 UTC---
| Itration            | 972       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.52e+03  |
| Reward Loss         | -7.39e+05 |
| Running Env Steps   | 4860000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 11.9      |
| Running Update Time | 972       |
-----------------------------------
--2024-08-12 10:43:26.090064 UTC---
| Itration            | 973       |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -3.63e+05 |
| Running Env Steps   | 4865000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 973       |
-----------------------------------
--2024-08-12 10:45:02.493013 UTC---
| Itration            | 974       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -8.23e+05 |
| Running Env Steps   | 4870000   |
| Running Forward KL  | 6.46      |
| Running Reverse KL  | 3.8       |
| Running Update Time | 974       |
-----------------------------------
--2024-08-12 10:46:35.311889 UTC---
| Itration            | 975       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -4.22e+05 |
| Running Env Steps   | 4875000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 975       |
-----------------------------------
--2024-08-12 10:48:05.710678 UTC---
| Itration            | 976       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.35e+03  |
| Reward Loss         | -3.67e+06 |
| Running Env Steps   | 4880000   |
| Running Forward KL  | 9.49      |
| Running Reverse KL  | 194       |
| Running Update Time | 976       |
-----------------------------------
--2024-08-12 10:49:43.707127 UTC---
| Itration            | 977       |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 4885000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 33.1      |
| Running Update Time | 977       |
-----------------------------------
--2024-08-12 10:51:14.115648 UTC---
| Itration            | 978       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -4.33e+05 |
| Running Env Steps   | 4890000   |
| Running Forward KL  | 6.31      |
| Running Reverse KL  | 6.27      |
| Running Update Time | 978       |
-----------------------------------
--2024-08-12 10:52:47.292426 UTC---
| Itration            | 979       |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -7.28e+05 |
| Running Env Steps   | 4895000   |
| Running Forward KL  | 5.75      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 979       |
-----------------------------------
--2024-08-12 10:54:20.875238 UTC--
| Itration            | 980      |
| Real Det Return     | 5.52e+03 |
| Real Sto Return     | 5.11e+03 |
| Reward Loss         | -1.3e+06 |
| Running Env Steps   | 4900000  |
| Running Forward KL  | 6.78     |
| Running Reverse KL  | 24.4     |
| Running Update Time | 980      |
----------------------------------
--2024-08-12 10:55:51.894493 UTC---
| Itration            | 981       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.05e+05 |
| Running Env Steps   | 4905000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 4.15      |
| Running Update Time | 981       |
-----------------------------------
--2024-08-12 10:57:26.817678 UTC---
| Itration            | 982       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -6.89e+05 |
| Running Env Steps   | 4910000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 4.3       |
| Running Update Time | 982       |
-----------------------------------
--2024-08-12 10:58:59.969655 UTC---
| Itration            | 983       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -9.99e+05 |
| Running Env Steps   | 4915000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 15.9      |
| Running Update Time | 983       |
-----------------------------------
--2024-08-12 11:00:28.625450 UTC---
| Itration            | 984       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -9.43e+05 |
| Running Env Steps   | 4920000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 40.5      |
| Running Update Time | 984       |
-----------------------------------
--2024-08-12 11:02:04.482044 UTC---
| Itration            | 985       |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 4925000   |
| Running Forward KL  | 6.93      |
| Running Reverse KL  | 75.5      |
| Running Update Time | 985       |
-----------------------------------
--2024-08-12 11:03:38.173527 UTC---
| Itration            | 986       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -6.76e+05 |
| Running Env Steps   | 4930000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 986       |
-----------------------------------
--2024-08-12 11:05:03.864313 UTC---
| Itration            | 987       |
| Real Det Return     | 4.06e+03  |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -3.42e+06 |
| Running Env Steps   | 4935000   |
| Running Forward KL  | 6.83      |
| Running Reverse KL  | 67.4      |
| Running Update Time | 987       |
-----------------------------------
--2024-08-12 11:06:12.574099 UTC--
| Itration            | 988      |
| Real Det Return     | 407      |
| Real Sto Return     | 494      |
| Reward Loss         | -1.5e+07 |
| Running Env Steps   | 4940000  |
| Running Forward KL  | 14.8     |
| Running Reverse KL  | 257      |
| Running Update Time | 988      |
----------------------------------
--2024-08-12 11:07:16.782503 UTC---
| Itration            | 989       |
| Real Det Return     | 102       |
| Real Sto Return     | 337       |
| Reward Loss         | -2.79e+07 |
| Running Env Steps   | 4945000   |
| Running Forward KL  | 28.9      |
| Running Reverse KL  | 348       |
| Running Update Time | 989       |
-----------------------------------
--2024-08-12 11:08:22.827123 UTC---
| Itration            | 990       |
| Real Det Return     | 156       |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -2.08e+07 |
| Running Env Steps   | 4950000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 287       |
| Running Update Time | 990       |
-----------------------------------
--2024-08-12 11:09:41.929267 UTC---
| Itration            | 991       |
| Real Det Return     | 2.43e+03  |
| Real Sto Return     | 2.51e+03  |
| Reward Loss         | -9.99e+06 |
| Running Env Steps   | 4955000   |
| Running Forward KL  | 9.06      |
| Running Reverse KL  | 166       |
| Running Update Time | 991       |
-----------------------------------
--2024-08-12 11:11:16.072901 UTC---
| Itration            | 992       |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -3.65e+05 |
| Running Env Steps   | 4960000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 4.61      |
| Running Update Time | 992       |
-----------------------------------
--2024-08-12 11:12:47.540096 UTC---
| Itration            | 993       |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -4.99e+05 |
| Running Env Steps   | 4965000   |
| Running Forward KL  | 7.41      |
| Running Reverse KL  | 6.02      |
| Running Update Time | 993       |
-----------------------------------
--2024-08-12 11:14:24.231457 UTC---
| Itration            | 994       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -6.35e+05 |
| Running Env Steps   | 4970000   |
| Running Forward KL  | 7.31      |
| Running Reverse KL  | 28.4      |
| Running Update Time | 994       |
-----------------------------------
--2024-08-12 11:15:56.792524 UTC---
| Itration            | 995       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -5.92e+05 |
| Running Env Steps   | 4975000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 3.75      |
| Running Update Time | 995       |
-----------------------------------
--2024-08-12 11:17:30.287617 UTC---
| Itration            | 996       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -5.34e+05 |
| Running Env Steps   | 4980000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 4.22      |
| Running Update Time | 996       |
-----------------------------------
--2024-08-12 11:19:07.701319 UTC---
| Itration            | 997       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 4985000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 4.64      |
| Running Update Time | 997       |
-----------------------------------
--2024-08-12 11:20:40.050856 UTC---
| Itration            | 998       |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -3.68e+05 |
| Running Env Steps   | 4990000   |
| Running Forward KL  | 6.98      |
| Running Reverse KL  | 17.9      |
| Running Update Time | 998       |
-----------------------------------
--2024-08-12 11:22:13.204126 UTC---
| Itration            | 999       |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -9.93e+05 |
| Running Env Steps   | 4995000   |
| Running Forward KL  | 7.03      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 999       |
-----------------------------------
--2024-08-12 11:23:48.339409 UTC---
| Itration            | 1000      |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 5000000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 34.4      |
| Running Update Time | 1000      |
-----------------------------------
--2024-08-12 11:25:02.496864 UTC--
| Itration            | 1001     |
| Real Det Return     | 1.79e+03 |
| Real Sto Return     | 3.32e+03 |
| Reward Loss         | 2.52e+05 |
| Running Env Steps   | 5005000  |
| Running Forward KL  | 7.81     |
| Running Reverse KL  | 98.2     |
| Running Update Time | 1001     |
----------------------------------
--2024-08-12 11:26:38.615480 UTC---
| Itration            | 1002      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 5010000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 4.89      |
| Running Update Time | 1002      |
-----------------------------------
--2024-08-12 11:28:11.070601 UTC---
| Itration            | 1003      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 5015000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 35.7      |
| Running Update Time | 1003      |
-----------------------------------
--2024-08-12 11:29:43.818589 UTC---
| Itration            | 1004      |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 5020000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 4.91      |
| Running Update Time | 1004      |
-----------------------------------
--2024-08-12 11:31:20.669092 UTC---
| Itration            | 1005      |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 5025000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.39      |
| Running Update Time | 1005      |
-----------------------------------
--2024-08-12 11:32:52.231729 UTC---
| Itration            | 1006      |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -7.38e+05 |
| Running Env Steps   | 5030000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 4.71      |
| Running Update Time | 1006      |
-----------------------------------
--2024-08-12 11:34:27.650326 UTC---
| Itration            | 1007      |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -7.05e+05 |
| Running Env Steps   | 5035000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 34.2      |
| Running Update Time | 1007      |
-----------------------------------
--2024-08-12 11:36:01.497369 UTC---
| Itration            | 1008      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 5040000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 35.4      |
| Running Update Time | 1008      |
-----------------------------------
--2024-08-12 11:37:33.465956 UTC---
| Itration            | 1009      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.84e+05 |
| Running Env Steps   | 5045000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 30.5      |
| Running Update Time | 1009      |
-----------------------------------
--2024-08-12 11:39:09.204705 UTC---
| Itration            | 1010      |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -5.92e+06 |
| Running Env Steps   | 5050000   |
| Running Forward KL  | 6.94      |
| Running Reverse KL  | 77.7      |
| Running Update Time | 1010      |
-----------------------------------
--2024-08-12 11:40:43.547542 UTC---
| Itration            | 1011      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 5055000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 4.55      |
| Running Update Time | 1011      |
-----------------------------------
--2024-08-12 11:42:14.588357 UTC---
| Itration            | 1012      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -6.82e+05 |
| Running Env Steps   | 5060000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 4.92      |
| Running Update Time | 1012      |
-----------------------------------
--2024-08-12 11:43:52.851769 UTC--
| Itration            | 1013     |
| Real Det Return     | 5.36e+03 |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 5065000  |
| Running Forward KL  | 6.33     |
| Running Reverse KL  | 4.32     |
| Running Update Time | 1013     |
----------------------------------
--2024-08-12 11:45:26.450816 UTC--
| Itration            | 1014     |
| Real Det Return     | 5.62e+03 |
| Real Sto Return     | 5.48e+03 |
| Reward Loss         | -4.3e+05 |
| Running Env Steps   | 5070000  |
| Running Forward KL  | 6.65     |
| Running Reverse KL  | 4.81     |
| Running Update Time | 1014     |
----------------------------------
--2024-08-12 11:47:00.877310 UTC---
| Itration            | 1015      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -4.72e+05 |
| Running Env Steps   | 5075000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 4.95      |
| Running Update Time | 1015      |
-----------------------------------
--2024-08-12 11:48:38.288438 UTC---
| Itration            | 1016      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -9.87e+05 |
| Running Env Steps   | 5080000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 3.86      |
| Running Update Time | 1016      |
-----------------------------------
--2024-08-12 11:50:07.880019 UTC--
| Itration            | 1017     |
| Real Det Return     | 5.57e+03 |
| Real Sto Return     | 4.58e+03 |
| Reward Loss         | 3.02e+05 |
| Running Env Steps   | 5085000  |
| Running Forward KL  | 7.53     |
| Running Reverse KL  | 92.4     |
| Running Update Time | 1017     |
----------------------------------
--2024-08-12 11:51:42.479681 UTC---
| Itration            | 1018      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -5.69e+05 |
| Running Env Steps   | 5090000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 4.08      |
| Running Update Time | 1018      |
-----------------------------------
--2024-08-12 11:53:17.610027 UTC--
| Itration            | 1019     |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.34e+03 |
| Reward Loss         | -9.9e+05 |
| Running Env Steps   | 5095000  |
| Running Forward KL  | 6.07     |
| Running Reverse KL  | 3.84     |
| Running Update Time | 1019     |
----------------------------------
--2024-08-12 11:54:41.775383 UTC---
| Itration            | 1020      |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 2.92e+03  |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 5100000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 119       |
| Running Update Time | 1020      |
-----------------------------------
--2024-08-12 11:56:18.500094 UTC---
| Itration            | 1021      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 5105000   |
| Running Forward KL  | 6.28      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 1021      |
-----------------------------------
--2024-08-12 11:57:52.151444 UTC--
| Itration            | 1022     |
| Real Det Return     | 5.4e+03  |
| Real Sto Return     | 5.02e+03 |
| Reward Loss         | -1.3e+06 |
| Running Env Steps   | 5110000  |
| Running Forward KL  | 5.96     |
| Running Reverse KL  | 4.13     |
| Running Update Time | 1022     |
----------------------------------
--2024-08-12 11:59:23.685094 UTC--
| Itration            | 1023     |
| Real Det Return     | 5.58e+03 |
| Real Sto Return     | 5.66e+03 |
| Reward Loss         | 2.51e+05 |
| Running Env Steps   | 5115000  |
| Running Forward KL  | 6.31     |
| Running Reverse KL  | 4.62     |
| Running Update Time | 1023     |
----------------------------------
--2024-08-12 12:00:59.202807 UTC---
| Itration            | 1024      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -9.96e+05 |
| Running Env Steps   | 5120000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 1024      |
-----------------------------------
--2024-08-12 12:02:33.973688 UTC---
| Itration            | 1025      |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -5.53e+06 |
| Running Env Steps   | 5125000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 37.5      |
| Running Update Time | 1025      |
-----------------------------------
--2024-08-12 12:04:07.348874 UTC---
| Itration            | 1026      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -9.55e+05 |
| Running Env Steps   | 5130000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 1026      |
-----------------------------------
--2024-08-12 12:05:43.287137 UTC---
| Itration            | 1027      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -7.93e+05 |
| Running Env Steps   | 5135000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 5.24      |
| Running Update Time | 1027      |
-----------------------------------
--2024-08-12 12:07:16.397897 UTC---
| Itration            | 1028      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 5140000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 4.27      |
| Running Update Time | 1028      |
-----------------------------------
--2024-08-12 12:08:52.170664 UTC---
| Itration            | 1029      |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.97e+06 |
| Running Env Steps   | 5145000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 1029      |
-----------------------------------
--2024-08-12 12:10:28.764034 UTC--
| Itration            | 1030     |
| Real Det Return     | 5.74e+03 |
| Real Sto Return     | 5.29e+03 |
| Reward Loss         | 4.39e+05 |
| Running Env Steps   | 5150000  |
| Running Forward KL  | 7.28     |
| Running Reverse KL  | 49.3     |
| Running Update Time | 1030     |
----------------------------------
--2024-08-12 12:11:47.456803 UTC---
| Itration            | 1031      |
| Real Det Return     | 978       |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.12e+05 |
| Running Env Steps   | 5155000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 4.57      |
| Running Update Time | 1031      |
-----------------------------------
--2024-08-12 12:13:24.540684 UTC---
| Itration            | 1032      |
| Real Det Return     | 5.71e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 5160000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 1032      |
-----------------------------------
--2024-08-12 12:14:58.263303 UTC---
| Itration            | 1033      |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 5165000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 4.59      |
| Running Update Time | 1033      |
-----------------------------------
--2024-08-12 12:16:31.542877 UTC---
| Itration            | 1034      |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 5170000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 35        |
| Running Update Time | 1034      |
-----------------------------------
--2024-08-12 12:18:08.536940 UTC---
| Itration            | 1035      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -3.36e+05 |
| Running Env Steps   | 5175000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 18.7      |
| Running Update Time | 1035      |
-----------------------------------
--2024-08-12 12:19:42.956342 UTC---
| Itration            | 1036      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -6.37e+05 |
| Running Env Steps   | 5180000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 1036      |
-----------------------------------
--2024-08-12 12:21:16.330556 UTC---
| Itration            | 1037      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -9.64e+05 |
| Running Env Steps   | 5185000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 27.1      |
| Running Update Time | 1037      |
-----------------------------------
--2024-08-12 12:22:51.570584 UTC---
| Itration            | 1038      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -7.71e+04 |
| Running Env Steps   | 5190000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 4.64      |
| Running Update Time | 1038      |
-----------------------------------
--2024-08-12 12:24:24.460421 UTC---
| Itration            | 1039      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -6.95e+05 |
| Running Env Steps   | 5195000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 20.6      |
| Running Update Time | 1039      |
-----------------------------------
--2024-08-12 12:25:53.383022 UTC---
| Itration            | 1040      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 3.46e+03  |
| Reward Loss         | -7.18e+05 |
| Running Env Steps   | 5200000   |
| Running Forward KL  | 7.78      |
| Running Reverse KL  | 153       |
| Running Update Time | 1040      |
-----------------------------------
--2024-08-12 12:27:27.903724 UTC---
| Itration            | 1041      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 5205000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 3.73      |
| Running Update Time | 1041      |
-----------------------------------
--2024-08-12 12:29:03.883646 UTC---
| Itration            | 1042      |
| Real Det Return     | 5.73e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -7.72e+05 |
| Running Env Steps   | 5210000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 36.3      |
| Running Update Time | 1042      |
-----------------------------------
--2024-08-12 12:30:40.801950 UTC---
| Itration            | 1043      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -8.05e+05 |
| Running Env Steps   | 5215000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 3.98      |
| Running Update Time | 1043      |
-----------------------------------
--2024-08-12 12:32:14.941029 UTC---
| Itration            | 1044      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -8.49e+05 |
| Running Env Steps   | 5220000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 3.82      |
| Running Update Time | 1044      |
-----------------------------------
--2024-08-12 12:33:49.106908 UTC---
| Itration            | 1045      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 5225000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 1045      |
-----------------------------------
--2024-08-12 12:35:27.301397 UTC---
| Itration            | 1046      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -9.81e+05 |
| Running Env Steps   | 5230000   |
| Running Forward KL  | 7.31      |
| Running Reverse KL  | 5.21      |
| Running Update Time | 1046      |
-----------------------------------
--2024-08-12 12:37:00.989355 UTC---
| Itration            | 1047      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -9.18e+05 |
| Running Env Steps   | 5235000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 3.86      |
| Running Update Time | 1047      |
-----------------------------------
--2024-08-12 12:38:33.899762 UTC---
| Itration            | 1048      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 5240000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 3.92      |
| Running Update Time | 1048      |
-----------------------------------
--2024-08-12 12:40:09.963370 UTC---
| Itration            | 1049      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -5.35e+05 |
| Running Env Steps   | 5245000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 27.9      |
| Running Update Time | 1049      |
-----------------------------------
--2024-08-12 12:41:43.158461 UTC---
| Itration            | 1050      |
| Real Det Return     | 5.75e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 5250000   |
| Running Forward KL  | 6.75      |
| Running Reverse KL  | 69.4      |
| Running Update Time | 1050      |
-----------------------------------
--2024-08-12 12:43:13.685359 UTC---
| Itration            | 1051      |
| Real Det Return     | 5.77e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -7.25e+05 |
| Running Env Steps   | 5255000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 26.4      |
| Running Update Time | 1051      |
-----------------------------------
--2024-08-12 12:44:49.660146 UTC--
| Itration            | 1052     |
| Real Det Return     | 5.74e+03 |
| Real Sto Return     | 5.28e+03 |
| Reward Loss         | -4.5e+05 |
| Running Env Steps   | 5260000  |
| Running Forward KL  | 6.33     |
| Running Reverse KL  | 4.02     |
| Running Update Time | 1052     |
----------------------------------
--2024-08-12 12:46:22.854737 UTC---
| Itration            | 1053      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -7.07e+05 |
| Running Env Steps   | 5265000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1053      |
-----------------------------------
--2024-08-12 12:48:00.081039 UTC---
| Itration            | 1054      |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.59e+06 |
| Running Env Steps   | 5270000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 2.98      |
| Running Update Time | 1054      |
-----------------------------------
--2024-08-12 12:49:34.428459 UTC---
| Itration            | 1055      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -5.91e+05 |
| Running Env Steps   | 5275000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 1055      |
-----------------------------------
--2024-08-12 12:51:08.629739 UTC---
| Itration            | 1056      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -9.01e+05 |
| Running Env Steps   | 5280000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 26.1      |
| Running Update Time | 1056      |
-----------------------------------
--2024-08-12 12:52:30.455587 UTC---
| Itration            | 1057      |
| Real Det Return     | 3.44e+03  |
| Real Sto Return     | 2.68e+03  |
| Reward Loss         | -7.66e+05 |
| Running Env Steps   | 5285000   |
| Running Forward KL  | 9.21      |
| Running Reverse KL  | 212       |
| Running Update Time | 1057      |
-----------------------------------
--2024-08-12 12:53:58.756649 UTC---
| Itration            | 1058      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 3.83e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 5290000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 85.2      |
| Running Update Time | 1058      |
-----------------------------------
--2024-08-12 12:55:34.247682 UTC--
| Itration            | 1059     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.29e+03 |
| Reward Loss         | 2.72e+04 |
| Running Env Steps   | 5295000  |
| Running Forward KL  | 6.21     |
| Running Reverse KL  | 34.4     |
| Running Update Time | 1059     |
----------------------------------
--2024-08-12 12:56:58.235565 UTC---
| Itration            | 1060      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 2.09e+03  |
| Reward Loss         | -1.14e+07 |
| Running Env Steps   | 5300000   |
| Running Forward KL  | 13.5      |
| Running Reverse KL  | 343       |
| Running Update Time | 1060      |
-----------------------------------
--2024-08-12 12:57:58.550809 UTC---
| Itration            | 1061      |
| Real Det Return     | 141       |
| Real Sto Return     | 142       |
| Reward Loss         | -9.39e+06 |
| Running Env Steps   | 5305000   |
| Running Forward KL  | 23        |
| Running Reverse KL  | 386       |
| Running Update Time | 1061      |
-----------------------------------
--2024-08-12 12:59:01.715539 UTC---
| Itration            | 1062      |
| Real Det Return     | 185       |
| Real Sto Return     | 1.46e+03  |
| Reward Loss         | -9.98e+06 |
| Running Env Steps   | 5310000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 325       |
| Running Update Time | 1062      |
-----------------------------------
--2024-08-12 13:00:33.822207 UTC---
| Itration            | 1063      |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -3.71e+06 |
| Running Env Steps   | 5315000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 84.3      |
| Running Update Time | 1063      |
-----------------------------------
--2024-08-12 13:02:07.879101 UTC---
| Itration            | 1064      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -4.13e+05 |
| Running Env Steps   | 5320000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 6.26      |
| Running Update Time | 1064      |
-----------------------------------
--2024-08-12 13:03:40.444775 UTC---
| Itration            | 1065      |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 4.59e+03  |
| Reward Loss         | -4.37e+06 |
| Running Env Steps   | 5325000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 62.3      |
| Running Update Time | 1065      |
-----------------------------------
--2024-08-12 13:05:11.740113 UTC---
| Itration            | 1066      |
| Real Det Return     | 5.84e+03  |
| Real Sto Return     | 3.59e+03  |
| Reward Loss         | -1.03e+05 |
| Running Env Steps   | 5330000   |
| Running Forward KL  | 7.22      |
| Running Reverse KL  | 121       |
| Running Update Time | 1066      |
-----------------------------------
--2024-08-12 13:06:46.000397 UTC---
| Itration            | 1067      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.47e+05 |
| Running Env Steps   | 5335000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 4.2       |
| Running Update Time | 1067      |
-----------------------------------
--2024-08-12 13:08:21.379619 UTC---
| Itration            | 1068      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.61e+03  |
| Reward Loss         | -1.92e+05 |
| Running Env Steps   | 5340000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 4.68      |
| Running Update Time | 1068      |
-----------------------------------
--2024-08-12 13:09:57.488681 UTC--
| Itration            | 1069     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.48e+03 |
| Reward Loss         | 1.64e+05 |
| Running Env Steps   | 5345000  |
| Running Forward KL  | 5.77     |
| Running Reverse KL  | 8.58     |
| Running Update Time | 1069     |
----------------------------------
--2024-08-12 13:11:31.570928 UTC---
| Itration            | 1070      |
| Real Det Return     | 5.75e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -1.99e+05 |
| Running Env Steps   | 5350000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 4.72      |
| Running Update Time | 1070      |
-----------------------------------
--2024-08-12 13:13:06.669921 UTC---
| Itration            | 1071      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.58e+05 |
| Running Env Steps   | 5355000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 4.18      |
| Running Update Time | 1071      |
-----------------------------------
--2024-08-12 13:14:41.570024 UTC---
| Itration            | 1072      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -1.69e+06 |
| Running Env Steps   | 5360000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 41.2      |
| Running Update Time | 1072      |
-----------------------------------
--2024-08-12 13:16:16.265896 UTC---
| Itration            | 1073      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -4.54e+05 |
| Running Env Steps   | 5365000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 17.4      |
| Running Update Time | 1073      |
-----------------------------------
--2024-08-12 13:17:52.429222 UTC---
| Itration            | 1074      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -7.19e+05 |
| Running Env Steps   | 5370000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 4.67      |
| Running Update Time | 1074      |
-----------------------------------
--2024-08-12 13:19:27.799498 UTC---
| Itration            | 1075      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -2.86e+05 |
| Running Env Steps   | 5375000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 4.12      |
| Running Update Time | 1075      |
-----------------------------------
--2024-08-12 13:21:00.325914 UTC---
| Itration            | 1076      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -4.92e+05 |
| Running Env Steps   | 5380000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 3.5       |
| Running Update Time | 1076      |
-----------------------------------
--2024-08-12 13:22:37.842007 UTC---
| Itration            | 1077      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -3.41e+05 |
| Running Env Steps   | 5385000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 4.11      |
| Running Update Time | 1077      |
-----------------------------------
--2024-08-12 13:24:11.452281 UTC---
| Itration            | 1078      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.52e+05 |
| Running Env Steps   | 5390000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 4.03      |
| Running Update Time | 1078      |
-----------------------------------
--2024-08-12 13:25:46.334320 UTC---
| Itration            | 1079      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -3.46e+05 |
| Running Env Steps   | 5395000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 3.98      |
| Running Update Time | 1079      |
-----------------------------------
--2024-08-12 13:27:23.868415 UTC---
| Itration            | 1080      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.91e+05 |
| Running Env Steps   | 5400000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 27.6      |
| Running Update Time | 1080      |
-----------------------------------
--2024-08-12 13:28:56.008635 UTC---
| Itration            | 1081      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -5.83e+05 |
| Running Env Steps   | 5405000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 36        |
| Running Update Time | 1081      |
-----------------------------------
--2024-08-12 13:30:31.434759 UTC---
| Itration            | 1082      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -2.36e+05 |
| Running Env Steps   | 5410000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 22.3      |
| Running Update Time | 1082      |
-----------------------------------
--2024-08-12 13:32:08.085805 UTC--
| Itration            | 1083     |
| Real Det Return     | 5.82e+03 |
| Real Sto Return     | 5.51e+03 |
| Reward Loss         | 3.11e+05 |
| Running Env Steps   | 5415000  |
| Running Forward KL  | 6.45     |
| Running Reverse KL  | 4.92     |
| Running Update Time | 1083     |
----------------------------------
--2024-08-12 13:33:39.509129 UTC---
| Itration            | 1084      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -3.59e+05 |
| Running Env Steps   | 5420000   |
| Running Forward KL  | 7.04      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 1084      |
-----------------------------------
--2024-08-12 13:35:15.296983 UTC---
| Itration            | 1085      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -4.67e+05 |
| Running Env Steps   | 5425000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 15.4      |
| Running Update Time | 1085      |
-----------------------------------
--2024-08-12 13:36:50.139600 UTC---
| Itration            | 1086      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.69e+05 |
| Running Env Steps   | 5430000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 4.09      |
| Running Update Time | 1086      |
-----------------------------------
--2024-08-12 13:38:17.884492 UTC--
| Itration            | 1087     |
| Real Det Return     | 4.02e+03 |
| Real Sto Return     | 5.34e+03 |
| Reward Loss         | 3.31e+05 |
| Running Env Steps   | 5435000  |
| Running Forward KL  | 6.53     |
| Running Reverse KL  | 38.3     |
| Running Update Time | 1087     |
----------------------------------
--2024-08-12 13:39:54.118013 UTC---
| Itration            | 1088      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -4.63e+05 |
| Running Env Steps   | 5440000   |
| Running Forward KL  | 6.4       |
| Running Reverse KL  | 4.05      |
| Running Update Time | 1088      |
-----------------------------------
--2024-08-12 13:41:28.206158 UTC---
| Itration            | 1089      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.26e+05 |
| Running Env Steps   | 5445000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 1089      |
-----------------------------------
--2024-08-12 13:42:57.796051 UTC---
| Itration            | 1090      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -3.65e+05 |
| Running Env Steps   | 5450000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 4.42      |
| Running Update Time | 1090      |
-----------------------------------
--2024-08-12 13:44:34.177754 UTC---
| Itration            | 1091      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -5.08e+05 |
| Running Env Steps   | 5455000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 4.44      |
| Running Update Time | 1091      |
-----------------------------------
--2024-08-12 13:46:05.814584 UTC--
| Itration            | 1092     |
| Real Det Return     | 5.83e+03 |
| Real Sto Return     | 5.44e+03 |
| Reward Loss         | 1.03e+05 |
| Running Env Steps   | 5460000  |
| Running Forward KL  | 7.08     |
| Running Reverse KL  | 35.4     |
| Running Update Time | 1092     |
----------------------------------
--2024-08-12 13:47:39.512384 UTC---
| Itration            | 1093      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -7.89e+05 |
| Running Env Steps   | 5465000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.64      |
| Running Update Time | 1093      |
-----------------------------------
--2024-08-12 13:49:12.680049 UTC---
| Itration            | 1094      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -3.61e+05 |
| Running Env Steps   | 5470000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 1094      |
-----------------------------------
--2024-08-12 13:50:43.305566 UTC---
| Itration            | 1095      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.95e+05 |
| Running Env Steps   | 5475000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 4.47      |
| Running Update Time | 1095      |
-----------------------------------
--2024-08-12 13:52:18.332231 UTC---
| Itration            | 1096      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -1.12e+04 |
| Running Env Steps   | 5480000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 4.89      |
| Running Update Time | 1096      |
-----------------------------------
--2024-08-12 13:53:52.530225 UTC---
| Itration            | 1097      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -9.02e+05 |
| Running Env Steps   | 5485000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 18.3      |
| Running Update Time | 1097      |
-----------------------------------
--2024-08-12 13:55:24.692543 UTC---
| Itration            | 1098      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -6.19e+05 |
| Running Env Steps   | 5490000   |
| Running Forward KL  | 6.72      |
| Running Reverse KL  | 4.52      |
| Running Update Time | 1098      |
-----------------------------------
--2024-08-12 13:56:59.277169 UTC---
| Itration            | 1099      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -6.86e+05 |
| Running Env Steps   | 5495000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 1099      |
-----------------------------------
--2024-08-12 13:58:30.867039 UTC--
| Itration            | 1100     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.59e+03 |
| Reward Loss         | 2.64e+04 |
| Running Env Steps   | 5500000  |
| Running Forward KL  | 6.12     |
| Running Reverse KL  | 4.29     |
| Running Update Time | 1100     |
----------------------------------
--2024-08-12 14:00:04.306033 UTC---
| Itration            | 1101      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -8.69e+05 |
| Running Env Steps   | 5505000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 39.5      |
| Running Update Time | 1101      |
-----------------------------------
--2024-08-12 14:01:41.011572 UTC--
| Itration            | 1102     |
| Real Det Return     | 5.59e+03 |
| Real Sto Return     | 5.43e+03 |
| Reward Loss         | -6.8e+05 |
| Running Env Steps   | 5510000  |
| Running Forward KL  | 6.24     |
| Running Reverse KL  | 3.82     |
| Running Update Time | 1102     |
----------------------------------
--2024-08-12 14:03:15.387214 UTC---
| Itration            | 1103      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -2.98e+05 |
| Running Env Steps   | 5515000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 1103      |
-----------------------------------
--2024-08-12 14:04:51.743460 UTC--
| Itration            | 1104     |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | -3.5e+05 |
| Running Env Steps   | 5520000  |
| Running Forward KL  | 6.54     |
| Running Reverse KL  | 4.3      |
| Running Update Time | 1104     |
----------------------------------
--2024-08-12 14:06:27.522005 UTC---
| Itration            | 1105      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 5525000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 38        |
| Running Update Time | 1105      |
-----------------------------------
--2024-08-12 14:08:01.111095 UTC--
| Itration            | 1106     |
| Real Det Return     | 5.86e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 5.99e+05 |
| Running Env Steps   | 5530000  |
| Running Forward KL  | 6.61     |
| Running Reverse KL  | 5.13     |
| Running Update Time | 1106     |
----------------------------------
--2024-08-12 14:09:37.391885 UTC--
| Itration            | 1107     |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.38e+03 |
| Reward Loss         | -6.6e+05 |
| Running Env Steps   | 5535000  |
| Running Forward KL  | 5.95     |
| Running Reverse KL  | 4.1      |
| Running Update Time | 1107     |
----------------------------------
--2024-08-12 14:11:12.236089 UTC---
| Itration            | 1108      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -5.25e+05 |
| Running Env Steps   | 5540000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 22.4      |
| Running Update Time | 1108      |
-----------------------------------
--2024-08-12 14:12:46.125384 UTC---
| Itration            | 1109      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -7.43e+05 |
| Running Env Steps   | 5545000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 4.15      |
| Running Update Time | 1109      |
-----------------------------------
--2024-08-12 14:14:24.264127 UTC---
| Itration            | 1110      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -3.42e+05 |
| Running Env Steps   | 5550000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 4.21      |
| Running Update Time | 1110      |
-----------------------------------
--2024-08-12 14:15:58.203564 UTC---
| Itration            | 1111      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.09e+05 |
| Running Env Steps   | 5555000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 1111      |
-----------------------------------
--2024-08-12 14:17:30.281781 UTC--
| Itration            | 1112     |
| Real Det Return     | 6.01e+03 |
| Real Sto Return     | 4.79e+03 |
| Reward Loss         | 1.47e+05 |
| Running Env Steps   | 5560000  |
| Running Forward KL  | 7.87     |
| Running Reverse KL  | 122      |
| Running Update Time | 1112     |
----------------------------------
--2024-08-12 14:19:07.655186 UTC---
| Itration            | 1113      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -6.89e+05 |
| Running Env Steps   | 5565000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 4.01      |
| Running Update Time | 1113      |
-----------------------------------
--2024-08-12 14:20:40.379398 UTC---
| Itration            | 1114      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.99e+05 |
| Running Env Steps   | 5570000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 4.87      |
| Running Update Time | 1114      |
-----------------------------------
--2024-08-12 14:22:14.621371 UTC---
| Itration            | 1115      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -5.71e+05 |
| Running Env Steps   | 5575000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 1115      |
-----------------------------------
--2024-08-12 14:23:51.028494 UTC---
| Itration            | 1116      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.62e+03  |
| Reward Loss         | -1.93e+05 |
| Running Env Steps   | 5580000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 4.59      |
| Running Update Time | 1116      |
-----------------------------------
--2024-08-12 14:25:25.929153 UTC--
| Itration            | 1117     |
| Real Det Return     | 5.65e+03 |
| Real Sto Return     | 5.65e+03 |
| Reward Loss         | 1.29e+05 |
| Running Env Steps   | 5585000  |
| Running Forward KL  | 6.2      |
| Running Reverse KL  | 4.55     |
| Running Update Time | 1117     |
----------------------------------
--2024-08-12 14:27:01.438543 UTC--
| Itration            | 1118     |
| Real Det Return     | 5.74e+03 |
| Real Sto Return     | 5.15e+03 |
| Reward Loss         | 1.95e+05 |
| Running Env Steps   | 5590000  |
| Running Forward KL  | 6.82     |
| Running Reverse KL  | 31.8     |
| Running Update Time | 1118     |
----------------------------------
--2024-08-12 14:28:36.408924 UTC---
| Itration            | 1119      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -8.66e+05 |
| Running Env Steps   | 5595000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 4.35      |
| Running Update Time | 1119      |
-----------------------------------
--2024-08-12 14:30:09.882139 UTC---
| Itration            | 1120      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -3.77e+05 |
| Running Env Steps   | 5600000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 4.42      |
| Running Update Time | 1120      |
-----------------------------------
--2024-08-12 14:31:46.141108 UTC---
| Itration            | 1121      |
| Real Det Return     | 5.71e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -6.77e+05 |
| Running Env Steps   | 5605000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 68.3      |
| Running Update Time | 1121      |
-----------------------------------
--2024-08-12 14:33:20.869667 UTC--
| Itration            | 1122     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.68e+03 |
| Reward Loss         | 4.13e+04 |
| Running Env Steps   | 5610000  |
| Running Forward KL  | 6.52     |
| Running Reverse KL  | 4.92     |
| Running Update Time | 1122     |
----------------------------------
--2024-08-12 14:34:54.056164 UTC---
| Itration            | 1123      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -4.02e+05 |
| Running Env Steps   | 5615000   |
| Running Forward KL  | 6.89      |
| Running Reverse KL  | 4.72      |
| Running Update Time | 1123      |
-----------------------------------
--2024-08-12 14:36:29.742147 UTC---
| Itration            | 1124      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -5.53e+05 |
| Running Env Steps   | 5620000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 1124      |
-----------------------------------
--2024-08-12 14:38:04.435022 UTC---
| Itration            | 1125      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -6.45e+05 |
| Running Env Steps   | 5625000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 4.39      |
| Running Update Time | 1125      |
-----------------------------------
--2024-08-12 14:39:38.511116 UTC---
| Itration            | 1126      |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 5630000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 4.77      |
| Running Update Time | 1126      |
-----------------------------------
--2024-08-12 14:41:17.491024 UTC--
| Itration            | 1127     |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.33e+03 |
| Reward Loss         | -1.2e+06 |
| Running Env Steps   | 5635000  |
| Running Forward KL  | 7.03     |
| Running Reverse KL  | 17.2     |
| Running Update Time | 1127     |
----------------------------------
--2024-08-12 14:42:52.149290 UTC--
| Itration            | 1128     |
| Real Det Return     | 5.58e+03 |
| Real Sto Return     | 5.45e+03 |
| Reward Loss         | -4.3e+05 |
| Running Env Steps   | 5640000  |
| Running Forward KL  | 5.95     |
| Running Reverse KL  | 3.51     |
| Running Update Time | 1128     |
----------------------------------
--2024-08-12 14:44:26.344443 UTC---
| Itration            | 1129      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -2.48e+05 |
| Running Env Steps   | 5645000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 24.5      |
| Running Update Time | 1129      |
-----------------------------------
--2024-08-12 14:46:01.522934 UTC---
| Itration            | 1130      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.59e+03  |
| Reward Loss         | -1.46e+05 |
| Running Env Steps   | 5650000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 4.97      |
| Running Update Time | 1130      |
-----------------------------------
--2024-08-12 14:47:34.292161 UTC---
| Itration            | 1131      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -5.29e+05 |
| Running Env Steps   | 5655000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 1131      |
-----------------------------------
--2024-08-12 14:49:10.403526 UTC--
| Itration            | 1132     |
| Real Det Return     | 5.81e+03 |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | 2.31e+05 |
| Running Env Steps   | 5660000  |
| Running Forward KL  | 6.42     |
| Running Reverse KL  | 29.4     |
| Running Update Time | 1132     |
----------------------------------
--2024-08-12 14:50:46.697103 UTC---
| Itration            | 1133      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -4.64e+05 |
| Running Env Steps   | 5665000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 4.22      |
| Running Update Time | 1133      |
-----------------------------------
--2024-08-12 14:52:21.967978 UTC---
| Itration            | 1134      |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -9.01e+05 |
| Running Env Steps   | 5670000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 4.31      |
| Running Update Time | 1134      |
-----------------------------------
--2024-08-12 14:53:58.797964 UTC---
| Itration            | 1135      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -4.49e+05 |
| Running Env Steps   | 5675000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 4.99      |
| Running Update Time | 1135      |
-----------------------------------
--2024-08-12 14:55:34.618123 UTC---
| Itration            | 1136      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 5680000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.74      |
| Running Update Time | 1136      |
-----------------------------------
--2024-08-12 14:57:08.767171 UTC---
| Itration            | 1137      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -7.01e+05 |
| Running Env Steps   | 5685000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 1137      |
-----------------------------------
--2024-08-12 14:58:45.025998 UTC--
| Itration            | 1138     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.56e+03 |
| Reward Loss         | 1.15e+05 |
| Running Env Steps   | 5690000  |
| Running Forward KL  | 6.41     |
| Running Reverse KL  | 4.7      |
| Running Update Time | 1138     |
----------------------------------
--2024-08-12 15:00:20.631992 UTC--
| Itration            | 1139     |
| Real Det Return     | 5.84e+03 |
| Real Sto Return     | 5.68e+03 |
| Reward Loss         | 4.69e+04 |
| Running Env Steps   | 5695000  |
| Running Forward KL  | 7.15     |
| Running Reverse KL  | 4.92     |
| Running Update Time | 1139     |
----------------------------------
--2024-08-12 15:01:55.162972 UTC---
| Itration            | 1140      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -2.59e+04 |
| Running Env Steps   | 5700000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 32.4      |
| Running Update Time | 1140      |
-----------------------------------
--2024-08-12 15:03:31.098173 UTC--
| Itration            | 1141     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 5.63e+03 |
| Reward Loss         | 2.9e+04  |
| Running Env Steps   | 5705000  |
| Running Forward KL  | 6.21     |
| Running Reverse KL  | 4.59     |
| Running Update Time | 1141     |
----------------------------------
--2024-08-12 15:05:03.608029 UTC---
| Itration            | 1142      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -8.31e+05 |
| Running Env Steps   | 5710000   |
| Running Forward KL  | 6.92      |
| Running Reverse KL  | 4.65      |
| Running Update Time | 1142      |
-----------------------------------
--2024-08-12 15:06:38.007319 UTC--
| Itration            | 1143     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 5.61e+03 |
| Reward Loss         | 2.57e+03 |
| Running Env Steps   | 5715000  |
| Running Forward KL  | 6.07     |
| Running Reverse KL  | 11.5     |
| Running Update Time | 1143     |
----------------------------------
--2024-08-12 15:08:14.465443 UTC--
| Itration            | 1144     |
| Real Det Return     | 5.62e+03 |
| Real Sto Return     | 5.58e+03 |
| Reward Loss         | -2.4e+05 |
| Running Env Steps   | 5720000  |
| Running Forward KL  | 6.17     |
| Running Reverse KL  | 3.82     |
| Running Update Time | 1144     |
----------------------------------
--2024-08-12 15:09:49.811199 UTC---
| Itration            | 1145      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -5.47e+05 |
| Running Env Steps   | 5725000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 4.58      |
| Running Update Time | 1145      |
-----------------------------------
--2024-08-12 15:11:25.209362 UTC---
| Itration            | 1146      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -2.79e+05 |
| Running Env Steps   | 5730000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 3.46      |
| Running Update Time | 1146      |
-----------------------------------
--2024-08-12 15:13:02.966223 UTC---
| Itration            | 1147      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -2.56e+05 |
| Running Env Steps   | 5735000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.22      |
| Running Update Time | 1147      |
-----------------------------------
--2024-08-12 15:14:36.672859 UTC---
| Itration            | 1148      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.62e+03  |
| Reward Loss         | -9.31e+04 |
| Running Env Steps   | 5740000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1148      |
-----------------------------------
--2024-08-12 15:16:12.200028 UTC---
| Itration            | 1149      |
| Real Det Return     | 5.8e+03   |
| Real Sto Return     | 5.79e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 5745000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 30.9      |
| Running Update Time | 1149      |
-----------------------------------
--2024-08-12 15:17:48.845037 UTC--
| Itration            | 1150     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.64e+03 |
| Reward Loss         | 1.73e+04 |
| Running Env Steps   | 5750000  |
| Running Forward KL  | 5.89     |
| Running Reverse KL  | 3.61     |
| Running Update Time | 1150     |
----------------------------------
--2024-08-12 15:19:22.987831 UTC--
| Itration            | 1151     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.62e+03 |
| Reward Loss         | -2.3e+05 |
| Running Env Steps   | 5755000  |
| Running Forward KL  | 6.87     |
| Running Reverse KL  | 31.2     |
| Running Update Time | 1151     |
----------------------------------
--2024-08-12 15:20:56.329421 UTC---
| Itration            | 1152      |
| Real Det Return     | 5.73e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -5.82e+06 |
| Running Env Steps   | 5760000   |
| Running Forward KL  | 7.72      |
| Running Reverse KL  | 143       |
| Running Update Time | 1152      |
-----------------------------------
--2024-08-12 15:22:29.673235 UTC--
| Itration            | 1153     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.13e+03 |
| Reward Loss         | 7.21e+04 |
| Running Env Steps   | 5765000  |
| Running Forward KL  | 6.14     |
| Running Reverse KL  | 6.36     |
| Running Update Time | 1153     |
----------------------------------
--2024-08-12 15:24:00.232173 UTC---
| Itration            | 1154      |
| Real Det Return     | 5.78e+03  |
| Real Sto Return     | 4.39e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 5770000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 100       |
| Running Update Time | 1154      |
-----------------------------------
--2024-08-12 15:25:35.950666 UTC--
| Itration            | 1155     |
| Real Det Return     | 5.57e+03 |
| Real Sto Return     | 5.39e+03 |
| Reward Loss         | 3.78e+04 |
| Running Env Steps   | 5775000  |
| Running Forward KL  | 6.12     |
| Running Reverse KL  | 3.8      |
| Running Update Time | 1155     |
----------------------------------
--2024-08-12 15:27:10.779059 UTC---
| Itration            | 1156      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -4.75e+05 |
| Running Env Steps   | 5780000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 1156      |
-----------------------------------
--2024-08-12 15:28:44.824126 UTC---
| Itration            | 1157      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.73e+05 |
| Running Env Steps   | 5785000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 1157      |
-----------------------------------
--2024-08-12 15:30:21.604914 UTC--
| Itration            | 1158     |
| Real Det Return     | 5.88e+03 |
| Real Sto Return     | 5.42e+03 |
| Reward Loss         | 6.5e+05  |
| Running Env Steps   | 5790000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 9.55     |
| Running Update Time | 1158     |
----------------------------------
--2024-08-12 15:31:54.480504 UTC--
| Itration            | 1159     |
| Real Det Return     | 5.75e+03 |
| Real Sto Return     | 5.54e+03 |
| Reward Loss         | 2.2e+05  |
| Running Env Steps   | 5795000  |
| Running Forward KL  | 6.33     |
| Running Reverse KL  | 5.65     |
| Running Update Time | 1159     |
----------------------------------
--2024-08-12 15:33:29.948345 UTC--
| Itration            | 1160     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | 2.61e+05 |
| Running Env Steps   | 5800000  |
| Running Forward KL  | 6.23     |
| Running Reverse KL  | 8.11     |
| Running Update Time | 1160     |
----------------------------------
--2024-08-12 15:35:06.146740 UTC---
| Itration            | 1161      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -2.86e+05 |
| Running Env Steps   | 5805000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 3.81      |
| Running Update Time | 1161      |
-----------------------------------
--2024-08-12 15:36:40.213834 UTC--
| Itration            | 1162     |
| Real Det Return     | 5.84e+03 |
| Real Sto Return     | 5.78e+03 |
| Reward Loss         | 8.42e+05 |
| Running Env Steps   | 5810000  |
| Running Forward KL  | 6.47     |
| Running Reverse KL  | 4.44     |
| Running Update Time | 1162     |
----------------------------------
--2024-08-12 15:38:14.118957 UTC---
| Itration            | 1163      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -1.51e+05 |
| Running Env Steps   | 5815000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 25.5      |
| Running Update Time | 1163      |
-----------------------------------
--2024-08-12 15:39:48.897187 UTC---
| Itration            | 1164      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.64e+05 |
| Running Env Steps   | 5820000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 4.1       |
| Running Update Time | 1164      |
-----------------------------------
--2024-08-12 15:41:22.651621 UTC---
| Itration            | 1165      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -2.24e+05 |
| Running Env Steps   | 5825000   |
| Running Forward KL  | 5.89      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 1165      |
-----------------------------------
--2024-08-12 15:42:59.919310 UTC---
| Itration            | 1166      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -6.85e+04 |
| Running Env Steps   | 5830000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 1166      |
-----------------------------------
--2024-08-12 15:44:34.179687 UTC---
| Itration            | 1167      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -3.52e+05 |
| Running Env Steps   | 5835000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 1167      |
-----------------------------------
--2024-08-12 15:46:06.633767 UTC---
| Itration            | 1168      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -6.34e+05 |
| Running Env Steps   | 5840000   |
| Running Forward KL  | 6.4       |
| Running Reverse KL  | 3.97      |
| Running Update Time | 1168      |
-----------------------------------
--2024-08-12 15:47:44.153603 UTC---
| Itration            | 1169      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -4.31e+05 |
| Running Env Steps   | 5845000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 18.6      |
| Running Update Time | 1169      |
-----------------------------------
--2024-08-12 15:49:18.415882 UTC---
| Itration            | 1170      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.65e+03  |
| Reward Loss         | -1.94e+05 |
| Running Env Steps   | 5850000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.02      |
| Running Update Time | 1170      |
-----------------------------------
--2024-08-12 15:50:53.458783 UTC---
| Itration            | 1171      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -4.51e+05 |
| Running Env Steps   | 5855000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 1171      |
-----------------------------------
--2024-08-12 15:52:35.169241 UTC---
| Itration            | 1172      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -2.76e+05 |
| Running Env Steps   | 5860000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 42.3      |
| Running Update Time | 1172      |
-----------------------------------
--2024-08-12 15:54:07.556363 UTC---
| Itration            | 1173      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -1.05e+05 |
| Running Env Steps   | 5865000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 4.02      |
| Running Update Time | 1173      |
-----------------------------------
--2024-08-12 15:55:43.868201 UTC--
| Itration            | 1174     |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.43e+03 |
| Reward Loss         | -5.6e+05 |
| Running Env Steps   | 5870000  |
| Running Forward KL  | 5.73     |
| Running Reverse KL  | 3.64     |
| Running Update Time | 1174     |
----------------------------------
--2024-08-12 15:57:19.357569 UTC---
| Itration            | 1175      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -1.89e+05 |
| Running Env Steps   | 5875000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 1175      |
-----------------------------------
--2024-08-12 15:58:51.619175 UTC---
| Itration            | 1176      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -7.06e+05 |
| Running Env Steps   | 5880000   |
| Running Forward KL  | 7.27      |
| Running Reverse KL  | 53.7      |
| Running Update Time | 1176      |
-----------------------------------
--2024-08-12 16:00:26.677377 UTC---
| Itration            | 1177      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -6.26e+05 |
| Running Env Steps   | 5885000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 45.7      |
| Running Update Time | 1177      |
-----------------------------------
--2024-08-12 16:02:00.430625 UTC--
| Itration            | 1178     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.48e+03 |
| Reward Loss         | 9.7e+04  |
| Running Env Steps   | 5890000  |
| Running Forward KL  | 6.63     |
| Running Reverse KL  | 11       |
| Running Update Time | 1178     |
----------------------------------
--2024-08-12 16:03:32.414666 UTC--
| Itration            | 1179     |
| Real Det Return     | 5.62e+03 |
| Real Sto Return     | 5e+03    |
| Reward Loss         | 1.77e+05 |
| Running Env Steps   | 5895000  |
| Running Forward KL  | 6.84     |
| Running Reverse KL  | 36.1     |
| Running Update Time | 1179     |
----------------------------------
--2024-08-12 16:05:09.152677 UTC--
| Itration            | 1180     |
| Real Det Return     | 5.81e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | 6.86e+05 |
| Running Env Steps   | 5900000  |
| Running Forward KL  | 6.37     |
| Running Reverse KL  | 28.9     |
| Running Update Time | 1180     |
----------------------------------
--2024-08-12 16:06:42.217974 UTC---
| Itration            | 1181      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -3.49e+05 |
| Running Env Steps   | 5905000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 14.7      |
| Running Update Time | 1181      |
-----------------------------------
--2024-08-12 16:08:17.343773 UTC--
| Itration            | 1182     |
| Real Det Return     | 5.6e+03  |
| Real Sto Return     | 5.54e+03 |
| Reward Loss         | -1.7e+05 |
| Running Env Steps   | 5910000  |
| Running Forward KL  | 6.04     |
| Running Reverse KL  | 3.56     |
| Running Update Time | 1182     |
----------------------------------
--2024-08-12 16:09:49.441987 UTC--
| Itration            | 1183     |
| Real Det Return     | 5.6e+03  |
| Real Sto Return     | 4.04e+03 |
| Reward Loss         | -1.9e+06 |
| Running Env Steps   | 5915000  |
| Running Forward KL  | 6.91     |
| Running Reverse KL  | 133      |
| Running Update Time | 1183     |
----------------------------------
--2024-08-12 16:11:23.740035 UTC--
| Itration            | 1184     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 5.58e+03 |
| Reward Loss         | 1.22e+05 |
| Running Env Steps   | 5920000  |
| Running Forward KL  | 6.1      |
| Running Reverse KL  | 4.46     |
| Running Update Time | 1184     |
----------------------------------
--2024-08-12 16:13:00.277305 UTC--
| Itration            | 1185     |
| Real Det Return     | 5.79e+03 |
| Real Sto Return     | 5.76e+03 |
| Reward Loss         | 5.78e+05 |
| Running Env Steps   | 5925000  |
| Running Forward KL  | 6.69     |
| Running Reverse KL  | 4.95     |
| Running Update Time | 1185     |
----------------------------------
--2024-08-12 16:14:34.800620 UTC---
| Itration            | 1186      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.96e+05 |
| Running Env Steps   | 5930000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 1186      |
-----------------------------------
--2024-08-12 16:16:07.817360 UTC--
| Itration            | 1187     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.24e+03 |
| Reward Loss         | 1.27e+05 |
| Running Env Steps   | 5935000  |
| Running Forward KL  | 6.58     |
| Running Reverse KL  | 3.85     |
| Running Update Time | 1187     |
----------------------------------
--2024-08-12 16:17:44.098234 UTC---
| Itration            | 1188      |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -9.28e+05 |
| Running Env Steps   | 5940000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 32.2      |
| Running Update Time | 1188      |
-----------------------------------
--2024-08-12 16:19:17.845449 UTC---
| Itration            | 1189      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -3.38e+05 |
| Running Env Steps   | 5945000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 4.35      |
| Running Update Time | 1189      |
-----------------------------------
--2024-08-12 16:20:49.141043 UTC--
| Itration            | 1190     |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | 3.08e+04 |
| Running Env Steps   | 5950000  |
| Running Forward KL  | 6.56     |
| Running Reverse KL  | 4.65     |
| Running Update Time | 1190     |
----------------------------------
--2024-08-12 16:22:27.556003 UTC---
| Itration            | 1191      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -2.94e+05 |
| Running Env Steps   | 5955000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 3.9       |
| Running Update Time | 1191      |
-----------------------------------
--2024-08-12 16:24:03.088024 UTC---
| Itration            | 1192      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -7.77e+05 |
| Running Env Steps   | 5960000   |
| Running Forward KL  | 7.03      |
| Running Reverse KL  | 4.77      |
| Running Update Time | 1192      |
-----------------------------------
--2024-08-12 16:25:36.714091 UTC--
| Itration            | 1193     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.46e+03 |
| Reward Loss         | 4.63e+05 |
| Running Env Steps   | 5965000  |
| Running Forward KL  | 6.42     |
| Running Reverse KL  | 13.3     |
| Running Update Time | 1193     |
----------------------------------
--2024-08-12 16:27:13.996096 UTC---
| Itration            | 1194      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 5970000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 52.1      |
| Running Update Time | 1194      |
-----------------------------------
--2024-08-12 16:28:47.642501 UTC---
| Itration            | 1195      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 5975000   |
| Running Forward KL  | 6.99      |
| Running Reverse KL  | 39.9      |
| Running Update Time | 1195      |
-----------------------------------
--2024-08-12 16:30:23.568226 UTC--
| Itration            | 1196     |
| Real Det Return     | 5.85e+03 |
| Real Sto Return     | 5.77e+03 |
| Reward Loss         | 5.61e+05 |
| Running Env Steps   | 5980000  |
| Running Forward KL  | 6.41     |
| Running Reverse KL  | 4.39     |
| Running Update Time | 1196     |
----------------------------------
--2024-08-12 16:31:59.690998 UTC---
| Itration            | 1197      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -2.35e+05 |
| Running Env Steps   | 5985000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 1197      |
-----------------------------------
--2024-08-12 16:33:34.176758 UTC--
| Itration            | 1198     |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 3.53e+04 |
| Running Env Steps   | 5990000  |
| Running Forward KL  | 6.88     |
| Running Reverse KL  | 4.69     |
| Running Update Time | 1198     |
----------------------------------
--2024-08-12 16:35:11.180040 UTC---
| Itration            | 1199      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -3.83e+05 |
| Running Env Steps   | 5995000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 4.91      |
| Running Update Time | 1199      |
-----------------------------------
--2024-08-12 16:36:46.466115 UTC---
| Itration            | 1200      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -5.58e+05 |
| Running Env Steps   | 6000000   |
| Running Forward KL  | 6.85      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 1200      |
-----------------------------------
--2024-08-12 16:38:21.742731 UTC---
| Itration            | 1201      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -1.47e+05 |
| Running Env Steps   | 6005000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 4.12      |
| Running Update Time | 1201      |
-----------------------------------
--2024-08-12 16:39:59.217922 UTC---
| Itration            | 1202      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 6010000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 34.6      |
| Running Update Time | 1202      |
-----------------------------------
--2024-08-12 16:41:33.364478 UTC---
| Itration            | 1203      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -9.05e+04 |
| Running Env Steps   | 6015000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 36.2      |
| Running Update Time | 1203      |
-----------------------------------
--2024-08-12 16:43:08.143722 UTC---
| Itration            | 1204      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.61e+03  |
| Reward Loss         | -1.54e+04 |
| Running Env Steps   | 6020000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 3.9       |
| Running Update Time | 1204      |
-----------------------------------
--2024-08-12 16:44:42.755757 UTC---
| Itration            | 1205      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 6025000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 106       |
| Running Update Time | 1205      |
-----------------------------------
--2024-08-12 16:46:16.951718 UTC---
| Itration            | 1206      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.11e+05 |
| Running Env Steps   | 6030000   |
| Running Forward KL  | 6.31      |
| Running Reverse KL  | 4.47      |
| Running Update Time | 1206      |
-----------------------------------
--2024-08-12 16:47:51.050696 UTC--
| Itration            | 1207     |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 4.79e+03 |
| Reward Loss         | 1.08e+04 |
| Running Env Steps   | 6035000  |
| Running Forward KL  | 6.15     |
| Running Reverse KL  | 31.2     |
| Running Update Time | 1207     |
----------------------------------
--2024-08-12 16:49:40.954584 UTC--
| Itration            | 1208     |
| Real Det Return     | 5.59e+03 |
| Real Sto Return     | 5.55e+03 |
| Reward Loss         | -2.7e+06 |
| Running Env Steps   | 6040000  |
| Running Forward KL  | 6.66     |
| Running Reverse KL  | 37.3     |
| Running Update Time | 1208     |
----------------------------------
--2024-08-12 16:51:33.459656 UTC--
| Itration            | 1209     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 5.59e+03 |
| Reward Loss         | 9.23e+04 |
| Running Env Steps   | 6045000  |
| Running Forward KL  | 5.91     |
| Running Reverse KL  | 3.96     |
| Running Update Time | 1209     |
----------------------------------
--2024-08-12 16:53:25.095217 UTC--
| Itration            | 1210     |
| Real Det Return     | 5.44e+03 |
| Real Sto Return     | 4.44e+03 |
| Reward Loss         | -4.4e+06 |
| Running Env Steps   | 6050000  |
| Running Forward KL  | 7.16     |
| Running Reverse KL  | 91.4     |
| Running Update Time | 1210     |
----------------------------------
--2024-08-12 16:55:16.893908 UTC---
| Itration            | 1211      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.26e+05 |
| Running Env Steps   | 6055000   |
| Running Forward KL  | 6.85      |
| Running Reverse KL  | 40.5      |
| Running Update Time | 1211      |
-----------------------------------
--2024-08-12 16:57:05.946240 UTC---
| Itration            | 1212      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -6.61e+05 |
| Running Env Steps   | 6060000   |
| Running Forward KL  | 6.78      |
| Running Reverse KL  | 56.6      |
| Running Update Time | 1212      |
-----------------------------------
--2024-08-12 16:58:55.907338 UTC---
| Itration            | 1213      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.14e+05 |
| Running Env Steps   | 6065000   |
| Running Forward KL  | 5.88      |
| Running Reverse KL  | 27.2      |
| Running Update Time | 1213      |
-----------------------------------
--2024-08-12 17:00:40.223364 UTC--
| Itration            | 1214     |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 2.72e+04 |
| Running Env Steps   | 6070000  |
| Running Forward KL  | 6.36     |
| Running Reverse KL  | 4.28     |
| Running Update Time | 1214     |
----------------------------------
--2024-08-12 17:02:24.161244 UTC---
| Itration            | 1215      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.74e+05 |
| Running Env Steps   | 6075000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1215      |
-----------------------------------
--2024-08-12 17:04:14.343017 UTC--
| Itration            | 1216     |
| Real Det Return     | 5.61e+03 |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | -2.7e+05 |
| Running Env Steps   | 6080000  |
| Running Forward KL  | 6.92     |
| Running Reverse KL  | 23.4     |
| Running Update Time | 1216     |
----------------------------------
--2024-08-12 17:05:53.472980 UTC---
| Itration            | 1217      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -3.71e+05 |
| Running Env Steps   | 6085000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 18.6      |
| Running Update Time | 1217      |
-----------------------------------
--2024-08-12 17:07:23.906773 UTC--
| Itration            | 1218     |
| Real Det Return     | 5.69e+03 |
| Real Sto Return     | 3.68e+03 |
| Reward Loss         | 5.19e+05 |
| Running Env Steps   | 6090000  |
| Running Forward KL  | 7.13     |
| Running Reverse KL  | 36.6     |
| Running Update Time | 1218     |
----------------------------------
--2024-08-12 17:09:02.381790 UTC---
| Itration            | 1219      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -2.34e+05 |
| Running Env Steps   | 6095000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 1219      |
-----------------------------------
--2024-08-12 17:10:49.868754 UTC---
| Itration            | 1220      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -5.84e+04 |
| Running Env Steps   | 6100000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 7.84      |
| Running Update Time | 1220      |
-----------------------------------
--2024-08-12 17:12:39.165392 UTC---
| Itration            | 1221      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.61e+03  |
| Reward Loss         | -2.41e+04 |
| Running Env Steps   | 6105000   |
| Running Forward KL  | 6.43      |
| Running Reverse KL  | 10.4      |
| Running Update Time | 1221      |
-----------------------------------
--2024-08-12 17:14:18.920588 UTC--
| Itration            | 1222     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 4.32e+03 |
| Reward Loss         | 2.09e+05 |
| Running Env Steps   | 6110000  |
| Running Forward KL  | 6.16     |
| Running Reverse KL  | 3.98     |
| Running Update Time | 1222     |
----------------------------------
--2024-08-12 17:16:03.142629 UTC--
| Itration            | 1223     |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.32e+03 |
| Reward Loss         | 6.8e+04  |
| Running Env Steps   | 6115000  |
| Running Forward KL  | 6.37     |
| Running Reverse KL  | 8.75     |
| Running Update Time | 1223     |
----------------------------------
--2024-08-12 17:18:00.871050 UTC---
| Itration            | 1224      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -1.84e+05 |
| Running Env Steps   | 6120000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 4.16      |
| Running Update Time | 1224      |
-----------------------------------
--2024-08-12 17:19:45.611316 UTC---
| Itration            | 1225      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.59e+03  |
| Reward Loss         | -7.52e+04 |
| Running Env Steps   | 6125000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 1225      |
-----------------------------------
--2024-08-12 17:21:32.235626 UTC--
| Itration            | 1226     |
| Real Det Return     | 5.69e+03 |
| Real Sto Return     | 5.64e+03 |
| Reward Loss         | 1.52e+05 |
| Running Env Steps   | 6130000  |
| Running Forward KL  | 6.25     |
| Running Reverse KL  | 4.29     |
| Running Update Time | 1226     |
----------------------------------
--2024-08-12 17:23:25.636568 UTC--
| Itration            | 1227     |
| Real Det Return     | 5.82e+03 |
| Real Sto Return     | 4.87e+03 |
| Reward Loss         | 3.63e+05 |
| Running Env Steps   | 6135000  |
| Running Forward KL  | 6.9      |
| Running Reverse KL  | 22.8     |
| Running Update Time | 1227     |
----------------------------------
--2024-08-12 17:25:18.389960 UTC---
| Itration            | 1228      |
| Real Det Return     | 5.75e+03  |
| Real Sto Return     | 5.63e+03  |
| Reward Loss         | -2.66e+04 |
| Running Env Steps   | 6140000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 4.59      |
| Running Update Time | 1228      |
-----------------------------------
--2024-08-12 17:27:04.172734 UTC--
| Itration            | 1229     |
| Real Det Return     | 5.88e+03 |
| Real Sto Return     | 4.38e+03 |
| Reward Loss         | 5.82e+05 |
| Running Env Steps   | 6145000  |
| Running Forward KL  | 6.66     |
| Running Reverse KL  | 27       |
| Running Update Time | 1229     |
----------------------------------
--2024-08-12 17:28:54.996920 UTC--
| Itration            | 1230     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.63e+03 |
| Reward Loss         | 2.17e+05 |
| Running Env Steps   | 6150000  |
| Running Forward KL  | 6.52     |
| Running Reverse KL  | 4.02     |
| Running Update Time | 1230     |
----------------------------------
--2024-08-12 17:30:36.094843 UTC---
| Itration            | 1231      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -1.68e+05 |
| Running Env Steps   | 6155000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 20        |
| Running Update Time | 1231      |
-----------------------------------
--2024-08-12 17:32:09.943266 UTC--
| Itration            | 1232     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 4.75e+03 |
| Reward Loss         | 2.19e+03 |
| Running Env Steps   | 6160000  |
| Running Forward KL  | 6.45     |
| Running Reverse KL  | 42       |
| Running Update Time | 1232     |
----------------------------------
--2024-08-12 17:33:44.413822 UTC---
| Itration            | 1233      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -8.13e+04 |
| Running Env Steps   | 6165000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 3.7       |
| Running Update Time | 1233      |
-----------------------------------
--2024-08-12 17:35:19.036450 UTC---
| Itration            | 1234      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -1.33e+05 |
| Running Env Steps   | 6170000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.73      |
| Running Update Time | 1234      |
-----------------------------------
--2024-08-12 17:36:55.296607 UTC---
| Itration            | 1235      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -4.64e+05 |
| Running Env Steps   | 6175000   |
| Running Forward KL  | 6.46      |
| Running Reverse KL  | 27.5      |
| Running Update Time | 1235      |
-----------------------------------
--2024-08-12 17:38:29.076032 UTC--
| Itration            | 1236     |
| Real Det Return     | 5.8e+03  |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | 4.03e+05 |
| Running Env Steps   | 6180000  |
| Running Forward KL  | 6.75     |
| Running Reverse KL  | 22.5     |
| Running Update Time | 1236     |
----------------------------------
--2024-08-12 17:40:01.136432 UTC--
| Itration            | 1237     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.65e+03 |
| Reward Loss         | 2.55e+05 |
| Running Env Steps   | 6185000  |
| Running Forward KL  | 6.27     |
| Running Reverse KL  | 3.91     |
| Running Update Time | 1237     |
----------------------------------
--2024-08-12 17:41:36.020570 UTC---
| Itration            | 1238      |
| Real Det Return     | 5.75e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -2.59e+05 |
| Running Env Steps   | 6190000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 46.4      |
| Running Update Time | 1238      |
-----------------------------------
--2024-08-12 17:43:08.209622 UTC--
| Itration            | 1239     |
| Real Det Return     | 5.79e+03 |
| Real Sto Return     | 5.68e+03 |
| Reward Loss         | 5.29e+05 |
| Running Env Steps   | 6195000  |
| Running Forward KL  | 5.87     |
| Running Reverse KL  | 3.74     |
| Running Update Time | 1239     |
----------------------------------
--2024-08-12 17:44:45.048780 UTC---
| Itration            | 1240      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 6200000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 40.9      |
| Running Update Time | 1240      |
-----------------------------------
--2024-08-12 17:46:20.992023 UTC--
| Itration            | 1241     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 1.12e+05 |
| Running Env Steps   | 6205000  |
| Running Forward KL  | 5.95     |
| Running Reverse KL  | 4.08     |
| Running Update Time | 1241     |
----------------------------------
--2024-08-12 17:47:49.823596 UTC---
| Itration            | 1242      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 4.37e+03  |
| Reward Loss         | -3.97e+05 |
| Running Env Steps   | 6210000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 56.8      |
| Running Update Time | 1242      |
-----------------------------------
--2024-08-12 17:49:26.738473 UTC--
| Itration            | 1243     |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.02e+03 |
| Reward Loss         | 2.42e+05 |
| Running Env Steps   | 6215000  |
| Running Forward KL  | 6.07     |
| Running Reverse KL  | 4.09     |
| Running Update Time | 1243     |
----------------------------------
--2024-08-12 17:51:01.280178 UTC--
| Itration            | 1244     |
| Real Det Return     | 5.84e+03 |
| Real Sto Return     | 5.69e+03 |
| Reward Loss         | 5.87e+05 |
| Running Env Steps   | 6220000  |
| Running Forward KL  | 6.45     |
| Running Reverse KL  | 4.53     |
| Running Update Time | 1244     |
----------------------------------
--2024-08-12 17:52:33.689420 UTC---
| Itration            | 1245      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -3.46e+05 |
| Running Env Steps   | 6225000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 1245      |
-----------------------------------
--2024-08-12 17:54:11.853579 UTC---
| Itration            | 1246      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.59e+03  |
| Reward Loss         | -4.08e+04 |
| Running Env Steps   | 6230000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 8.73      |
| Running Update Time | 1246      |
-----------------------------------
--2024-08-12 17:55:46.431720 UTC--
| Itration            | 1247     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.65e+03 |
| Reward Loss         | 1.42e+05 |
| Running Env Steps   | 6235000  |
| Running Forward KL  | 6.1      |
| Running Reverse KL  | 4.61     |
| Running Update Time | 1247     |
----------------------------------
--2024-08-12 17:57:21.739308 UTC--
| Itration            | 1248     |
| Real Det Return     | 5.8e+03  |
| Real Sto Return     | 5.69e+03 |
| Reward Loss         | 2.87e+05 |
| Running Env Steps   | 6240000  |
| Running Forward KL  | 6.25     |
| Running Reverse KL  | 4.12     |
| Running Update Time | 1248     |
----------------------------------
--2024-08-12 17:58:56.683869 UTC---
| Itration            | 1249      |
| Real Det Return     | 5.79e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -7.08e+04 |
| Running Env Steps   | 6245000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 20.1      |
| Running Update Time | 1249      |
-----------------------------------
--2024-08-12 18:00:28.923085 UTC---
| Itration            | 1250      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -1.88e+05 |
| Running Env Steps   | 6250000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 3.98      |
| Running Update Time | 1250      |
-----------------------------------
--2024-08-12 18:02:04.431257 UTC---
| Itration            | 1251      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -6.63e+04 |
| Running Env Steps   | 6255000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1251      |
-----------------------------------
--2024-08-12 18:03:37.726550 UTC--
| Itration            | 1252     |
| Real Det Return     | 5.82e+03 |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | 5.1e+05  |
| Running Env Steps   | 6260000  |
| Running Forward KL  | 6.09     |
| Running Reverse KL  | 3.99     |
| Running Update Time | 1252     |
----------------------------------
--2024-08-12 18:05:09.877940 UTC--
| Itration            | 1253     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.31e+03 |
| Reward Loss         | 1.21e+05 |
| Running Env Steps   | 6265000  |
| Running Forward KL  | 5.76     |
| Running Reverse KL  | 3.89     |
| Running Update Time | 1253     |
----------------------------------
--2024-08-12 18:06:44.730455 UTC--
| Itration            | 1254     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | 1.22e+05 |
| Running Env Steps   | 6270000  |
| Running Forward KL  | 6.1      |
| Running Reverse KL  | 4.1      |
| Running Update Time | 1254     |
----------------------------------
--2024-08-12 18:08:17.425107 UTC--
| Itration            | 1255     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 1.74e+05 |
| Running Env Steps   | 6275000  |
| Running Forward KL  | 6.08     |
| Running Reverse KL  | 4.27     |
| Running Update Time | 1255     |
----------------------------------
--2024-08-12 18:09:50.700606 UTC--
| Itration            | 1256     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.65e+03 |
| Reward Loss         | 2.88e+05 |
| Running Env Steps   | 6280000  |
| Running Forward KL  | 6.05     |
| Running Reverse KL  | 25.8     |
| Running Update Time | 1256     |
----------------------------------
--2024-08-12 18:11:26.426816 UTC--
| Itration            | 1257     |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | 8.53e+04 |
| Running Env Steps   | 6285000  |
| Running Forward KL  | 5.74     |
| Running Reverse KL  | 3.45     |
| Running Update Time | 1257     |
----------------------------------
--2024-08-12 18:13:00.028773 UTC--
| Itration            | 1258     |
| Real Det Return     | 5.8e+03  |
| Real Sto Return     | 5.46e+03 |
| Reward Loss         | 1.88e+05 |
| Running Env Steps   | 6290000  |
| Running Forward KL  | 5.62     |
| Running Reverse KL  | 3.56     |
| Running Update Time | 1258     |
----------------------------------
--2024-08-12 18:14:34.276399 UTC---
| Itration            | 1259      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -5.95e+04 |
| Running Env Steps   | 6295000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 7.37      |
| Running Update Time | 1259      |
-----------------------------------
--2024-08-12 18:16:10.669320 UTC--
| Itration            | 1260     |
| Real Det Return     | 5.69e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 3.51e+05 |
| Running Env Steps   | 6300000  |
| Running Forward KL  | 5.62     |
| Running Reverse KL  | 3.33     |
| Running Update Time | 1260     |
----------------------------------
--2024-08-12 18:17:44.525079 UTC---
| Itration            | 1261      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -2.25e+05 |
| Running Env Steps   | 6305000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 4.33      |
| Running Update Time | 1261      |
-----------------------------------
--2024-08-12 18:19:19.304352 UTC--
| Itration            | 1262     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.57e+03 |
| Reward Loss         | 9.36e+04 |
| Running Env Steps   | 6310000  |
| Running Forward KL  | 5.99     |
| Running Reverse KL  | 3.8      |
| Running Update Time | 1262     |
----------------------------------
--2024-08-12 18:20:54.027728 UTC---
| Itration            | 1263      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -2.91e+05 |
| Running Env Steps   | 6315000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 1263      |
-----------------------------------
--2024-08-12 18:22:27.694103 UTC---
| Itration            | 1264      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.64e+03  |
| Reward Loss         | -1.35e+05 |
| Running Env Steps   | 6320000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 4.07      |
| Running Update Time | 1264      |
-----------------------------------
--2024-08-12 18:24:04.941437 UTC---
| Itration            | 1265      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -2.44e+05 |
| Running Env Steps   | 6325000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 4.23      |
| Running Update Time | 1265      |
-----------------------------------
--2024-08-12 18:25:39.346917 UTC--
| Itration            | 1266     |
| Real Det Return     | 5.61e+03 |
| Real Sto Return     | 5.54e+03 |
| Reward Loss         | 4.29e+04 |
| Running Env Steps   | 6330000  |
| Running Forward KL  | 5.56     |
| Running Reverse KL  | 3.43     |
| Running Update Time | 1266     |
----------------------------------
--2024-08-12 18:27:12.540938 UTC---
| Itration            | 1267      |
| Real Det Return     | 5.73e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -5.27e+04 |
| Running Env Steps   | 6335000   |
| Running Forward KL  | 5.53      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 1267      |
-----------------------------------
--2024-08-12 18:28:48.465764 UTC---
| Itration            | 1268      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -5.64e+05 |
| Running Env Steps   | 6340000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 38.8      |
| Running Update Time | 1268      |
-----------------------------------
--2024-08-12 18:30:22.128185 UTC---
| Itration            | 1269      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -3.87e+04 |
| Running Env Steps   | 6345000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 15.4      |
| Running Update Time | 1269      |
-----------------------------------
--2024-08-12 18:31:57.024164 UTC---
| Itration            | 1270      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -2.17e+04 |
| Running Env Steps   | 6350000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 3.81      |
| Running Update Time | 1270      |
-----------------------------------
--2024-08-12 18:33:33.942812 UTC---
| Itration            | 1271      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 6355000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 52.2      |
| Running Update Time | 1271      |
-----------------------------------
--2024-08-12 18:35:06.358426 UTC--
| Itration            | 1272     |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.36e+03 |
| Reward Loss         | 5.99e+04 |
| Running Env Steps   | 6360000  |
| Running Forward KL  | 6.11     |
| Running Reverse KL  | 25.9     |
| Running Update Time | 1272     |
----------------------------------
--2024-08-12 18:36:42.820733 UTC---
| Itration            | 1273      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.59e+03  |
| Reward Loss         | -9.22e+04 |
| Running Env Steps   | 6365000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 1273      |
-----------------------------------
--2024-08-12 18:38:17.833173 UTC---
| Itration            | 1274      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -8.97e+04 |
| Running Env Steps   | 6370000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 1274      |
-----------------------------------
--2024-08-12 18:39:52.142782 UTC--
| Itration            | 1275     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.45e+03 |
| Reward Loss         | 5.17e+05 |
| Running Env Steps   | 6375000  |
| Running Forward KL  | 6.2      |
| Running Reverse KL  | 57.1     |
| Running Update Time | 1275     |
----------------------------------
--2024-08-12 18:41:26.959851 UTC---
| Itration            | 1276      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -2.63e+05 |
| Running Env Steps   | 6380000   |
| Running Forward KL  | 6.7       |
| Running Reverse KL  | 99.7      |
| Running Update Time | 1276      |
-----------------------------------
--2024-08-12 18:43:00.822051 UTC--
| Itration            | 1277     |
| Real Det Return     | 5.79e+03 |
| Real Sto Return     | 5.66e+03 |
| Reward Loss         | 1.89e+05 |
| Running Env Steps   | 6385000  |
| Running Forward KL  | 5.34     |
| Running Reverse KL  | 2.98     |
| Running Update Time | 1277     |
----------------------------------
--2024-08-12 18:44:36.663428 UTC--
| Itration            | 1278     |
| Real Det Return     | 5.77e+03 |
| Real Sto Return     | 5.66e+03 |
| Reward Loss         | 3.83e+05 |
| Running Env Steps   | 6390000  |
| Running Forward KL  | 5.78     |
| Running Reverse KL  | 4.07     |
| Running Update Time | 1278     |
----------------------------------
--2024-08-12 18:46:15.070473 UTC--
| Itration            | 1279     |
| Real Det Return     | 5.77e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 7.14e+04 |
| Running Env Steps   | 6395000  |
| Running Forward KL  | 6.13     |
| Running Reverse KL  | 3.74     |
| Running Update Time | 1279     |
----------------------------------
--2024-08-12 18:47:49.139558 UTC---
| Itration            | 1280      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.64e+03  |
| Reward Loss         | -2.53e+05 |
| Running Env Steps   | 6400000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 1280      |
-----------------------------------
--2024-08-12 18:49:22.210329 UTC--
| Itration            | 1281     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.6e+03  |
| Reward Loss         | 4.37e+04 |
| Running Env Steps   | 6405000  |
| Running Forward KL  | 5.53     |
| Running Reverse KL  | 3.65     |
| Running Update Time | 1281     |
----------------------------------
--2024-08-12 18:50:53.275800 UTC---
| Itration            | 1282      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 4.03e+03  |
| Reward Loss         | -6.52e+05 |
| Running Env Steps   | 6410000   |
| Running Forward KL  | 6.89      |
| Running Reverse KL  | 123       |
| Running Update Time | 1282      |
-----------------------------------
--2024-08-12 18:52:25.087125 UTC--
| Itration            | 1283     |
| Real Det Return     | 5.65e+03 |
| Real Sto Return     | 4.85e+03 |
| Reward Loss         | 4.97e+04 |
| Running Env Steps   | 6415000  |
| Running Forward KL  | 5.59     |
| Running Reverse KL  | 3.62     |
| Running Update Time | 1283     |
----------------------------------
--2024-08-12 18:54:00.517899 UTC--
| Itration            | 1284     |
| Real Det Return     | 5.82e+03 |
| Real Sto Return     | 5.47e+03 |
| Reward Loss         | 4.04e+05 |
| Running Env Steps   | 6420000  |
| Running Forward KL  | 6.14     |
| Running Reverse KL  | 26.9     |
| Running Update Time | 1284     |
----------------------------------
--2024-08-12 18:55:30.439876 UTC---
| Itration            | 1285      |
| Real Det Return     | 5.88e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -4.04e+05 |
| Running Env Steps   | 6425000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 85.8      |
| Running Update Time | 1285      |
-----------------------------------
--2024-08-12 18:57:03.443014 UTC---
| Itration            | 1286      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -5.43e+05 |
| Running Env Steps   | 6430000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 22.9      |
| Running Update Time | 1286      |
-----------------------------------
--2024-08-12 18:58:36.258215 UTC--
| Itration            | 1287     |
| Real Det Return     | 5.9e+03  |
| Real Sto Return     | 5.31e+03 |
| Reward Loss         | 1.06e+06 |
| Running Env Steps   | 6435000  |
| Running Forward KL  | 6.04     |
| Running Reverse KL  | 47.8     |
| Running Update Time | 1287     |
----------------------------------
--2024-08-12 19:00:09.190053 UTC--
| Itration            | 1288     |
| Real Det Return     | 5.84e+03 |
| Real Sto Return     | 5.67e+03 |
| Reward Loss         | 6.7e+04  |
| Running Env Steps   | 6440000  |
| Running Forward KL  | 5.99     |
| Running Reverse KL  | 34.2     |
| Running Update Time | 1288     |
----------------------------------
--2024-08-12 19:01:42.658410 UTC---
| Itration            | 1289      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -4.33e+05 |
| Running Env Steps   | 6445000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 3.24      |
| Running Update Time | 1289      |
-----------------------------------
--2024-08-12 19:03:15.631342 UTC---
| Itration            | 1290      |
| Real Det Return     | 5.79e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -2.46e+05 |
| Running Env Steps   | 6450000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 33.7      |
| Running Update Time | 1290      |
-----------------------------------
--2024-08-12 19:04:49.714189 UTC--
| Itration            | 1291     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.71e+03 |
| Reward Loss         | 3.02e+05 |
| Running Env Steps   | 6455000  |
| Running Forward KL  | 6.01     |
| Running Reverse KL  | 4.54     |
| Running Update Time | 1291     |
----------------------------------
--2024-08-12 19:06:26.786578 UTC---
| Itration            | 1292      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -3.35e+05 |
| Running Env Steps   | 6460000   |
| Running Forward KL  | 5.38      |
| Running Reverse KL  | 3.5       |
| Running Update Time | 1292      |
-----------------------------------
--2024-08-12 19:07:59.357033 UTC--
| Itration            | 1293     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.35e+03 |
| Reward Loss         | 8.65e+04 |
| Running Env Steps   | 6465000  |
| Running Forward KL  | 5.77     |
| Running Reverse KL  | 9.87     |
| Running Update Time | 1293     |
----------------------------------
--2024-08-12 19:09:34.028958 UTC--
| Itration            | 1294     |
| Real Det Return     | 5.85e+03 |
| Real Sto Return     | 5.69e+03 |
| Reward Loss         | 2.05e+05 |
| Running Env Steps   | 6470000  |
| Running Forward KL  | 5.88     |
| Running Reverse KL  | 3.17     |
| Running Update Time | 1294     |
----------------------------------
--2024-08-12 19:11:09.805763 UTC---
| Itration            | 1295      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -7.72e+05 |
| Running Env Steps   | 6475000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 28.9      |
| Running Update Time | 1295      |
-----------------------------------
--2024-08-12 19:12:44.238922 UTC--
| Itration            | 1296     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 5.58e+03 |
| Reward Loss         | -2e+05   |
| Running Env Steps   | 6480000  |
| Running Forward KL  | 5.54     |
| Running Reverse KL  | 3.13     |
| Running Update Time | 1296     |
----------------------------------
--2024-08-12 19:14:17.541086 UTC--
| Itration            | 1297     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.62e+03 |
| Reward Loss         | 5.39e+04 |
| Running Env Steps   | 6485000  |
| Running Forward KL  | 5.89     |
| Running Reverse KL  | 4.3      |
| Running Update Time | 1297     |
----------------------------------
--2024-08-12 19:15:52.885037 UTC---
| Itration            | 1298      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 6490000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 111       |
| Running Update Time | 1298      |
-----------------------------------
--2024-08-12 19:17:16.821747 UTC--
| Itration            | 1299     |
| Real Det Return     | 5.81e+03 |
| Real Sto Return     | 3.24e+03 |
| Reward Loss         | 5.67e+05 |
| Running Env Steps   | 6495000  |
| Running Forward KL  | 7.35     |
| Running Reverse KL  | 196      |
| Running Update Time | 1299     |
----------------------------------
--2024-08-12 19:18:50.980257 UTC---
| Itration            | 1300      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -2.94e+05 |
| Running Env Steps   | 6500000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 5.05      |
| Running Update Time | 1300      |
-----------------------------------
--2024-08-12 19:20:19.605391 UTC---
| Itration            | 1301      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -4.87e+05 |
| Running Env Steps   | 6505000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 20.4      |
| Running Update Time | 1301      |
-----------------------------------
--2024-08-12 19:21:49.554622 UTC---
| Itration            | 1302      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 6510000   |
| Running Forward KL  | 7.36      |
| Running Reverse KL  | 117       |
| Running Update Time | 1302      |
-----------------------------------
--2024-08-12 19:23:25.231300 UTC---
| Itration            | 1303      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -2.06e+05 |
| Running Env Steps   | 6515000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 3.81      |
| Running Update Time | 1303      |
-----------------------------------
--2024-08-12 19:24:56.674059 UTC---
| Itration            | 1304      |
| Real Det Return     | 5.74e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 6520000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 66.5      |
| Running Update Time | 1304      |
-----------------------------------
--2024-08-12 19:26:29.242517 UTC--
| Itration            | 1305     |
| Real Det Return     | 5.74e+03 |
| Real Sto Return     | 4.96e+03 |
| Reward Loss         | 1.45e+05 |
| Running Env Steps   | 6525000  |
| Running Forward KL  | 6.21     |
| Running Reverse KL  | 33.7     |
| Running Update Time | 1305     |
----------------------------------
--2024-08-12 19:28:04.717910 UTC---
| Itration            | 1306      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.31e+05 |
| Running Env Steps   | 6530000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 3.74      |
| Running Update Time | 1306      |
-----------------------------------
--2024-08-12 19:29:36.363876 UTC---
| Itration            | 1307      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.38e+04 |
| Running Env Steps   | 6535000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 4.12      |
| Running Update Time | 1307      |
-----------------------------------
--2024-08-12 19:31:11.376911 UTC---
| Itration            | 1308      |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.47e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 6540000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 111       |
| Running Update Time | 1308      |
-----------------------------------
--2024-08-12 19:32:47.495222 UTC---
| Itration            | 1309      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -6.21e+05 |
| Running Env Steps   | 6545000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 35.5      |
| Running Update Time | 1309      |
-----------------------------------
--2024-08-12 19:34:21.659022 UTC---
| Itration            | 1310      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.61e+03  |
| Reward Loss         | -7.63e+04 |
| Running Env Steps   | 6550000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 1310      |
-----------------------------------
--2024-08-12 19:35:55.619405 UTC---
| Itration            | 1311      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -7.79e+05 |
| Running Env Steps   | 6555000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 103       |
| Running Update Time | 1311      |
-----------------------------------
--2024-08-12 19:37:27.726359 UTC---
| Itration            | 1312      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -1.16e+05 |
| Running Env Steps   | 6560000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1312      |
-----------------------------------
--2024-08-12 19:39:02.617505 UTC---
| Itration            | 1313      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -5.94e+04 |
| Running Env Steps   | 6565000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 1313      |
-----------------------------------
--2024-08-12 19:40:36.971586 UTC--
| Itration            | 1314     |
| Real Det Return     | 5.85e+03 |
| Real Sto Return     | 4.92e+03 |
| Reward Loss         | 3.59e+05 |
| Running Env Steps   | 6570000  |
| Running Forward KL  | 5.74     |
| Running Reverse KL  | 3.9      |
| Running Update Time | 1314     |
----------------------------------
--2024-08-12 19:42:11.098135 UTC---
| Itration            | 1315      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -5.12e+05 |
| Running Env Steps   | 6575000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 3.86      |
| Running Update Time | 1315      |
-----------------------------------
--2024-08-12 19:43:47.418039 UTC---
| Itration            | 1316      |
| Real Det Return     | 5.79e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -5.05e+05 |
| Running Env Steps   | 6580000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 1316      |
-----------------------------------
--2024-08-12 19:45:23.473912 UTC---
| Itration            | 1317      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -5.03e+03 |
| Running Env Steps   | 6585000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 4.12      |
| Running Update Time | 1317      |
-----------------------------------
--2024-08-12 19:46:54.959747 UTC--
| Itration            | 1318     |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 4.47e+03 |
| Reward Loss         | 1.39e+05 |
| Running Env Steps   | 6590000  |
| Running Forward KL  | 6.18     |
| Running Reverse KL  | 88       |
| Running Update Time | 1318     |
----------------------------------
--2024-08-12 19:48:30.056006 UTC---
| Itration            | 1319      |
| Real Det Return     | 5.73e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.24e+05 |
| Running Env Steps   | 6595000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 33.8      |
| Running Update Time | 1319      |
-----------------------------------
--2024-08-12 19:50:02.443585 UTC--
| Itration            | 1320     |
| Real Det Return     | 5.88e+03 |
| Real Sto Return     | 4.89e+03 |
| Reward Loss         | 2.81e+05 |
| Running Env Steps   | 6600000  |
| Running Forward KL  | 6.47     |
| Running Reverse KL  | 80.6     |
| Running Update Time | 1320     |
----------------------------------
--2024-08-12 19:51:36.179905 UTC---
| Itration            | 1321      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -3.29e+05 |
| Running Env Steps   | 6605000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 23.5      |
| Running Update Time | 1321      |
-----------------------------------
--2024-08-12 19:53:10.741354 UTC--
| Itration            | 1322     |
| Real Det Return     | 5.78e+03 |
| Real Sto Return     | 5.18e+03 |
| Reward Loss         | 2.07e+05 |
| Running Env Steps   | 6610000  |
| Running Forward KL  | 5.7      |
| Running Reverse KL  | 3.78     |
| Running Update Time | 1322     |
----------------------------------
--2024-08-12 19:54:44.770555 UTC--
| Itration            | 1323     |
| Real Det Return     | 5.86e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 4.37e+05 |
| Running Env Steps   | 6615000  |
| Running Forward KL  | 5.98     |
| Running Reverse KL  | 4.32     |
| Running Update Time | 1323     |
----------------------------------
--2024-08-12 19:56:17.303189 UTC---
| Itration            | 1324      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -1.42e+05 |
| Running Env Steps   | 6620000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 31.9      |
| Running Update Time | 1324      |
-----------------------------------
--2024-08-12 19:57:54.573973 UTC--
| Itration            | 1325     |
| Real Det Return     | 5.81e+03 |
| Real Sto Return     | 5.69e+03 |
| Reward Loss         | 1.55e+04 |
| Running Env Steps   | 6625000  |
| Running Forward KL  | 5.73     |
| Running Reverse KL  | 3.72     |
| Running Update Time | 1325     |
----------------------------------
--2024-08-12 19:59:28.686911 UTC--
| Itration            | 1326     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.6e+03  |
| Reward Loss         | 3.53e+05 |
| Running Env Steps   | 6630000  |
| Running Forward KL  | 5.47     |
| Running Reverse KL  | 3.83     |
| Running Update Time | 1326     |
----------------------------------
--2024-08-12 20:01:04.930286 UTC--
| Itration            | 1327     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.41e+03 |
| Reward Loss         | 1.82e+05 |
| Running Env Steps   | 6635000  |
| Running Forward KL  | 5.6      |
| Running Reverse KL  | 3.42     |
| Running Update Time | 1327     |
----------------------------------
--2024-08-12 20:02:40.187309 UTC---
| Itration            | 1328      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -9.27e+05 |
| Running Env Steps   | 6640000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 53.7      |
| Running Update Time | 1328      |
-----------------------------------
--2024-08-12 20:04:13.112719 UTC---
| Itration            | 1329      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -8.77e+04 |
| Running Env Steps   | 6645000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 3.58      |
| Running Update Time | 1329      |
-----------------------------------
--2024-08-12 20:05:51.021260 UTC--
| Itration            | 1330     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.6e+03  |
| Reward Loss         | 1.74e+04 |
| Running Env Steps   | 6650000  |
| Running Forward KL  | 5.81     |
| Running Reverse KL  | 32.2     |
| Running Update Time | 1330     |
----------------------------------
--2024-08-12 20:07:21.247764 UTC---
| Itration            | 1331      |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.29e+05 |
| Running Env Steps   | 6655000   |
| Running Forward KL  | 5.59      |
| Running Reverse KL  | 20.9      |
| Running Update Time | 1331      |
-----------------------------------
--2024-08-12 20:08:53.344951 UTC--
| Itration            | 1332     |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 1.13e+04 |
| Running Env Steps   | 6660000  |
| Running Forward KL  | 5.65     |
| Running Reverse KL  | 3.77     |
| Running Update Time | 1332     |
----------------------------------
--2024-08-12 20:10:31.094385 UTC---
| Itration            | 1333      |
| Real Det Return     | 5.75e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -5.54e+04 |
| Running Env Steps   | 6665000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 35.4      |
| Running Update Time | 1333      |
-----------------------------------
--2024-08-12 20:12:04.390245 UTC--
| Itration            | 1334     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 5.34e+03 |
| Reward Loss         | 1.25e+05 |
| Running Env Steps   | 6670000  |
| Running Forward KL  | 5.52     |
| Running Reverse KL  | 3.69     |
| Running Update Time | 1334     |
----------------------------------
--2024-08-12 20:13:38.098819 UTC--
| Itration            | 1335     |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.13e+03 |
| Reward Loss         | -1.5e+05 |
| Running Env Steps   | 6675000  |
| Running Forward KL  | 5.75     |
| Running Reverse KL  | 4.23     |
| Running Update Time | 1335     |
----------------------------------
--2024-08-12 20:15:15.086341 UTC--
| Itration            | 1336     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.62e+03 |
| Reward Loss         | 2.76e+05 |
| Running Env Steps   | 6680000  |
| Running Forward KL  | 5.77     |
| Running Reverse KL  | 3.95     |
| Running Update Time | 1336     |
----------------------------------
--2024-08-12 20:16:49.544474 UTC---
| Itration            | 1337      |
| Real Det Return     | 5.81e+03  |
| Real Sto Return     | 5.64e+03  |
| Reward Loss         | -1.53e+05 |
| Running Env Steps   | 6685000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 30.2      |
| Running Update Time | 1337      |
-----------------------------------
--2024-08-12 20:18:25.078935 UTC--
| Itration            | 1338     |
| Real Det Return     | 5.76e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | 2.49e+05 |
| Running Env Steps   | 6690000  |
| Running Forward KL  | 5.31     |
| Running Reverse KL  | 3.2      |
| Running Update Time | 1338     |
----------------------------------
--2024-08-12 20:19:59.570483 UTC--
| Itration            | 1339     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.05e+03 |
| Reward Loss         | 1.74e+05 |
| Running Env Steps   | 6695000  |
| Running Forward KL  | 5.78     |
| Running Reverse KL  | 16.9     |
| Running Update Time | 1339     |
----------------------------------
--2024-08-12 20:21:33.019318 UTC---
| Itration            | 1340      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -2.45e+05 |
| Running Env Steps   | 6700000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 3.59      |
| Running Update Time | 1340      |
-----------------------------------
--2024-08-12 20:23:07.306649 UTC--
| Itration            | 1341     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | 7.98e+03 |
| Running Env Steps   | 6705000  |
| Running Forward KL  | 5.44     |
| Running Reverse KL  | 21.1     |
| Running Update Time | 1341     |
----------------------------------
--2024-08-12 20:24:38.460965 UTC---
| Itration            | 1342      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -4.28e+04 |
| Running Env Steps   | 6710000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 3.07      |
| Running Update Time | 1342      |
-----------------------------------
--2024-08-12 20:26:08.121166 UTC---
| Itration            | 1343      |
| Real Det Return     | 5.77e+03  |
| Real Sto Return     | 5.67e+03  |
| Reward Loss         | -2.29e+05 |
| Running Env Steps   | 6715000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 29.2      |
| Running Update Time | 1343      |
-----------------------------------
--2024-08-12 20:27:44.791729 UTC---
| Itration            | 1344      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -6.05e+04 |
| Running Env Steps   | 6720000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 3.35      |
| Running Update Time | 1344      |
-----------------------------------
--2024-08-12 20:29:17.885130 UTC--
| Itration            | 1345     |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.4e+03  |
| Reward Loss         | 2.99e+05 |
| Running Env Steps   | 6725000  |
| Running Forward KL  | 5.75     |
| Running Reverse KL  | 23.2     |
| Running Update Time | 1345     |
----------------------------------
--2024-08-12 20:30:52.893352 UTC--
| Itration            | 1346     |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.45e+03 |
| Reward Loss         | 2.1e+05  |
| Running Env Steps   | 6730000  |
| Running Forward KL  | 5.22     |
| Running Reverse KL  | 3.09     |
| Running Update Time | 1346     |
----------------------------------
--2024-08-12 20:32:28.785477 UTC---
| Itration            | 1347      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -9.35e+04 |
| Running Env Steps   | 6735000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 3.45      |
| Running Update Time | 1347      |
-----------------------------------
--2024-08-12 20:33:58.965756 UTC---
| Itration            | 1348      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -7.98e+05 |
| Running Env Steps   | 6740000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 38.7      |
| Running Update Time | 1348      |
-----------------------------------
--2024-08-12 20:35:34.897425 UTC--
| Itration            | 1349     |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.29e+03 |
| Reward Loss         | -1e+05   |
| Running Env Steps   | 6745000  |
| Running Forward KL  | 5.38     |
| Running Reverse KL  | 3.69     |
| Running Update Time | 1349     |
----------------------------------
--2024-08-12 20:37:08.660432 UTC---
| Itration            | 1350      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -7.82e+04 |
| Running Env Steps   | 6750000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 3.09      |
| Running Update Time | 1350      |
-----------------------------------
--2024-08-12 20:38:39.179669 UTC--
| Itration            | 1351     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 4.75e+03 |
| Reward Loss         | 3.16e+05 |
| Running Env Steps   | 6755000  |
| Running Forward KL  | 5.64     |
| Running Reverse KL  | 8.03     |
| Running Update Time | 1351     |
----------------------------------
--2024-08-12 20:40:16.965721 UTC---
| Itration            | 1352      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.42e+05 |
| Running Env Steps   | 6760000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 1352      |
-----------------------------------
--2024-08-12 20:41:52.257918 UTC---
| Itration            | 1353      |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -3.03e+05 |
| Running Env Steps   | 6765000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 21.9      |
| Running Update Time | 1353      |
-----------------------------------
--2024-08-12 20:43:25.614630 UTC--
| Itration            | 1354     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.38e+03 |
| Reward Loss         | 1.92e+05 |
| Running Env Steps   | 6770000  |
| Running Forward KL  | 5.73     |
| Running Reverse KL  | 3.85     |
| Running Update Time | 1354     |
----------------------------------
--2024-08-12 20:45:03.929658 UTC--
| Itration            | 1355     |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 5.54e+03 |
| Reward Loss         | 3.45e+05 |
| Running Env Steps   | 6775000  |
| Running Forward KL  | 5.08     |
| Running Reverse KL  | 3.47     |
| Running Update Time | 1355     |
----------------------------------
--2024-08-12 20:46:35.776677 UTC---
| Itration            | 1356      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -8.45e+04 |
| Running Env Steps   | 6780000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 31.3      |
| Running Update Time | 1356      |
-----------------------------------
--2024-08-12 20:48:11.795770 UTC--
| Itration            | 1357     |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.41e+03 |
| Reward Loss         | -1.3e+05 |
| Running Env Steps   | 6785000  |
| Running Forward KL  | 5.31     |
| Running Reverse KL  | 3.29     |
| Running Update Time | 1357     |
----------------------------------
--2024-08-12 20:49:47.319088 UTC---
| Itration            | 1358      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.34e+05 |
| Running Env Steps   | 6790000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 23.1      |
| Running Update Time | 1358      |
-----------------------------------
--2024-08-12 20:51:19.034431 UTC---
| Itration            | 1359      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.52e+03  |
| Reward Loss         | -3.21e+04 |
| Running Env Steps   | 6795000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 9.49      |
| Running Update Time | 1359      |
-----------------------------------
--2024-08-12 20:52:55.207892 UTC--
| Itration            | 1360     |
| Real Det Return     | 5.75e+03 |
| Real Sto Return     | 5.42e+03 |
| Reward Loss         | 4.17e+05 |
| Running Env Steps   | 6800000  |
| Running Forward KL  | 5.44     |
| Running Reverse KL  | 23.6     |
| Running Update Time | 1360     |
----------------------------------
--2024-08-12 20:54:25.975242 UTC---
| Itration            | 1361      |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -1.92e+06 |
| Running Env Steps   | 6805000   |
| Running Forward KL  | 7.23      |
| Running Reverse KL  | 141       |
| Running Update Time | 1361      |
-----------------------------------
--2024-08-12 20:55:59.966746 UTC--
| Itration            | 1362     |
| Real Det Return     | 5.72e+03 |
| Real Sto Return     | 5.45e+03 |
| Reward Loss         | 5.24e+05 |
| Running Env Steps   | 6810000  |
| Running Forward KL  | 5.91     |
| Running Reverse KL  | 3.74     |
| Running Update Time | 1362     |
----------------------------------
--2024-08-12 20:57:37.139854 UTC--
| Itration            | 1363     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | 1.12e+05 |
| Running Env Steps   | 6815000  |
| Running Forward KL  | 5.62     |
| Running Reverse KL  | 3.69     |
| Running Update Time | 1363     |
----------------------------------
--2024-08-12 20:59:12.216401 UTC--
| Itration            | 1364     |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.59e+03 |
| Reward Loss         | 8.91e+04 |
| Running Env Steps   | 6820000  |
| Running Forward KL  | 5.88     |
| Running Reverse KL  | 4.11     |
| Running Update Time | 1364     |
----------------------------------
--2024-08-12 21:00:46.339658 UTC---
| Itration            | 1365      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.47e+05 |
| Running Env Steps   | 6825000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 16.6      |
| Running Update Time | 1365      |
-----------------------------------
--2024-08-12 21:02:22.167217 UTC---
| Itration            | 1366      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -9.42e+05 |
| Running Env Steps   | 6830000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 60.3      |
| Running Update Time | 1366      |
-----------------------------------
--2024-08-12 21:03:54.751180 UTC---
| Itration            | 1367      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.34e+05 |
| Running Env Steps   | 6835000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 27.2      |
| Running Update Time | 1367      |
-----------------------------------
--2024-08-12 21:05:27.195919 UTC---
| Itration            | 1368      |
| Real Det Return     | 5.78e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -5.34e+05 |
| Running Env Steps   | 6840000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 53.4      |
| Running Update Time | 1368      |
-----------------------------------
--2024-08-12 21:07:03.087689 UTC---
| Itration            | 1369      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 6845000   |
| Running Forward KL  | 5.86      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 1369      |
-----------------------------------
--2024-08-12 21:08:36.714189 UTC--
| Itration            | 1370     |
| Real Det Return     | 5.82e+03 |
| Real Sto Return     | 5.5e+03  |
| Reward Loss         | 9.18e+04 |
| Running Env Steps   | 6850000  |
| Running Forward KL  | 5.66     |
| Running Reverse KL  | 5.77     |
| Running Update Time | 1370     |
----------------------------------
--2024-08-12 21:10:13.218371 UTC---
| Itration            | 1371      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.93e+05 |
| Running Env Steps   | 6855000   |
| Running Forward KL  | 4.95      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 1371      |
-----------------------------------
--2024-08-12 21:11:43.534356 UTC--
| Itration            | 1372     |
| Real Det Return     | 5.85e+03 |
| Real Sto Return     | 3.99e+03 |
| Reward Loss         | 7.35e+04 |
| Running Env Steps   | 6860000  |
| Running Forward KL  | 5.69     |
| Running Reverse KL  | 23.6     |
| Running Update Time | 1372     |
----------------------------------
--2024-08-12 21:13:16.293602 UTC---
| Itration            | 1373      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -4.93e+05 |
| Running Env Steps   | 6865000   |
| Running Forward KL  | 5.53      |
| Running Reverse KL  | 22.2      |
| Running Update Time | 1373      |
-----------------------------------
--2024-08-12 21:14:52.924711 UTC---
| Itration            | 1374      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -2.58e+05 |
| Running Env Steps   | 6870000   |
| Running Forward KL  | 5.13      |
| Running Reverse KL  | 3.24      |
| Running Update Time | 1374      |
-----------------------------------
--2024-08-12 21:16:25.665028 UTC--
| Itration            | 1375     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 4.96e+03 |
| Reward Loss         | 2.94e+05 |
| Running Env Steps   | 6875000  |
| Running Forward KL  | 5.63     |
| Running Reverse KL  | 3.99     |
| Running Update Time | 1375     |
----------------------------------
--2024-08-12 21:17:57.117163 UTC--
| Itration            | 1376     |
| Real Det Return     | 5.73e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | 4.73e+04 |
| Running Env Steps   | 6880000  |
| Running Forward KL  | 5.27     |
| Running Reverse KL  | 3.33     |
| Running Update Time | 1376     |
----------------------------------
--2024-08-12 21:19:35.741487 UTC---
| Itration            | 1377      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -1.35e+05 |
| Running Env Steps   | 6885000   |
| Running Forward KL  | 4.72      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 1377      |
-----------------------------------
--2024-08-12 21:21:09.834852 UTC---
| Itration            | 1378      |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -4.08e+05 |
| Running Env Steps   | 6890000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 49.8      |
| Running Update Time | 1378      |
-----------------------------------
--2024-08-12 21:22:45.718067 UTC---
| Itration            | 1379      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -2.99e+05 |
| Running Env Steps   | 6895000   |
| Running Forward KL  | 5.12      |
| Running Reverse KL  | 2.94      |
| Running Update Time | 1379      |
-----------------------------------
--2024-08-12 21:24:20.930554 UTC---
| Itration            | 1380      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -3.96e+05 |
| Running Env Steps   | 6900000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 3.58      |
| Running Update Time | 1380      |
-----------------------------------
--2024-08-12 21:25:54.154432 UTC--
| Itration            | 1381     |
| Real Det Return     | 5.77e+03 |
| Real Sto Return     | 5.34e+03 |
| Reward Loss         | 9.63e+04 |
| Running Env Steps   | 6905000  |
| Running Forward KL  | 5.57     |
| Running Reverse KL  | 4.05     |
| Running Update Time | 1381     |
----------------------------------
--2024-08-12 21:27:30.701034 UTC---
| Itration            | 1382      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -2.49e+05 |
| Running Env Steps   | 6910000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 3.37      |
| Running Update Time | 1382      |
-----------------------------------
--2024-08-12 21:29:05.644202 UTC--
| Itration            | 1383     |
| Real Det Return     | 5.7e+03  |
| Real Sto Return     | 5.58e+03 |
| Reward Loss         | 1.85e+05 |
| Running Env Steps   | 6915000  |
| Running Forward KL  | 5.53     |
| Running Reverse KL  | 3.93     |
| Running Update Time | 1383     |
----------------------------------
--2024-08-12 21:30:36.685505 UTC---
| Itration            | 1384      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -8.17e+04 |
| Running Env Steps   | 6920000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 5.32      |
| Running Update Time | 1384      |
-----------------------------------
--2024-08-12 21:32:15.145963 UTC---
| Itration            | 1385      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.03e+05 |
| Running Env Steps   | 6925000   |
| Running Forward KL  | 5.58      |
| Running Reverse KL  | 26.1      |
| Running Update Time | 1385      |
-----------------------------------
--2024-08-12 21:33:48.530549 UTC---
| Itration            | 1386      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.03e+05 |
| Running Env Steps   | 6930000   |
| Running Forward KL  | 5.38      |
| Running Reverse KL  | 3.4       |
| Running Update Time | 1386      |
-----------------------------------
--2024-08-12 21:35:24.918294 UTC---
| Itration            | 1387      |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -8.45e+05 |
| Running Env Steps   | 6935000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 44.4      |
| Running Update Time | 1387      |
-----------------------------------
--2024-08-12 21:37:01.967753 UTC---
| Itration            | 1388      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 5.6e+03   |
| Reward Loss         | -2.58e+05 |
| Running Env Steps   | 6940000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 22.1      |
| Running Update Time | 1388      |
-----------------------------------
--2024-08-12 21:38:34.299361 UTC---
| Itration            | 1389      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.68e+05 |
| Running Env Steps   | 6945000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 8.25      |
| Running Update Time | 1389      |
-----------------------------------
--2024-08-12 21:40:09.852825 UTC---
| Itration            | 1390      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -5.38e+05 |
| Running Env Steps   | 6950000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 2.92      |
| Running Update Time | 1390      |
-----------------------------------
--2024-08-12 21:41:44.969408 UTC---
| Itration            | 1391      |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -2.72e+05 |
| Running Env Steps   | 6955000   |
| Running Forward KL  | 5.42      |
| Running Reverse KL  | 3         |
| Running Update Time | 1391      |
-----------------------------------
--2024-08-12 21:43:18.281661 UTC---
| Itration            | 1392      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -2.58e+05 |
| Running Env Steps   | 6960000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 1392      |
-----------------------------------
--2024-08-12 21:44:53.083171 UTC---
| Itration            | 1393      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -2.18e+05 |
| Running Env Steps   | 6965000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 3.7       |
| Running Update Time | 1393      |
-----------------------------------
--2024-08-12 21:46:27.087534 UTC---
| Itration            | 1394      |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.94e+05 |
| Running Env Steps   | 6970000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 3.11      |
| Running Update Time | 1394      |
-----------------------------------
--2024-08-12 21:48:02.802512 UTC--
| Itration            | 1395     |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.49e+03 |
| Reward Loss         | 2.29e+04 |
| Running Env Steps   | 6975000  |
| Running Forward KL  | 5.33     |
| Running Reverse KL  | 3.55     |
| Running Update Time | 1395     |
----------------------------------
--2024-08-12 21:49:39.516963 UTC---
| Itration            | 1396      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.52e+05 |
| Running Env Steps   | 6980000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 3.7       |
| Running Update Time | 1396      |
-----------------------------------
--2024-08-12 21:51:10.569303 UTC---
| Itration            | 1397      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 4.28e+03  |
| Reward Loss         | -5.74e+05 |
| Running Env Steps   | 6985000   |
| Running Forward KL  | 5.38      |
| Running Reverse KL  | 26.8      |
| Running Update Time | 1397      |
-----------------------------------
--2024-08-12 21:52:46.032249 UTC---
| Itration            | 1398      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -2.27e+04 |
| Running Env Steps   | 6990000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 1398      |
-----------------------------------
--2024-08-12 21:54:20.767890 UTC---
| Itration            | 1399      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -3.92e+05 |
| Running Env Steps   | 6995000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 23.6      |
| Running Update Time | 1399      |
-----------------------------------
--2024-08-12 21:55:55.089949 UTC---
| Itration            | 1400      |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 7000000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 16.3      |
| Running Update Time | 1400      |
-----------------------------------
--2024-08-12 21:57:26.849944 UTC--
| Itration            | 1401     |
| Real Det Return     | 5.68e+03 |
| Real Sto Return     | 5.33e+03 |
| Reward Loss         | 1.13e+05 |
| Running Env Steps   | 7005000  |
| Running Forward KL  | 5.42     |
| Running Reverse KL  | 3.03     |
| Running Update Time | 1401     |
----------------------------------
--2024-08-12 21:59:01.157875 UTC---
| Itration            | 1402      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.35e+05 |
| Running Env Steps   | 7010000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 4.53      |
| Running Update Time | 1402      |
-----------------------------------
--2024-08-12 22:00:34.526257 UTC--
| Itration            | 1403     |
| Real Det Return     | 5.59e+03 |
| Real Sto Return     | 5.58e+03 |
| Reward Loss         | 3.22e+05 |
| Running Env Steps   | 7015000  |
| Running Forward KL  | 5.36     |
| Running Reverse KL  | 20       |
| Running Update Time | 1403     |
----------------------------------
--2024-08-12 22:02:10.760094 UTC---
| Itration            | 1404      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -5.05e+05 |
| Running Env Steps   | 7020000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 28.9      |
| Running Update Time | 1404      |
-----------------------------------
--2024-08-12 22:03:45.006713 UTC---
| Itration            | 1405      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -4.18e+05 |
| Running Env Steps   | 7025000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 22.5      |
| Running Update Time | 1405      |
-----------------------------------
--2024-08-12 22:05:19.980925 UTC--
| Itration            | 1406     |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 5.46e+03 |
| Reward Loss         | -5.5e+05 |
| Running Env Steps   | 7030000  |
| Running Forward KL  | 5.52     |
| Running Reverse KL  | 8.17     |
| Running Update Time | 1406     |
----------------------------------
--2024-08-12 22:06:56.394715 UTC---
| Itration            | 1407      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -1.19e+05 |
| Running Env Steps   | 7035000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 34.4      |
| Running Update Time | 1407      |
-----------------------------------
--2024-08-12 22:08:31.340040 UTC---
| Itration            | 1408      |
| Real Det Return     | 5.73e+03  |
| Real Sto Return     | 5.58e+03  |
| Reward Loss         | -3.43e+05 |
| Running Env Steps   | 7040000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 26.8      |
| Running Update Time | 1408      |
-----------------------------------
--2024-08-12 22:10:03.543504 UTC---
| Itration            | 1409      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -2.67e+04 |
| Running Env Steps   | 7045000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 4.34      |
| Running Update Time | 1409      |
-----------------------------------
--2024-08-12 22:11:38.443766 UTC---
| Itration            | 1410      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -2.52e+05 |
| Running Env Steps   | 7050000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 1410      |
-----------------------------------
--2024-08-12 22:13:10.892940 UTC---
| Itration            | 1411      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -6.46e+05 |
| Running Env Steps   | 7055000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1411      |
-----------------------------------
--2024-08-12 22:14:42.380559 UTC---
| Itration            | 1412      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -2.47e+05 |
| Running Env Steps   | 7060000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 1412      |
-----------------------------------
--2024-08-12 22:16:19.360951 UTC---
| Itration            | 1413      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.33e+05 |
| Running Env Steps   | 7065000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 4.08      |
| Running Update Time | 1413      |
-----------------------------------
--2024-08-12 22:17:51.826976 UTC---
| Itration            | 1414      |
| Real Det Return     | 5.68e+03  |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -2.15e+05 |
| Running Env Steps   | 7070000   |
| Running Forward KL  | 5.75      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 1414      |
-----------------------------------
--2024-08-12 22:19:21.887775 UTC---
| Itration            | 1415      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.01e+05 |
| Running Env Steps   | 7075000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 28.9      |
| Running Update Time | 1415      |
-----------------------------------
--2024-08-12 22:20:55.856901 UTC---
| Itration            | 1416      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -2.22e+05 |
| Running Env Steps   | 7080000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 33.2      |
| Running Update Time | 1416      |
-----------------------------------
--2024-08-12 22:22:29.678391 UTC--
| Itration            | 1417     |
| Real Det Return     | 5.61e+03 |
| Real Sto Return     | 5.5e+03  |
| Reward Loss         | 1.17e+05 |
| Running Env Steps   | 7085000  |
| Running Forward KL  | 5.29     |
| Running Reverse KL  | 3.45     |
| Running Update Time | 1417     |
----------------------------------
--2024-08-12 22:24:05.375073 UTC---
| Itration            | 1418      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -7.95e+05 |
| Running Env Steps   | 7090000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 29.4      |
| Running Update Time | 1418      |
-----------------------------------
--2024-08-12 22:25:40.944009 UTC--
| Itration            | 1419     |
| Real Det Return     | 5.48e+03 |
| Real Sto Return     | 5.24e+03 |
| Reward Loss         | -4.7e+05 |
| Running Env Steps   | 7095000  |
| Running Forward KL  | 5.32     |
| Running Reverse KL  | 3.04     |
| Running Update Time | 1419     |
----------------------------------
--2024-08-12 22:27:14.334057 UTC---
| Itration            | 1420      |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.55e+03  |
| Reward Loss         | -1.06e+05 |
| Running Env Steps   | 7100000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 3.83      |
| Running Update Time | 1420      |
-----------------------------------
--2024-08-12 22:28:51.505248 UTC---
| Itration            | 1421      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -2.24e+05 |
| Running Env Steps   | 7105000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 1421      |
-----------------------------------
--2024-08-12 22:30:25.450405 UTC---
| Itration            | 1422      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -2.39e+05 |
| Running Env Steps   | 7110000   |
| Running Forward KL  | 5.08      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1422      |
-----------------------------------
--2024-08-12 22:31:58.366376 UTC---
| Itration            | 1423      |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 7115000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 22        |
| Running Update Time | 1423      |
-----------------------------------
--2024-08-12 22:33:36.040006 UTC---
| Itration            | 1424      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -1.53e+06 |
| Running Env Steps   | 7120000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 53.7      |
| Running Update Time | 1424      |
-----------------------------------
--2024-08-12 22:35:08.200731 UTC---
| Itration            | 1425      |
| Real Det Return     | 5.72e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -5.11e+05 |
| Running Env Steps   | 7125000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 27.6      |
| Running Update Time | 1425      |
-----------------------------------
--2024-08-12 22:36:43.736884 UTC---
| Itration            | 1426      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -5.04e+05 |
| Running Env Steps   | 7130000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 14.1      |
| Running Update Time | 1426      |
-----------------------------------
--2024-08-12 22:38:19.759793 UTC---
| Itration            | 1427      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -7.53e+05 |
| Running Env Steps   | 7135000   |
| Running Forward KL  | 5.32      |
| Running Reverse KL  | 30.7      |
| Running Update Time | 1427      |
-----------------------------------
--2024-08-12 22:39:53.731218 UTC---
| Itration            | 1428      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -2.25e+05 |
| Running Env Steps   | 7140000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 3.49      |
| Running Update Time | 1428      |
-----------------------------------
--2024-08-12 22:41:29.460849 UTC---
| Itration            | 1429      |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -3.87e+05 |
| Running Env Steps   | 7145000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 3.4       |
| Running Update Time | 1429      |
-----------------------------------
--2024-08-12 22:43:04.513223 UTC---
| Itration            | 1430      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -9.57e+05 |
| Running Env Steps   | 7150000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 33.4      |
| Running Update Time | 1430      |
-----------------------------------
--2024-08-12 22:44:37.772503 UTC---
| Itration            | 1431      |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -9.83e+05 |
| Running Env Steps   | 7155000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 21.9      |
| Running Update Time | 1431      |
-----------------------------------
--2024-08-12 22:46:15.977894 UTC---
| Itration            | 1432      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -2.14e+05 |
| Running Env Steps   | 7160000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 3.43      |
| Running Update Time | 1432      |
-----------------------------------
--2024-08-12 22:47:51.127880 UTC---
| Itration            | 1433      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -2.05e+05 |
| Running Env Steps   | 7165000   |
| Running Forward KL  | 5.02      |
| Running Reverse KL  | 3.5       |
| Running Update Time | 1433      |
-----------------------------------
--2024-08-12 22:49:24.473782 UTC---
| Itration            | 1434      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -3.96e+05 |
| Running Env Steps   | 7170000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 3.95      |
| Running Update Time | 1434      |
-----------------------------------
--2024-08-12 22:50:59.432676 UTC--
| Itration            | 1435     |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 4.98e+03 |
| Reward Loss         | -4.2e+05 |
| Running Env Steps   | 7175000  |
| Running Forward KL  | 5.54     |
| Running Reverse KL  | 3.45     |
| Running Update Time | 1435     |
----------------------------------
--2024-08-12 22:52:34.579000 UTC---
| Itration            | 1436      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -6.84e+05 |
| Running Env Steps   | 7180000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 1436      |
-----------------------------------
--2024-08-12 22:54:06.105869 UTC---
| Itration            | 1437      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -6.01e+05 |
| Running Env Steps   | 7185000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 1437      |
-----------------------------------
--2024-08-12 22:55:42.729078 UTC---
| Itration            | 1438      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -4.54e+05 |
| Running Env Steps   | 7190000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 19.8      |
| Running Update Time | 1438      |
-----------------------------------
--2024-08-12 22:57:15.922993 UTC---
| Itration            | 1439      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -2.54e+05 |
| Running Env Steps   | 7195000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 3.63      |
| Running Update Time | 1439      |
-----------------------------------
--2024-08-12 22:58:52.875232 UTC---
| Itration            | 1440      |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.61e+03  |
| Reward Loss         | -2.71e+05 |
| Running Env Steps   | 7200000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 22.1      |
| Running Update Time | 1440      |
-----------------------------------
--2024-08-12 23:00:28.572633 UTC---
| Itration            | 1441      |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -6.82e+05 |
| Running Env Steps   | 7205000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 1441      |
-----------------------------------
--2024-08-12 23:02:02.717120 UTC---
| Itration            | 1442      |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.56e+03  |
| Reward Loss         | -1.77e+05 |
| Running Env Steps   | 7210000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 10        |
| Running Update Time | 1442      |
-----------------------------------
--2024-08-12 23:03:39.398377 UTC---
| Itration            | 1443      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.91e+05 |
| Running Env Steps   | 7215000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 5.46      |
| Running Update Time | 1443      |
-----------------------------------
--2024-08-12 23:05:14.310387 UTC---
| Itration            | 1444      |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -5.97e+05 |
| Running Env Steps   | 7220000   |
| Running Forward KL  | 5.27      |
| Running Reverse KL  | 3.65      |
| Running Update Time | 1444      |
-----------------------------------
--2024-08-12 23:06:47.909862 UTC--
| Itration            | 1445     |
| Real Det Return     | 5.5e+03  |
| Real Sto Return     | 5.27e+03 |
| Reward Loss         | -2.5e+05 |
| Running Env Steps   | 7225000  |
| Running Forward KL  | 5.33     |
| Running Reverse KL  | 3.46     |
| Running Update Time | 1445     |
----------------------------------
--2024-08-12 23:08:23.412181 UTC---
| Itration            | 1446      |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -7.99e+05 |
| Running Env Steps   | 7230000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 3.75      |
| Running Update Time | 1446      |
-----------------------------------
--2024-08-12 23:09:56.921993 UTC---
| Itration            | 1447      |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -4.25e+06 |
| Running Env Steps   | 7235000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 81.3      |
| Running Update Time | 1447      |
-----------------------------------
--2024-08-12 23:11:30.207908 UTC---
| Itration            | 1448      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -3.05e+05 |
| Running Env Steps   | 7240000   |
| Running Forward KL  | 5.25      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 1448      |
-----------------------------------
--2024-08-12 23:13:07.989775 UTC---
| Itration            | 1449      |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -4.91e+05 |
| Running Env Steps   | 7245000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 3.31      |
| Running Update Time | 1449      |
-----------------------------------
--2024-08-12 23:14:43.098299 UTC---
| Itration            | 1450      |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 7250000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 3.2       |
| Running Update Time | 1450      |
-----------------------------------
--2024-08-12 23:16:17.575504 UTC---
| Itration            | 1451      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -2.42e+05 |
| Running Env Steps   | 7255000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 7.24      |
| Running Update Time | 1451      |
-----------------------------------
--2024-08-12 23:17:54.754960 UTC---
| Itration            | 1452      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.19e+05 |
| Running Env Steps   | 7260000   |
| Running Forward KL  | 5.1       |
| Running Reverse KL  | 3.03      |
| Running Update Time | 1452      |
-----------------------------------
--2024-08-12 23:19:28.126311 UTC---
| Itration            | 1453      |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -6.28e+05 |
| Running Env Steps   | 7265000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 4.48      |
| Running Update Time | 1453      |
-----------------------------------
--2024-08-12 23:21:00.340173 UTC---
| Itration            | 1454      |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.68e+05 |
| Running Env Steps   | 7270000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 3.45      |
| Running Update Time | 1454      |
-----------------------------------
--2024-08-12 23:22:35.462474 UTC---
| Itration            | 1455      |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.72e+05 |
| Running Env Steps   | 7275000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 1455      |
-----------------------------------
--2024-08-12 23:24:08.084270 UTC---
| Itration            | 1456      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.92e+05 |
| Running Env Steps   | 7280000   |
| Running Forward KL  | 5.13      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 1456      |
-----------------------------------
--2024-08-12 23:25:43.410888 UTC---
| Itration            | 1457      |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -3.11e+06 |
| Running Env Steps   | 7285000   |
| Running Forward KL  | 5.59      |
| Running Reverse KL  | 35.6      |
| Running Update Time | 1457      |
-----------------------------------
--2024-08-12 23:27:17.881390 UTC--
| Itration            | 1458     |
| Real Det Return     | 5.15e+03 |
| Real Sto Return     | 4.89e+03 |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 7290000  |
| Running Forward KL  | 6.37     |
| Running Reverse KL  | 64.9     |
| Running Update Time | 1458     |
----------------------------------
--2024-08-12 23:28:52.112609 UTC---
| Itration            | 1459      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -5.02e+05 |
| Running Env Steps   | 7295000   |
| Running Forward KL  | 5.21      |
| Running Reverse KL  | 3.11      |
| Running Update Time | 1459      |
-----------------------------------
--2024-08-12 23:30:29.353733 UTC---
| Itration            | 1460      |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -2.25e+05 |
| Running Env Steps   | 7300000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 3.57      |
| Running Update Time | 1460      |
-----------------------------------
--2024-08-12 23:32:03.809526 UTC--
| Itration            | 1461     |
| Real Det Return     | 5.67e+03 |
| Real Sto Return     | 5.46e+03 |
| Reward Loss         | 1.05e+05 |
| Running Env Steps   | 7305000  |
| Running Forward KL  | 5.28     |
| Running Reverse KL  | 3.15     |
| Running Update Time | 1461     |
----------------------------------
--2024-08-12 23:33:36.363308 UTC---
| Itration            | 1462      |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -7.22e+05 |
| Running Env Steps   | 7310000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 3.76      |
| Running Update Time | 1462      |
-----------------------------------
--2024-08-12 23:35:12.342786 UTC---
| Itration            | 1463      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -8.34e+05 |
| Running Env Steps   | 7315000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 24        |
| Running Update Time | 1463      |
-----------------------------------
--2024-08-12 23:36:47.622502 UTC---
| Itration            | 1464      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -5.41e+05 |
| Running Env Steps   | 7320000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 28.4      |
| Running Update Time | 1464      |
-----------------------------------
--2024-08-12 23:38:15.795756 UTC---
| Itration            | 1465      |
| Real Det Return     | 4.59e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 7325000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 24.6      |
| Running Update Time | 1465      |
-----------------------------------
--2024-08-12 23:39:47.142143 UTC---
| Itration            | 1466      |
| Real Det Return     | 5.7e+03   |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -2.62e+06 |
| Running Env Steps   | 7330000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 84.5      |
| Running Update Time | 1466      |
-----------------------------------
--2024-08-12 23:41:17.664089 UTC---
| Itration            | 1467      |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.35e+03  |
| Reward Loss         | -8.87e+05 |
| Running Env Steps   | 7335000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 26.1      |
| Running Update Time | 1467      |
-----------------------------------
--2024-08-12 23:42:54.090222 UTC---
| Itration            | 1468      |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -4.03e+05 |
| Running Env Steps   | 7340000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 3.05      |
| Running Update Time | 1468      |
-----------------------------------
--2024-08-12 23:44:27.429368 UTC---
| Itration            | 1469      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -4.37e+05 |
| Running Env Steps   | 7345000   |
| Running Forward KL  | 5.1       |
| Running Reverse KL  | 3.18      |
| Running Update Time | 1469      |
-----------------------------------
--2024-08-12 23:45:59.830374 UTC---
| Itration            | 1470      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.33e+05 |
| Running Env Steps   | 7350000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 1470      |
-----------------------------------
--2024-08-12 23:47:35.998319 UTC---
| Itration            | 1471      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -5.91e+05 |
| Running Env Steps   | 7355000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 2.77      |
| Running Update Time | 1471      |
-----------------------------------
--2024-08-12 23:49:11.286405 UTC---
| Itration            | 1472      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -7.84e+05 |
| Running Env Steps   | 7360000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 1472      |
-----------------------------------
--2024-08-12 23:50:44.184393 UTC--
| Itration            | 1473     |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 5.28e+03 |
| Reward Loss         | -6.1e+05 |
| Running Env Steps   | 7365000  |
| Running Forward KL  | 4.85     |
| Running Reverse KL  | 2.4      |
| Running Update Time | 1473     |
----------------------------------
--2024-08-12 23:52:18.872190 UTC---
| Itration            | 1474      |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -8.97e+05 |
| Running Env Steps   | 7370000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 29        |
| Running Update Time | 1474      |
-----------------------------------
--2024-08-12 23:53:52.490149 UTC---
| Itration            | 1475      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.47e+05 |
| Running Env Steps   | 7375000   |
| Running Forward KL  | 5.01      |
| Running Reverse KL  | 3         |
| Running Update Time | 1475      |
-----------------------------------
--2024-08-12 23:55:26.227022 UTC---
| Itration            | 1476      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -3.54e+05 |
| Running Env Steps   | 7380000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 2.78      |
| Running Update Time | 1476      |
-----------------------------------
--2024-08-12 23:57:02.104957 UTC---
| Itration            | 1477      |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.5e+03   |
| Reward Loss         | -1.29e+05 |
| Running Env Steps   | 7385000   |
| Running Forward KL  | 4.95      |
| Running Reverse KL  | 2.82      |
| Running Update Time | 1477      |
-----------------------------------
--2024-08-12 23:58:35.520335 UTC---
| Itration            | 1478      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.54e+03  |
| Reward Loss         | -2.53e+05 |
| Running Env Steps   | 7390000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 11        |
| Running Update Time | 1478      |
-----------------------------------
--2024-08-13 00:00:09.450149 UTC---
| Itration            | 1479      |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -3.26e+05 |
| Running Env Steps   | 7395000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 3.46      |
| Running Update Time | 1479      |
-----------------------------------
--2024-08-13 00:01:43.676537 UTC---
| Itration            | 1480      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -6.83e+05 |
| Running Env Steps   | 7400000   |
| Running Forward KL  | 5.24      |
| Running Reverse KL  | 2.54      |
| Running Update Time | 1480      |
-----------------------------------
--2024-08-13 00:03:15.107912 UTC---
| Itration            | 1481      |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -4.17e+05 |
| Running Env Steps   | 7405000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 2.39      |
| Running Update Time | 1481      |
-----------------------------------
--2024-08-13 00:04:50.745636 UTC--
| Itration            | 1482     |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.48e+03 |
| Reward Loss         | -3.3e+05 |
| Running Env Steps   | 7410000  |
| Running Forward KL  | 5.28     |
| Running Reverse KL  | 3.18     |
| Running Update Time | 1482     |
----------------------------------
--2024-08-13 00:06:21.966084 UTC---
| Itration            | 1483      |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -8.94e+05 |
| Running Env Steps   | 7415000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 16.7      |
| Running Update Time | 1483      |
-----------------------------------
--2024-08-13 00:07:53.452709 UTC---
| Itration            | 1484      |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -5.26e+05 |
| Running Env Steps   | 7420000   |
| Running Forward KL  | 5.12      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 1484      |
-----------------------------------
--2024-08-13 00:09:30.385642 UTC---
| Itration            | 1485      |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 7425000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 37.4      |
| Running Update Time | 1485      |
-----------------------------------
--2024-08-13 00:11:05.033789 UTC---
| Itration            | 1486      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 7430000   |
| Running Forward KL  | 5.25      |
| Running Reverse KL  | 28.8      |
| Running Update Time | 1486      |
-----------------------------------
--2024-08-13 00:12:38.165438 UTC--
| Itration            | 1487     |
| Real Det Return     | 5.45e+03 |
| Real Sto Return     | 5.36e+03 |
| Reward Loss         | -9.3e+05 |
| Running Env Steps   | 7435000  |
| Running Forward KL  | 5.27     |
| Running Reverse KL  | 3.25     |
| Running Update Time | 1487     |
----------------------------------
--2024-08-13 00:14:13.332214 UTC---
| Itration            | 1488      |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -4.83e+05 |
| Running Env Steps   | 7440000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 3.34      |
| Running Update Time | 1488      |
-----------------------------------
--2024-08-13 00:15:45.454618 UTC---
| Itration            | 1489      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -4.11e+05 |
| Running Env Steps   | 7445000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 3.38      |
| Running Update Time | 1489      |
-----------------------------------
--2024-08-13 00:17:22.019983 UTC---
| Itration            | 1490      |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -6.42e+05 |
| Running Env Steps   | 7450000   |
| Running Forward KL  | 5.04      |
| Running Reverse KL  | 11        |
| Running Update Time | 1490      |
-----------------------------------
--2024-08-13 00:18:56.566956 UTC---
| Itration            | 1491      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.45e+06 |
| Running Env Steps   | 7455000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 32.9      |
| Running Update Time | 1491      |
-----------------------------------
--2024-08-13 00:20:30.079753 UTC---
| Itration            | 1492      |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -3.95e+05 |
| Running Env Steps   | 7460000   |
| Running Forward KL  | 5.24      |
| Running Reverse KL  | 3.33      |
| Running Update Time | 1492      |
-----------------------------------
--2024-08-13 00:22:06.744000 UTC---
| Itration            | 1493      |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -1.03e+05 |
| Running Env Steps   | 7465000   |
| Running Forward KL  | 5         |
| Running Reverse KL  | 2.75      |
| Running Update Time | 1493      |
-----------------------------------
--2024-08-13 00:23:40.390854 UTC---
| Itration            | 1494      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -3.38e+05 |
| Running Env Steps   | 7470000   |
| Running Forward KL  | 5         |
| Running Reverse KL  | 2.78      |
| Running Update Time | 1494      |
-----------------------------------
--2024-08-13 00:25:11.862773 UTC---
| Itration            | 1495      |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.59e+06 |
| Running Env Steps   | 7475000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 40        |
| Running Update Time | 1495      |
-----------------------------------
--2024-08-13 00:26:48.101553 UTC---
| Itration            | 1496      |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -5.01e+05 |
| Running Env Steps   | 7480000   |
| Running Forward KL  | 5.08      |
| Running Reverse KL  | 2.96      |
| Running Update Time | 1496      |
-----------------------------------
--2024-08-13 00:28:22.419599 UTC---
| Itration            | 1497      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -4.83e+05 |
| Running Env Steps   | 7485000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 2.41      |
| Running Update Time | 1497      |
-----------------------------------
--2024-08-13 00:29:54.105802 UTC---
| Itration            | 1498      |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -5.83e+05 |
| Running Env Steps   | 7490000   |
| Running Forward KL  | 5.07      |
| Running Reverse KL  | 2.89      |
| Running Update Time | 1498      |
-----------------------------------
--2024-08-13 00:31:31.693570 UTC---
| Itration            | 1499      |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -8.68e+05 |
| Running Env Steps   | 7495000   |
| Running Forward KL  | 4.95      |
| Running Reverse KL  | 7.16      |
| Running Update Time | 1499      |
-----------------------------------
