Logging to logs/Walker2dFH-v0/exp-16/pagar_fkl/2024_08_11_06_16_17
--2024-08-11 06:18:53.057138 UTC--
| Itration            | 0        |
| PAGAR Loss          | 128      |
| Real Det Return     | -16.7    |
| Real Sto Return     | -22.4    |
| Reward Loss         | 1.64e+05 |
| Running Env Steps   | 0        |
| Running Forward KL  | 26.5     |
| Running Reverse KL  | 397      |
| Running Update Time | 0        |
----------------------------------
--2024-08-11 06:21:38.163124 UTC---
| Itration            | 1         |
| PAGAR Loss          | -101      |
| Real Det Return     | -32.1     |
| Real Sto Return     | -25       |
| Reward Loss         | -3.78e+04 |
| Running Env Steps   | 5000      |
| Running Forward KL  | 25.6      |
| Running Reverse KL  | 387       |
| Running Update Time | 1         |
-----------------------------------
--2024-08-11 06:24:33.233210 UTC---
| Itration            | 2         |
| PAGAR Loss          | -1.24e+04 |
| Real Det Return     | 272       |
| Real Sto Return     | 306       |
| Reward Loss         | 4.68e+05  |
| Running Env Steps   | 10000     |
| Running Forward KL  | 21.9      |
| Running Reverse KL  | 307       |
| Running Update Time | 2         |
-----------------------------------
--2024-08-11 06:27:31.640146 UTC---
| Itration            | 3         |
| PAGAR Loss          | -6.39e+03 |
| Real Det Return     | 433       |
| Real Sto Return     | 363       |
| Reward Loss         | 3.54e+05  |
| Running Env Steps   | 15000     |
| Running Forward KL  | 22.1      |
| Running Reverse KL  | 330       |
| Running Update Time | 3         |
-----------------------------------
--2024-08-11 06:30:27.128503 UTC--
| Itration            | 4        |
| PAGAR Loss          | 1.32e+03 |
| Real Det Return     | 391      |
| Real Sto Return     | 350      |
| Reward Loss         | 7.01e+04 |
| Running Env Steps   | 20000    |
| Running Forward KL  | 21.8     |
| Running Reverse KL  | 286      |
| Running Update Time | 4        |
----------------------------------
--2024-08-11 06:33:24.294032 UTC--
| Itration            | 5        |
| PAGAR Loss          | 251      |
| Real Det Return     | 477      |
| Real Sto Return     | 446      |
| Reward Loss         | 1.17e+06 |
| Running Env Steps   | 25000    |
| Running Forward KL  | 22.3     |
| Running Reverse KL  | 322      |
| Running Update Time | 5        |
----------------------------------
--2024-08-11 06:36:25.288914 UTC--
| Itration            | 6        |
| PAGAR Loss          | 363      |
| Real Det Return     | 675      |
| Real Sto Return     | 505      |
| Reward Loss         | 4.23e+05 |
| Running Env Steps   | 30000    |
| Running Forward KL  | 21.9     |
| Running Reverse KL  | 299      |
| Running Update Time | 6        |
----------------------------------
--2024-08-11 06:39:38.195654 UTC--
| Itration            | 7        |
| PAGAR Loss          | -46.9    |
| Real Det Return     | 789      |
| Real Sto Return     | 591      |
| Reward Loss         | 7.64e+05 |
| Running Env Steps   | 35000    |
| Running Forward KL  | 20.9     |
| Running Reverse KL  | 262      |
| Running Update Time | 7        |
----------------------------------
--2024-08-11 06:43:14.010539 UTC--
| Itration            | 8        |
| PAGAR Loss          | -232     |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.04e+03 |
| Reward Loss         | 1.14e+06 |
| Running Env Steps   | 40000    |
| Running Forward KL  | 20.7     |
| Running Reverse KL  | 71.7     |
| Running Update Time | 8        |
----------------------------------
--2024-08-11 06:46:49.954803 UTC--
| Itration            | 9        |
| PAGAR Loss          | 1.23e+03 |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 836      |
| Reward Loss         | 1.23e+06 |
| Running Env Steps   | 45000    |
| Running Forward KL  | 21.9     |
| Running Reverse KL  | 75       |
| Running Update Time | 9        |
----------------------------------
--2024-08-11 06:50:27.245316 UTC--
| Itration            | 10       |
| PAGAR Loss          | 495      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.01e+03 |
| Reward Loss         | 1.36e+06 |
| Running Env Steps   | 50000    |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 32.5     |
| Running Update Time | 10       |
----------------------------------
--2024-08-11 06:54:07.535989 UTC--
| Itration            | 11       |
| PAGAR Loss          | -471     |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 979      |
| Reward Loss         | 1.23e+06 |
| Running Env Steps   | 55000    |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 35.7     |
| Running Update Time | 11       |
----------------------------------
--2024-08-11 06:57:39.697287 UTC---
| Itration            | 12        |
| PAGAR Loss          | -1.42e+03 |
| Real Det Return     | 946       |
| Real Sto Return     | 891       |
| Reward Loss         | 1.15e+06  |
| Running Env Steps   | 60000     |
| Running Forward KL  | 21.6      |
| Running Reverse KL  | 36.1      |
| Running Update Time | 12        |
-----------------------------------
--2024-08-11 07:01:14.226579 UTC--
| Itration            | 13       |
| PAGAR Loss          | -470     |
| Real Det Return     | 951      |
| Real Sto Return     | 946      |
| Reward Loss         | 8.62e+05 |
| Running Env Steps   | 65000    |
| Running Forward KL  | 21       |
| Running Reverse KL  | 86.6     |
| Running Update Time | 13       |
----------------------------------
--2024-08-11 07:04:49.553864 UTC--
| Itration            | 14       |
| PAGAR Loss          | -192     |
| Real Det Return     | 897      |
| Real Sto Return     | 1.03e+03 |
| Reward Loss         | 9.16e+05 |
| Running Env Steps   | 70000    |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 45.1     |
| Running Update Time | 14       |
----------------------------------
--2024-08-11 07:08:28.049643 UTC--
| Itration            | 15       |
| PAGAR Loss          | -808     |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.06e+03 |
| Reward Loss         | 9.96e+05 |
| Running Env Steps   | 75000    |
| Running Forward KL  | 21.3     |
| Running Reverse KL  | 24.8     |
| Running Update Time | 15       |
----------------------------------
--2024-08-11 07:12:06.778680 UTC--
| Itration            | 16       |
| PAGAR Loss          | -825     |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.04e+03 |
| Reward Loss         | 1.07e+06 |
| Running Env Steps   | 80000    |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 23.4     |
| Running Update Time | 16       |
----------------------------------
--2024-08-11 07:15:42.862283 UTC--
| Itration            | 17       |
| PAGAR Loss          | -58.2    |
| Real Det Return     | 951      |
| Real Sto Return     | 1.03e+03 |
| Reward Loss         | 5.39e+05 |
| Running Env Steps   | 85000    |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 93.4     |
| Running Update Time | 17       |
----------------------------------
--2024-08-11 07:19:27.133118 UTC--
| Itration            | 18       |
| PAGAR Loss          | 267      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.08e+03 |
| Reward Loss         | 1.04e+06 |
| Running Env Steps   | 90000    |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 14.1     |
| Running Update Time | 18       |
----------------------------------
--2024-08-11 07:23:06.561451 UTC--
| Itration            | 19       |
| PAGAR Loss          | 116      |
| Real Det Return     | 1.01e+03 |
| Real Sto Return     | 1.08e+03 |
| Reward Loss         | 9.4e+05  |
| Running Env Steps   | 95000    |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 13.5     |
| Running Update Time | 19       |
----------------------------------
--2024-08-11 07:26:47.621442 UTC--
| Itration            | 20       |
| PAGAR Loss          | 276      |
| Real Det Return     | 960      |
| Real Sto Return     | 1.1e+03  |
| Reward Loss         | 8.84e+05 |
| Running Env Steps   | 100000   |
| Running Forward KL  | 21.6     |
| Running Reverse KL  | 24.3     |
| Running Update Time | 20       |
----------------------------------
--2024-08-11 07:30:27.276665 UTC--
| Itration            | 21       |
| PAGAR Loss          | 595      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.09e+03 |
| Reward Loss         | 8.55e+05 |
| Running Env Steps   | 105000   |
| Running Forward KL  | 21.7     |
| Running Reverse KL  | 14       |
| Running Update Time | 21       |
----------------------------------
--2024-08-11 07:34:07.061085 UTC--
| Itration            | 22       |
| PAGAR Loss          | 1.45e+03 |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.08e+03 |
| Reward Loss         | 8.26e+05 |
| Running Env Steps   | 110000   |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 14       |
| Running Update Time | 22       |
----------------------------------
--2024-08-11 07:37:48.711186 UTC--
| Itration            | 23       |
| PAGAR Loss          | -201     |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.1e+03  |
| Reward Loss         | 7.11e+05 |
| Running Env Steps   | 115000   |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 15.3     |
| Running Update Time | 23       |
----------------------------------
--2024-08-11 07:41:26.969983 UTC--
| Itration            | 24       |
| PAGAR Loss          | -352     |
| Real Det Return     | 951      |
| Real Sto Return     | 1.09e+03 |
| Reward Loss         | 5.94e+05 |
| Running Env Steps   | 120000   |
| Running Forward KL  | 21.3     |
| Running Reverse KL  | 23.2     |
| Running Update Time | 24       |
----------------------------------
--2024-08-11 07:45:06.229368 UTC--
| Itration            | 25       |
| PAGAR Loss          | -766     |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.09e+03 |
| Reward Loss         | 5.02e+05 |
| Running Env Steps   | 125000   |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 13.3     |
| Running Update Time | 25       |
----------------------------------
--2024-08-11 07:48:44.445798 UTC--
| Itration            | 26       |
| PAGAR Loss          | 673      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.05e+03 |
| Reward Loss         | 4.72e+05 |
| Running Env Steps   | 130000   |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 13.5     |
| Running Update Time | 26       |
----------------------------------
--2024-08-11 07:52:25.292127 UTC--
| Itration            | 27       |
| PAGAR Loss          | 198      |
| Real Det Return     | 1.01e+03 |
| Real Sto Return     | 1.12e+03 |
| Reward Loss         | 5.23e+05 |
| Running Env Steps   | 135000   |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 13.6     |
| Running Update Time | 27       |
----------------------------------
--2024-08-11 07:56:06.844955 UTC--
| Itration            | 28       |
| PAGAR Loss          | 456      |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.12e+03 |
| Reward Loss         | 4.06e+05 |
| Running Env Steps   | 140000   |
| Running Forward KL  | 21.2     |
| Running Reverse KL  | 24       |
| Running Update Time | 28       |
----------------------------------
--2024-08-11 07:59:46.439127 UTC--
| Itration            | 29       |
| PAGAR Loss          | -498     |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.11e+03 |
| Reward Loss         | 3.3e+05  |
| Running Env Steps   | 145000   |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 13.6     |
| Running Update Time | 29       |
----------------------------------
--2024-08-11 08:03:29.718912 UTC---
| Itration            | 30        |
| PAGAR Loss          | -1.24e+03 |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | 3.04e+05  |
| Running Env Steps   | 150000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 30        |
-----------------------------------
--2024-08-11 08:07:09.607543 UTC--
| Itration            | 31       |
| PAGAR Loss          | -417     |
| Real Det Return     | 1.04e+03 |
| Real Sto Return     | 1.13e+03 |
| Reward Loss         | 2.06e+05 |
| Running Env Steps   | 155000   |
| Running Forward KL  | 21.5     |
| Running Reverse KL  | 13       |
| Running Update Time | 31       |
----------------------------------
--2024-08-11 08:10:53.085795 UTC--
| Itration            | 32       |
| PAGAR Loss          | -762     |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.13e+03 |
| Reward Loss         | 9.23e+04 |
| Running Env Steps   | 160000   |
| Running Forward KL  | 21.1     |
| Running Reverse KL  | 15       |
| Running Update Time | 32       |
----------------------------------
--2024-08-11 08:14:32.446946 UTC--
| Itration            | 33       |
| PAGAR Loss          | 192      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.09e+03 |
| Reward Loss         | 7.4e+04  |
| Running Env Steps   | 165000   |
| Running Forward KL  | 21.4     |
| Running Reverse KL  | 13.4     |
| Running Update Time | 33       |
----------------------------------
--2024-08-11 08:18:14.603077 UTC---
| Itration            | 34        |
| PAGAR Loss          | -1.23e+03 |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | 3.92e+04  |
| Running Env Steps   | 170000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.5      |
| Running Update Time | 34        |
-----------------------------------
--2024-08-11 08:21:56.267585 UTC---
| Itration            | 35        |
| PAGAR Loss          | 407       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -3.82e+04 |
| Running Env Steps   | 175000    |
| Running Forward KL  | 21.5      |
| Running Reverse KL  | 13.7      |
| Running Update Time | 35        |
-----------------------------------
--2024-08-11 08:25:37.588229 UTC---
| Itration            | 36        |
| PAGAR Loss          | -435      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -1.29e+05 |
| Running Env Steps   | 180000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 36        |
-----------------------------------
--2024-08-11 08:29:21.871806 UTC---
| Itration            | 37        |
| PAGAR Loss          | -537      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -2.93e+05 |
| Running Env Steps   | 185000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 37        |
-----------------------------------
--2024-08-11 08:32:59.822999 UTC---
| Itration            | 38        |
| PAGAR Loss          | -305      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -2.35e+05 |
| Running Env Steps   | 190000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 38        |
-----------------------------------
--2024-08-11 08:36:44.201263 UTC---
| Itration            | 39        |
| PAGAR Loss          | -635      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -3.33e+05 |
| Running Env Steps   | 195000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 39        |
-----------------------------------
--2024-08-11 08:40:23.528289 UTC---
| Itration            | 40        |
| PAGAR Loss          | -47.5     |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -3.87e+05 |
| Running Env Steps   | 200000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 40        |
-----------------------------------
--2024-08-11 08:44:05.949446 UTC---
| Itration            | 41        |
| PAGAR Loss          | -407      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -4.92e+05 |
| Running Env Steps   | 205000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.1      |
| Running Update Time | 41        |
-----------------------------------
--2024-08-11 08:47:48.240843 UTC---
| Itration            | 42        |
| PAGAR Loss          | 98.6      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -5.43e+05 |
| Running Env Steps   | 210000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 42        |
-----------------------------------
--2024-08-11 08:51:28.833134 UTC---
| Itration            | 43        |
| PAGAR Loss          | 540       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -5.84e+05 |
| Running Env Steps   | 215000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.1      |
| Running Update Time | 43        |
-----------------------------------
--2024-08-11 08:55:11.066088 UTC---
| Itration            | 44        |
| PAGAR Loss          | -234      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -6.09e+05 |
| Running Env Steps   | 220000    |
| Running Forward KL  | 21.3      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 44        |
-----------------------------------
--2024-08-11 08:58:50.843097 UTC---
| Itration            | 45        |
| PAGAR Loss          | 1.68e+03  |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -6.32e+05 |
| Running Env Steps   | 225000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 45        |
-----------------------------------
--2024-08-11 09:02:33.447164 UTC---
| Itration            | 46        |
| PAGAR Loss          | 248       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -7.93e+05 |
| Running Env Steps   | 230000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 13.2      |
| Running Update Time | 46        |
-----------------------------------
--2024-08-11 09:06:14.360003 UTC---
| Itration            | 47        |
| PAGAR Loss          | -862      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -8.62e+05 |
| Running Env Steps   | 235000    |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 47        |
-----------------------------------
--2024-08-11 09:09:56.439733 UTC---
| Itration            | 48        |
| PAGAR Loss          | 328       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -8.59e+05 |
| Running Env Steps   | 240000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 13.2      |
| Running Update Time | 48        |
-----------------------------------
--2024-08-11 09:13:38.496546 UTC---
| Itration            | 49        |
| PAGAR Loss          | -645      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 245000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 49        |
-----------------------------------
--2024-08-11 09:17:17.987328 UTC---
| Itration            | 50        |
| PAGAR Loss          | 297       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -9.64e+05 |
| Running Env Steps   | 250000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 15.7      |
| Running Update Time | 50        |
-----------------------------------
--2024-08-11 09:20:57.032359 UTC---
| Itration            | 51        |
| PAGAR Loss          | 124       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 255000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 51        |
-----------------------------------
--2024-08-11 09:24:38.102600 UTC---
| Itration            | 52        |
| PAGAR Loss          | -1.13e+03 |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 260000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 52        |
-----------------------------------
--2024-08-11 09:28:20.446701 UTC---
| Itration            | 53        |
| PAGAR Loss          | -1.02e+03 |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 265000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 53        |
-----------------------------------
--2024-08-11 09:32:02.249359 UTC---
| Itration            | 54        |
| PAGAR Loss          | -423      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 270000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 54        |
-----------------------------------
--2024-08-11 09:35:45.672948 UTC---
| Itration            | 55        |
| PAGAR Loss          | -1.03e+03 |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 275000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 55        |
-----------------------------------
--2024-08-11 09:39:25.931528 UTC--
| Itration            | 56       |
| PAGAR Loss          | -380     |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.15e+03 |
| Reward Loss         | -1.3e+06 |
| Running Env Steps   | 280000   |
| Running Forward KL  | 21       |
| Running Reverse KL  | 13.1     |
| Running Update Time | 56       |
----------------------------------
--2024-08-11 09:43:08.311641 UTC---
| Itration            | 57        |
| PAGAR Loss          | -1.27e+03 |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 285000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 57        |
-----------------------------------
--2024-08-11 09:46:50.566290 UTC---
| Itration            | 58        |
| PAGAR Loss          | -783      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 290000    |
| Running Forward KL  | 21.2      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 58        |
-----------------------------------
--2024-08-11 09:50:32.530932 UTC--
| Itration            | 59       |
| PAGAR Loss          | 513      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.13e+03 |
| Reward Loss         | -1.5e+06 |
| Running Env Steps   | 295000   |
| Running Forward KL  | 20.9     |
| Running Reverse KL  | 13       |
| Running Update Time | 59       |
----------------------------------
--2024-08-11 09:54:15.051067 UTC---
| Itration            | 60        |
| PAGAR Loss          | -270      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 300000    |
| Running Forward KL  | 21.1      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 60        |
-----------------------------------
--2024-08-11 09:57:58.156500 UTC---
| Itration            | 61        |
| PAGAR Loss          | 281       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 305000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 61        |
-----------------------------------
--2024-08-11 10:01:40.549398 UTC---
| Itration            | 62        |
| PAGAR Loss          | -621      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 310000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 13        |
| Running Update Time | 62        |
-----------------------------------
--2024-08-11 10:05:24.125032 UTC---
| Itration            | 63        |
| PAGAR Loss          | -875      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 315000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 63        |
-----------------------------------
--2024-08-11 10:09:06.278971 UTC---
| Itration            | 64        |
| PAGAR Loss          | 254       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -1.87e+06 |
| Running Env Steps   | 320000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 64        |
-----------------------------------
--2024-08-11 10:12:48.229630 UTC---
| Itration            | 65        |
| PAGAR Loss          | -542      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -1.97e+06 |
| Running Env Steps   | 325000    |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 65        |
-----------------------------------
--2024-08-11 10:16:32.707828 UTC---
| Itration            | 66        |
| PAGAR Loss          | -854      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -2.02e+06 |
| Running Env Steps   | 330000    |
| Running Forward KL  | 21        |
| Running Reverse KL  | 12.8      |
| Running Update Time | 66        |
-----------------------------------
--2024-08-11 10:20:11.551542 UTC---
| Itration            | 67        |
| PAGAR Loss          | -610      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 335000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 67        |
-----------------------------------
--2024-08-11 10:23:56.284285 UTC---
| Itration            | 68        |
| PAGAR Loss          | -1.42e+03 |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 340000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 68        |
-----------------------------------
--2024-08-11 10:27:36.864560 UTC---
| Itration            | 69        |
| PAGAR Loss          | -790      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -2.14e+06 |
| Running Env Steps   | 345000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 17        |
| Running Update Time | 69        |
-----------------------------------
--2024-08-11 10:31:18.762457 UTC---
| Itration            | 70        |
| PAGAR Loss          | -955      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -2.29e+06 |
| Running Env Steps   | 350000    |
| Running Forward KL  | 20.8      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 70        |
-----------------------------------
--2024-08-11 10:35:02.077253 UTC---
| Itration            | 71        |
| PAGAR Loss          | -287      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -2.29e+06 |
| Running Env Steps   | 355000    |
| Running Forward KL  | 20.7      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 71        |
-----------------------------------
--2024-08-11 10:38:44.820987 UTC---
| Itration            | 72        |
| PAGAR Loss          | 77.9      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -2.36e+06 |
| Running Env Steps   | 360000    |
| Running Forward KL  | 20.9      |
| Running Reverse KL  | 13        |
| Running Update Time | 72        |
-----------------------------------
--2024-08-11 10:42:28.133766 UTC---
| Itration            | 73        |
| PAGAR Loss          | -656      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.22e+03  |
| Reward Loss         | -2.31e+06 |
| Running Env Steps   | 365000    |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 19.5      |
| Running Update Time | 73        |
-----------------------------------
--2024-08-11 10:46:08.033884 UTC---
| Itration            | 74        |
| PAGAR Loss          | -196      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -2.39e+06 |
| Running Env Steps   | 370000    |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 74        |
-----------------------------------
--2024-08-11 10:49:53.356663 UTC---
| Itration            | 75        |
| PAGAR Loss          | -407      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -2.46e+06 |
| Running Env Steps   | 375000    |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 75        |
-----------------------------------
--2024-08-11 10:53:35.793325 UTC---
| Itration            | 76        |
| PAGAR Loss          | -630      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -2.44e+06 |
| Running Env Steps   | 380000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 30.5      |
| Running Update Time | 76        |
-----------------------------------
--2024-08-11 10:57:18.648672 UTC---
| Itration            | 77        |
| PAGAR Loss          | -1.17e+03 |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -2.69e+06 |
| Running Env Steps   | 385000    |
| Running Forward KL  | 20.6      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 77        |
-----------------------------------
--2024-08-11 11:01:02.663923 UTC---
| Itration            | 78        |
| PAGAR Loss          | -364      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -2.77e+06 |
| Running Env Steps   | 390000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 22.8      |
| Running Update Time | 78        |
-----------------------------------
--2024-08-11 11:04:46.329601 UTC---
| Itration            | 79        |
| PAGAR Loss          | 467       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -2.69e+06 |
| Running Env Steps   | 395000    |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 79        |
-----------------------------------
--2024-08-11 11:08:30.023129 UTC---
| Itration            | 80        |
| PAGAR Loss          | -746      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -2.78e+06 |
| Running Env Steps   | 400000    |
| Running Forward KL  | 20.4      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 80        |
-----------------------------------
--2024-08-11 11:12:10.771239 UTC---
| Itration            | 81        |
| PAGAR Loss          | -113      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -2.89e+06 |
| Running Env Steps   | 405000    |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 81        |
-----------------------------------
--2024-08-11 11:15:55.834373 UTC---
| Itration            | 82        |
| PAGAR Loss          | 781       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -2.76e+06 |
| Running Env Steps   | 410000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 82        |
-----------------------------------
--2024-08-11 11:19:40.090556 UTC---
| Itration            | 83        |
| PAGAR Loss          | -1.05e+03 |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -2.93e+06 |
| Running Env Steps   | 415000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12        |
| Running Update Time | 83        |
-----------------------------------
--2024-08-11 11:23:24.313116 UTC---
| Itration            | 84        |
| PAGAR Loss          | 315       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.21e+03  |
| Reward Loss         | -2.91e+06 |
| Running Env Steps   | 420000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 12        |
| Running Update Time | 84        |
-----------------------------------
--2024-08-11 11:27:07.739557 UTC---
| Itration            | 85        |
| PAGAR Loss          | 1.64e+03  |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.2e+03   |
| Reward Loss         | -3.05e+06 |
| Running Env Steps   | 425000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 85        |
-----------------------------------
--2024-08-11 11:30:48.935855 UTC---
| Itration            | 86        |
| PAGAR Loss          | -808      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -3.14e+06 |
| Running Env Steps   | 430000    |
| Running Forward KL  | 20.3      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 86        |
-----------------------------------
--2024-08-11 11:34:30.115926 UTC---
| Itration            | 87        |
| PAGAR Loss          | 470       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -3.15e+06 |
| Running Env Steps   | 435000    |
| Running Forward KL  | 20.5      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 87        |
-----------------------------------
--2024-08-11 11:38:12.411916 UTC--
| Itration            | 88       |
| PAGAR Loss          | 309      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.19e+03 |
| Reward Loss         | -3.2e+06 |
| Running Env Steps   | 440000   |
| Running Forward KL  | 20.2     |
| Running Reverse KL  | 12.2     |
| Running Update Time | 88       |
----------------------------------
--2024-08-11 11:41:54.196592 UTC--
| Itration            | 89       |
| PAGAR Loss          | 124      |
| Real Det Return     | 1.03e+03 |
| Real Sto Return     | 1.18e+03 |
| Reward Loss         | -3.3e+06 |
| Running Env Steps   | 445000   |
| Running Forward KL  | 20.3     |
| Running Reverse KL  | 12.1     |
| Running Update Time | 89       |
----------------------------------
--2024-08-11 11:45:38.205823 UTC---
| Itration            | 90        |
| PAGAR Loss          | 1.27e+03  |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -3.29e+06 |
| Running Env Steps   | 450000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.3      |
| Running Update Time | 90        |
-----------------------------------
--2024-08-11 11:49:18.563936 UTC---
| Itration            | 91        |
| PAGAR Loss          | 451       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -3.25e+06 |
| Running Env Steps   | 455000    |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 11.9      |
| Running Update Time | 91        |
-----------------------------------
--2024-08-11 11:52:59.595725 UTC---
| Itration            | 92        |
| PAGAR Loss          | 165       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -3.41e+06 |
| Running Env Steps   | 460000    |
| Running Forward KL  | 20        |
| Running Reverse KL  | 11.8      |
| Running Update Time | 92        |
-----------------------------------
--2024-08-11 11:56:35.658135 UTC---
| Itration            | 93        |
| PAGAR Loss          | -630      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -3.31e+06 |
| Running Env Steps   | 465000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 26.1      |
| Running Update Time | 93        |
-----------------------------------
--2024-08-11 12:00:17.416044 UTC---
| Itration            | 94        |
| PAGAR Loss          | -1.5e+03  |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -3.49e+06 |
| Running Env Steps   | 470000    |
| Running Forward KL  | 20        |
| Running Reverse KL  | 33.8      |
| Running Update Time | 94        |
-----------------------------------
--2024-08-11 12:04:01.057674 UTC---
| Itration            | 95        |
| PAGAR Loss          | 291       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.24e+03  |
| Reward Loss         | -3.46e+06 |
| Running Env Steps   | 475000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 95        |
-----------------------------------
--2024-08-11 12:07:43.506581 UTC---
| Itration            | 96        |
| PAGAR Loss          | -628      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -3.67e+06 |
| Running Env Steps   | 480000    |
| Running Forward KL  | 20.2      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 96        |
-----------------------------------
--2024-08-11 12:11:28.022250 UTC---
| Itration            | 97        |
| PAGAR Loss          | -198      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -3.58e+06 |
| Running Env Steps   | 485000    |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 97        |
-----------------------------------
--2024-08-11 12:15:07.801275 UTC---
| Itration            | 98        |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -3.59e+06 |
| Running Env Steps   | 490000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 21.2      |
| Running Update Time | 98        |
-----------------------------------
--2024-08-11 12:18:46.635455 UTC---
| Itration            | 99        |
| PAGAR Loss          | -1.27e+03 |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -3.93e+06 |
| Running Env Steps   | 495000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 40.3      |
| Running Update Time | 99        |
-----------------------------------
--2024-08-11 12:22:25.910288 UTC---
| Itration            | 100       |
| PAGAR Loss          | 419       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -3.66e+06 |
| Running Env Steps   | 500000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 100       |
-----------------------------------
--2024-08-11 12:26:07.183707 UTC---
| Itration            | 101       |
| PAGAR Loss          | 74.1      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -4.11e+06 |
| Running Env Steps   | 505000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 43.7      |
| Running Update Time | 101       |
-----------------------------------
--2024-08-11 12:29:48.561701 UTC---
| Itration            | 102       |
| PAGAR Loss          | -1.6e+03  |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -3.51e+06 |
| Running Env Steps   | 510000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 59.6      |
| Running Update Time | 102       |
-----------------------------------
--2024-08-11 12:33:31.588558 UTC---
| Itration            | 103       |
| PAGAR Loss          | -526      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -3.61e+06 |
| Running Env Steps   | 515000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 43.6      |
| Running Update Time | 103       |
-----------------------------------
--2024-08-11 12:37:12.867095 UTC---
| Itration            | 104       |
| PAGAR Loss          | 106       |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -3.56e+06 |
| Running Env Steps   | 520000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 84.6      |
| Running Update Time | 104       |
-----------------------------------
--2024-08-11 12:40:50.445764 UTC---
| Itration            | 105       |
| PAGAR Loss          | 162       |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 525000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 72.9      |
| Running Update Time | 105       |
-----------------------------------
--2024-08-11 12:44:18.027598 UTC---
| Itration            | 106       |
| PAGAR Loss          | -1.08e+04 |
| Real Det Return     | 1.1e+03   |
| Real Sto Return     | 711       |
| Reward Loss         | -3.93e+06 |
| Running Env Steps   | 530000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 200       |
| Running Update Time | 106       |
-----------------------------------
--2024-08-11 12:47:54.022501 UTC---
| Itration            | 107       |
| PAGAR Loss          | -472      |
| Real Det Return     | 1.04e+03  |
| Real Sto Return     | 1.01e+03  |
| Reward Loss         | -3.75e+06 |
| Running Env Steps   | 535000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 77.1      |
| Running Update Time | 107       |
-----------------------------------
--2024-08-11 12:51:31.128563 UTC---
| Itration            | 108       |
| PAGAR Loss          | 1.66e+03  |
| Real Det Return     | 1.25e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -3.43e+06 |
| Running Env Steps   | 540000    |
| Running Forward KL  | 19        |
| Running Reverse KL  | 117       |
| Running Update Time | 108       |
-----------------------------------
--2024-08-11 12:55:12.398129 UTC---
| Itration            | 109       |
| PAGAR Loss          | 5.95e+03  |
| Real Det Return     | 1.29e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -3.91e+06 |
| Running Env Steps   | 545000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 51.5      |
| Running Update Time | 109       |
-----------------------------------
--2024-08-11 12:58:46.165446 UTC---
| Itration            | 110       |
| PAGAR Loss          | -1.18e+03 |
| Real Det Return     | 1.02e+03  |
| Real Sto Return     | 1.03e+03  |
| Reward Loss         | -3.77e+06 |
| Running Env Steps   | 550000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 140       |
| Running Update Time | 110       |
-----------------------------------
--2024-08-11 13:02:25.360406 UTC---
| Itration            | 111       |
| PAGAR Loss          | 4.06e+03  |
| Real Det Return     | 1.25e+03  |
| Real Sto Return     | 1.06e+03  |
| Reward Loss         | -3.54e+06 |
| Running Env Steps   | 555000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 111       |
-----------------------------------
--2024-08-11 13:06:07.066414 UTC--
| Itration            | 112      |
| PAGAR Loss          | -933     |
| Real Det Return     | 1.29e+03 |
| Real Sto Return     | 1.17e+03 |
| Reward Loss         | -3.8e+06 |
| Running Env Steps   | 560000   |
| Running Forward KL  | 19       |
| Running Reverse KL  | 35.3     |
| Running Update Time | 112      |
----------------------------------
--2024-08-11 13:09:49.968513 UTC---
| Itration            | 113       |
| PAGAR Loss          | -1.5e+03  |
| Real Det Return     | 1.13e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -3.83e+06 |
| Running Env Steps   | 565000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 67.1      |
| Running Update Time | 113       |
-----------------------------------
--2024-08-11 13:13:28.245076 UTC---
| Itration            | 114       |
| PAGAR Loss          | 702       |
| Real Det Return     | 1.19e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 570000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 61.6      |
| Running Update Time | 114       |
-----------------------------------
--2024-08-11 13:16:58.449671 UTC---
| Itration            | 115       |
| PAGAR Loss          | -2.77e+03 |
| Real Det Return     | 1.19e+03  |
| Real Sto Return     | 798       |
| Reward Loss         | -3.96e+06 |
| Running Env Steps   | 575000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 162       |
| Running Update Time | 115       |
-----------------------------------
--2024-08-11 13:20:36.313402 UTC---
| Itration            | 116       |
| PAGAR Loss          | 1.89e+03  |
| Real Det Return     | 1.12e+03  |
| Real Sto Return     | 1.02e+03  |
| Reward Loss         | -3.66e+06 |
| Running Env Steps   | 580000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 70.8      |
| Running Update Time | 116       |
-----------------------------------
--2024-08-11 13:24:16.746128 UTC---
| Itration            | 117       |
| PAGAR Loss          | -839      |
| Real Det Return     | 1.03e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -4.03e+06 |
| Running Env Steps   | 585000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 117       |
-----------------------------------
--2024-08-11 13:27:52.016254 UTC---
| Itration            | 118       |
| PAGAR Loss          | 1.09e+03  |
| Real Det Return     | 1.15e+03  |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 590000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 45.8      |
| Running Update Time | 118       |
-----------------------------------
--2024-08-11 13:30:50.357817 UTC---
| Itration            | 119       |
| PAGAR Loss          | -1.1e+03  |
| Real Det Return     | 1.1e+03   |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -3.85e+06 |
| Running Env Steps   | 595000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 119       |
-----------------------------------
--2024-08-11 13:33:52.521571 UTC--
| Itration            | 120      |
| PAGAR Loss          | nan      |
| Real Det Return     | 1.14e+03 |
| Real Sto Return     | 1.18e+03 |
| Reward Loss         | -4.1e+06 |
| Running Env Steps   | 600000   |
| Running Forward KL  | 19.3     |
| Running Reverse KL  | 49.4     |
| Running Update Time | 120      |
----------------------------------
--2024-08-11 13:36:53.423387 UTC---
| Itration            | 121       |
| PAGAR Loss          | 3.93e+04  |
| Real Det Return     | 1.3e+03   |
| Real Sto Return     | 1.27e+03  |
| Reward Loss         | -4.21e+06 |
| Running Env Steps   | 605000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 32.3      |
| Running Update Time | 121       |
-----------------------------------
--2024-08-11 13:39:54.733258 UTC---
| Itration            | 122       |
| PAGAR Loss          | 1.19e+04  |
| Real Det Return     | 1.16e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.23e+06 |
| Running Env Steps   | 610000    |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 122       |
-----------------------------------
--2024-08-11 13:42:55.517189 UTC---
| Itration            | 123       |
| PAGAR Loss          | 2.15e+03  |
| Real Det Return     | 1.21e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -4.25e+06 |
| Running Env Steps   | 615000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 34.3      |
| Running Update Time | 123       |
-----------------------------------
--2024-08-11 13:45:53.105443 UTC---
| Itration            | 124       |
| PAGAR Loss          | 989       |
| Real Det Return     | 964       |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -4.26e+06 |
| Running Env Steps   | 620000    |
| Running Forward KL  | 20        |
| Running Reverse KL  | 12.9      |
| Running Update Time | 124       |
-----------------------------------
--2024-08-11 13:48:52.800992 UTC---
| Itration            | 125       |
| PAGAR Loss          | -357      |
| Real Det Return     | 1.15e+03  |
| Real Sto Return     | 984       |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 625000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 125       |
-----------------------------------
--2024-08-11 13:51:52.739344 UTC---
| Itration            | 126       |
| PAGAR Loss          | 2.78e+03  |
| Real Det Return     | 1.17e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 630000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 126       |
-----------------------------------
--2024-08-11 13:54:51.574511 UTC---
| Itration            | 127       |
| PAGAR Loss          | -631      |
| Real Det Return     | 1.08e+03  |
| Real Sto Return     | 1.01e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 635000    |
| Running Forward KL  | 19.9      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 127       |
-----------------------------------
--2024-08-11 13:57:51.989733 UTC--
| Itration            | 128      |
| PAGAR Loss          | 1.14e+03 |
| Real Det Return     | 1.12e+03 |
| Real Sto Return     | 1.1e+03  |
| Reward Loss         | -4.4e+06 |
| Running Env Steps   | 640000   |
| Running Forward KL  | 19.6     |
| Running Reverse KL  | 38.6     |
| Running Update Time | 128      |
----------------------------------
--2024-08-11 14:00:51.643698 UTC---
| Itration            | 129       |
| PAGAR Loss          | -1.07e+03 |
| Real Det Return     | 1.06e+03  |
| Real Sto Return     | 1.07e+03  |
| Reward Loss         | -4.64e+06 |
| Running Env Steps   | 645000    |
| Running Forward KL  | 20.1      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 129       |
-----------------------------------
--2024-08-11 14:03:52.983415 UTC---
| Itration            | 130       |
| PAGAR Loss          | -32.1     |
| Real Det Return     | 1.11e+03  |
| Real Sto Return     | 1.04e+03  |
| Reward Loss         | -4.49e+06 |
| Running Env Steps   | 650000    |
| Running Forward KL  | 20        |
| Running Reverse KL  | 13        |
| Running Update Time | 130       |
-----------------------------------
--2024-08-11 14:06:55.366037 UTC---
| Itration            | 131       |
| PAGAR Loss          | 344       |
| Real Det Return     | 1.16e+03  |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.55e+06 |
| Running Env Steps   | 655000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 131       |
-----------------------------------
--2024-08-11 14:09:59.150550 UTC--
| Itration            | 132      |
| PAGAR Loss          | -621     |
| Real Det Return     | 1.12e+03 |
| Real Sto Return     | 1.09e+03 |
| Reward Loss         | -4.8e+06 |
| Running Env Steps   | 660000   |
| Running Forward KL  | 20.1     |
| Running Reverse KL  | 13       |
| Running Update Time | 132      |
----------------------------------
--2024-08-11 14:13:00.099784 UTC---
| Itration            | 133       |
| PAGAR Loss          | -586      |
| Real Det Return     | 1.11e+03  |
| Real Sto Return     | 1.17e+03  |
| Reward Loss         | -4.95e+06 |
| Running Env Steps   | 665000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 42.4      |
| Running Update Time | 133       |
-----------------------------------
--2024-08-11 14:16:00.802710 UTC---
| Itration            | 134       |
| PAGAR Loss          | -108      |
| Real Det Return     | 1.2e+03   |
| Real Sto Return     | 1.1e+03   |
| Reward Loss         | -4.74e+06 |
| Running Env Steps   | 670000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 134       |
-----------------------------------
--2024-08-11 14:19:02.043380 UTC--
| Itration            | 135      |
| PAGAR Loss          | -199     |
| Real Det Return     | 1.31e+03 |
| Real Sto Return     | 1.18e+03 |
| Reward Loss         | -4.9e+06 |
| Running Env Steps   | 675000   |
| Running Forward KL  | 19.6     |
| Running Reverse KL  | 12.2     |
| Running Update Time | 135      |
----------------------------------
--2024-08-11 14:22:02.353714 UTC---
| Itration            | 136       |
| PAGAR Loss          | 315       |
| Real Det Return     | 1.18e+03  |
| Real Sto Return     | 1.09e+03  |
| Reward Loss         | -4.96e+06 |
| Running Env Steps   | 680000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 12.8      |
| Running Update Time | 136       |
-----------------------------------
--2024-08-11 14:24:59.728624 UTC---
| Itration            | 137       |
| PAGAR Loss          | -2.84e+03 |
| Real Det Return     | 1.24e+03  |
| Real Sto Return     | 1.15e+03  |
| Reward Loss         | -4.85e+06 |
| Running Env Steps   | 685000    |
| Running Forward KL  | 19.8      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 137       |
-----------------------------------
--2024-08-11 14:28:01.460934 UTC---
| Itration            | 138       |
| PAGAR Loss          | 1.3e+03   |
| Real Det Return     | 1.1e+03   |
| Real Sto Return     | 1.12e+03  |
| Reward Loss         | -4.97e+06 |
| Running Env Steps   | 690000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 138       |
-----------------------------------
--2024-08-11 14:31:03.583309 UTC---
| Itration            | 139       |
| PAGAR Loss          | 797       |
| Real Det Return     | 1.34e+03  |
| Real Sto Return     | 1.21e+03  |
| Reward Loss         | -5.05e+06 |
| Running Env Steps   | 695000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 139       |
-----------------------------------
--2024-08-11 14:34:06.093506 UTC---
| Itration            | 140       |
| PAGAR Loss          | 1.8e+03   |
| Real Det Return     | 1.36e+03  |
| Real Sto Return     | 1.24e+03  |
| Reward Loss         | -4.56e+06 |
| Running Env Steps   | 700000    |
| Running Forward KL  | 18.9      |
| Running Reverse KL  | 41.2      |
| Running Update Time | 140       |
-----------------------------------
--2024-08-11 14:37:09.999394 UTC---
| Itration            | 141       |
| PAGAR Loss          | -1.53e+03 |
| Real Det Return     | 1.23e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -5.23e+06 |
| Running Env Steps   | 705000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 30.6      |
| Running Update Time | 141       |
-----------------------------------
--2024-08-11 14:40:11.035398 UTC---
| Itration            | 142       |
| PAGAR Loss          | -454      |
| Real Det Return     | 1.29e+03  |
| Real Sto Return     | 1.2e+03   |
| Reward Loss         | -5.17e+06 |
| Running Env Steps   | 710000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 12.2      |
| Running Update Time | 142       |
-----------------------------------
--2024-08-11 14:43:13.829923 UTC---
| Itration            | 143       |
| PAGAR Loss          | -1.09e+03 |
| Real Det Return     | 1.25e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -5.24e+06 |
| Running Env Steps   | 715000    |
| Running Forward KL  | 19.7      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 143       |
-----------------------------------
--2024-08-11 14:46:13.670014 UTC---
| Itration            | 144       |
| PAGAR Loss          | -657      |
| Real Det Return     | 1.38e+03  |
| Real Sto Return     | 1.23e+03  |
| Reward Loss         | -5.35e+06 |
| Running Env Steps   | 720000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 11.8      |
| Running Update Time | 144       |
-----------------------------------
--2024-08-11 14:49:17.134946 UTC---
| Itration            | 145       |
| PAGAR Loss          | -4.29     |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.19e+03  |
| Reward Loss         | -5.33e+06 |
| Running Env Steps   | 725000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 145       |
-----------------------------------
--2024-08-11 14:52:19.383380 UTC---
| Itration            | 146       |
| PAGAR Loss          | 2.25e+03  |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.2e+03   |
| Reward Loss         | -5.37e+06 |
| Running Env Steps   | 730000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 32.6      |
| Running Update Time | 146       |
-----------------------------------
--2024-08-11 14:55:19.144749 UTC---
| Itration            | 147       |
| PAGAR Loss          | -762      |
| Real Det Return     | 1.37e+03  |
| Real Sto Return     | 1.14e+03  |
| Reward Loss         | -5.59e+06 |
| Running Env Steps   | 735000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 11.6      |
| Running Update Time | 147       |
-----------------------------------
--2024-08-11 14:58:21.514345 UTC---
| Itration            | 148       |
| PAGAR Loss          | -1.3e+03  |
| Real Det Return     | 1.33e+03  |
| Real Sto Return     | 1.22e+03  |
| Reward Loss         | -5.45e+06 |
| Running Env Steps   | 740000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 12.4      |
| Running Update Time | 148       |
-----------------------------------
--2024-08-11 15:01:22.827134 UTC---
| Itration            | 149       |
| PAGAR Loss          | -8.51e+03 |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.11e+03  |
| Reward Loss         | -5.59e+06 |
| Running Env Steps   | 745000    |
| Running Forward KL  | 19.6      |
| Running Reverse KL  | 12        |
| Running Update Time | 149       |
-----------------------------------
--2024-08-11 15:04:25.699546 UTC---
| Itration            | 150       |
| PAGAR Loss          | -2.44e+03 |
| Real Det Return     | 1.42e+03  |
| Real Sto Return     | 1.18e+03  |
| Reward Loss         | -5.93e+06 |
| Running Env Steps   | 750000    |
| Running Forward KL  | 19.3      |
| Running Reverse KL  | 48.8      |
| Running Update Time | 150       |
-----------------------------------
--2024-08-11 15:07:27.621600 UTC---
| Itration            | 151       |
| PAGAR Loss          | -1.19e+03 |
| Real Det Return     | 1.42e+03  |
| Real Sto Return     | 1.21e+03  |
| Reward Loss         | -5.51e+06 |
| Running Env Steps   | 755000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 28.9      |
| Running Update Time | 151       |
-----------------------------------
--2024-08-11 15:10:31.866993 UTC---
| Itration            | 152       |
| PAGAR Loss          | 2.36e+03  |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.16e+03  |
| Reward Loss         | -6.04e+06 |
| Running Env Steps   | 760000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 11.9      |
| Running Update Time | 152       |
-----------------------------------
--2024-08-11 15:13:33.902800 UTC---
| Itration            | 153       |
| PAGAR Loss          | 234       |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.23e+03  |
| Reward Loss         | -5.56e+06 |
| Running Env Steps   | 765000    |
| Running Forward KL  | 18.8      |
| Running Reverse KL  | 74.3      |
| Running Update Time | 153       |
-----------------------------------
--2024-08-11 15:16:37.186159 UTC---
| Itration            | 154       |
| PAGAR Loss          | -160      |
| Real Det Return     | 1.47e+03  |
| Real Sto Return     | 1.26e+03  |
| Reward Loss         | -5.89e+06 |
| Running Env Steps   | 770000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 21.8      |
| Running Update Time | 154       |
-----------------------------------
--2024-08-11 15:19:40.866164 UTC---
| Itration            | 155       |
| PAGAR Loss          | 1.37e+03  |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.23e+03  |
| Reward Loss         | -6.06e+06 |
| Running Env Steps   | 775000    |
| Running Forward KL  | 19.5      |
| Running Reverse KL  | 11.7      |
| Running Update Time | 155       |
-----------------------------------
--2024-08-11 15:22:42.102429 UTC---
| Itration            | 156       |
| PAGAR Loss          | -2.97e+03 |
| Real Det Return     | 1.23e+03  |
| Real Sto Return     | 1.13e+03  |
| Reward Loss         | -6.08e+06 |
| Running Env Steps   | 780000    |
| Running Forward KL  | 19.4      |
| Running Reverse KL  | 11.9      |
| Running Update Time | 156       |
-----------------------------------
--2024-08-11 15:25:44.820210 UTC---
| Itration            | 157       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.22e+03  |
| Reward Loss         | -6.24e+06 |
| Running Env Steps   | 785000    |
| Running Forward KL  | 19.1      |
| Running Reverse KL  | 32.5      |
| Running Update Time | 157       |
-----------------------------------
--2024-08-11 15:28:45.956572 UTC---
| Itration            | 158       |
| PAGAR Loss          | 128       |
| Real Det Return     | 1.34e+03  |
| Real Sto Return     | 1.27e+03  |
| Reward Loss         | -6.23e+06 |
| Running Env Steps   | 790000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 11.4      |
| Running Update Time | 158       |
-----------------------------------
--2024-08-11 15:31:48.271696 UTC---
| Itration            | 159       |
| PAGAR Loss          | 1.53e+03  |
| Real Det Return     | 1.47e+03  |
| Real Sto Return     | 1.38e+03  |
| Reward Loss         | -5.84e+06 |
| Running Env Steps   | 795000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 24.6      |
| Running Update Time | 159       |
-----------------------------------
--2024-08-11 15:34:46.947368 UTC---
| Itration            | 160       |
| PAGAR Loss          | 1.42e+03  |
| Real Det Return     | 1.29e+03  |
| Real Sto Return     | 1.3e+03   |
| Reward Loss         | -6.13e+06 |
| Running Env Steps   | 800000    |
| Running Forward KL  | 18.7      |
| Running Reverse KL  | 39.2      |
| Running Update Time | 160       |
-----------------------------------
--2024-08-11 15:37:47.319122 UTC---
| Itration            | 161       |
| PAGAR Loss          | 529       |
| Real Det Return     | 1.42e+03  |
| Real Sto Return     | 1.26e+03  |
| Reward Loss         | -6.26e+06 |
| Running Env Steps   | 805000    |
| Running Forward KL  | 19.2      |
| Running Reverse KL  | 11.5      |
| Running Update Time | 161       |
-----------------------------------
--2024-08-11 15:40:49.871632 UTC---
| Itration            | 162       |
| PAGAR Loss          | -752      |
| Real Det Return     | 1.44e+03  |
| Real Sto Return     | 1.28e+03  |
| Reward Loss         | -6.23e+06 |
| Running Env Steps   | 810000    |
| Running Forward KL  | 18.7      |
| Running Reverse KL  | 11.4      |
| Running Update Time | 162       |
-----------------------------------
--2024-08-11 15:43:48.391650 UTC---
| Itration            | 163       |
| PAGAR Loss          | 37        |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.3e+03   |
| Reward Loss         | -6.23e+06 |
| Running Env Steps   | 815000    |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 68.1      |
| Running Update Time | 163       |
-----------------------------------
--2024-08-11 15:46:43.327927 UTC---
| Itration            | 164       |
| PAGAR Loss          | -2.95e+03 |
| Real Det Return     | 1.32e+03  |
| Real Sto Return     | 1.08e+03  |
| Reward Loss         | -5.96e+06 |
| Running Env Steps   | 820000    |
| Running Forward KL  | 18        |
| Running Reverse KL  | 178       |
| Running Update Time | 164       |
-----------------------------------
--2024-08-11 15:49:43.779965 UTC---
| Itration            | 165       |
| PAGAR Loss          | -1.57e+03 |
| Real Det Return     | 1.44e+03  |
| Real Sto Return     | 1.35e+03  |
| Reward Loss         | -6.13e+06 |
| Running Env Steps   | 825000    |
| Running Forward KL  | 17.9      |
| Running Reverse KL  | 30.3      |
| Running Update Time | 165       |
-----------------------------------
--2024-08-11 15:52:44.719076 UTC---
| Itration            | 166       |
| PAGAR Loss          | -703      |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.41e+03  |
| Reward Loss         | -6.03e+06 |
| Running Env Steps   | 830000    |
| Running Forward KL  | 18.1      |
| Running Reverse KL  | 53.1      |
| Running Update Time | 166       |
-----------------------------------
--2024-08-11 15:55:39.072465 UTC---
| Itration            | 167       |
| PAGAR Loss          | 803       |
| Real Det Return     | 1.43e+03  |
| Real Sto Return     | 1.38e+03  |
| Reward Loss         | -5.46e+06 |
| Running Env Steps   | 835000    |
| Running Forward KL  | 17.3      |
| Running Reverse KL  | 108       |
| Running Update Time | 167       |
-----------------------------------
--2024-08-11 15:58:41.184106 UTC---
| Itration            | 168       |
| PAGAR Loss          | -1.95e+03 |
| Real Det Return     | 1.41e+03  |
| Real Sto Return     | 1.45e+03  |
| Reward Loss         | -6.05e+06 |
| Running Env Steps   | 840000    |
| Running Forward KL  | 17.9      |
| Running Reverse KL  | 22.8      |
| Running Update Time | 168       |
-----------------------------------
--2024-08-11 16:01:39.952191 UTC---
| Itration            | 169       |
| PAGAR Loss          | 901       |
| Real Det Return     | 1.42e+03  |
| Real Sto Return     | 1.55e+03  |
| Reward Loss         | -6.29e+06 |
| Running Env Steps   | 845000    |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 16.5      |
| Running Update Time | 169       |
-----------------------------------
--2024-08-11 16:04:29.584628 UTC---
| Itration            | 170       |
| PAGAR Loss          | -5.75e+03 |
| Real Det Return     | 1.73e+03  |
| Real Sto Return     | 917       |
| Reward Loss         | -5.1e+06  |
| Running Env Steps   | 850000    |
| Running Forward KL  | 17.6      |
| Running Reverse KL  | 210       |
| Running Update Time | 170       |
-----------------------------------
--2024-08-11 16:07:22.108729 UTC---
| Itration            | 171       |
| PAGAR Loss          | -3.57e+03 |
| Real Det Return     | 1.58e+03  |
| Real Sto Return     | 1.24e+03  |
| Reward Loss         | -4.85e+06 |
| Running Env Steps   | 855000    |
| Running Forward KL  | 17.4      |
| Running Reverse KL  | 195       |
| Running Update Time | 171       |
-----------------------------------
--2024-08-11 16:10:20.216142 UTC---
| Itration            | 172       |
| PAGAR Loss          | -6.23e+03 |
| Real Det Return     | 1.95e+03  |
| Real Sto Return     | 1.6e+03   |
| Reward Loss         | -5.57e+06 |
| Running Env Steps   | 860000    |
| Running Forward KL  | 16.7      |
| Running Reverse KL  | 49.9      |
| Running Update Time | 172       |
-----------------------------------
--2024-08-11 16:13:12.935220 UTC---
| Itration            | 173       |
| PAGAR Loss          | -4.04e+03 |
| Real Det Return     | 2.06e+03  |
| Real Sto Return     | 1.3e+03   |
| Reward Loss         | -5.31e+06 |
| Running Env Steps   | 865000    |
| Running Forward KL  | 17        |
| Running Reverse KL  | 129       |
| Running Update Time | 173       |
-----------------------------------
--2024-08-11 16:16:17.221151 UTC---
| Itration            | 174       |
| PAGAR Loss          | -5.84e+03 |
| Real Det Return     | 2.16e+03  |
| Real Sto Return     | 1.95e+03  |
| Reward Loss         | -5.15e+06 |
| Running Env Steps   | 870000    |
| Running Forward KL  | 16.4      |
| Running Reverse KL  | 95.5      |
| Running Update Time | 174       |
-----------------------------------
--2024-08-11 16:19:16.319630 UTC---
| Itration            | 175       |
| PAGAR Loss          | -9.92e+03 |
| Real Det Return     | 2.12e+03  |
| Real Sto Return     | 1.7e+03   |
| Reward Loss         | -5.1e+06  |
| Running Env Steps   | 875000    |
| Running Forward KL  | 16.2      |
| Running Reverse KL  | 65.3      |
| Running Update Time | 175       |
-----------------------------------
--2024-08-11 16:22:18.632931 UTC---
| Itration            | 176       |
| PAGAR Loss          | 9.09e+03  |
| Real Det Return     | 2.82e+03  |
| Real Sto Return     | 2.1e+03   |
| Reward Loss         | -4.18e+06 |
| Running Env Steps   | 880000    |
| Running Forward KL  | 16        |
| Running Reverse KL  | 108       |
| Running Update Time | 176       |
-----------------------------------
--2024-08-11 16:25:21.224281 UTC---
| Itration            | 177       |
| PAGAR Loss          | 1.41e+04  |
| Real Det Return     | 2.76e+03  |
| Real Sto Return     | 2.38e+03  |
| Reward Loss         | -4.86e+06 |
| Running Env Steps   | 885000    |
| Running Forward KL  | 15.7      |
| Running Reverse KL  | 77.4      |
| Running Update Time | 177       |
-----------------------------------
--2024-08-11 16:28:23.562377 UTC---
| Itration            | 178       |
| PAGAR Loss          | -3.68e+03 |
| Real Det Return     | 2.65e+03  |
| Real Sto Return     | 2.26e+03  |
| Reward Loss         | -5.11e+06 |
| Running Env Steps   | 890000    |
| Running Forward KL  | 16.2      |
| Running Reverse KL  | 71.1      |
| Running Update Time | 178       |
-----------------------------------
--2024-08-11 16:31:26.120652 UTC---
| Itration            | 179       |
| PAGAR Loss          | 830       |
| Real Det Return     | 2.7e+03   |
| Real Sto Return     | 2.61e+03  |
| Reward Loss         | -4.96e+06 |
| Running Env Steps   | 895000    |
| Running Forward KL  | 15.9      |
| Running Reverse KL  | 9.74      |
| Running Update Time | 179       |
-----------------------------------
--2024-08-11 16:34:27.072539 UTC---
| Itration            | 180       |
| PAGAR Loss          | -7.1e+03  |
| Real Det Return     | 2.94e+03  |
| Real Sto Return     | 2.79e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 900000    |
| Running Forward KL  | 14.8      |
| Running Reverse KL  | 8.67      |
| Running Update Time | 180       |
-----------------------------------
--2024-08-11 16:37:32.399407 UTC---
| Itration            | 181       |
| PAGAR Loss          | -1.16e+03 |
| Real Det Return     | 2.78e+03  |
| Real Sto Return     | 2.76e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 905000    |
| Running Forward KL  | 15.2      |
| Running Reverse KL  | 8.86      |
| Running Update Time | 181       |
-----------------------------------
--2024-08-11 16:40:35.314200 UTC---
| Itration            | 182       |
| PAGAR Loss          | -1.51e+03 |
| Real Det Return     | 2.83e+03  |
| Real Sto Return     | 2.7e+03   |
| Reward Loss         | -4.68e+06 |
| Running Env Steps   | 910000    |
| Running Forward KL  | 15.5      |
| Running Reverse KL  | 9.51      |
| Running Update Time | 182       |
-----------------------------------
--2024-08-11 16:43:41.058927 UTC--
| Itration            | 183      |
| PAGAR Loss          | -121     |
| Real Det Return     | 2.83e+03 |
| Real Sto Return     | 2.73e+03 |
| Reward Loss         | -4.5e+06 |
| Running Env Steps   | 915000   |
| Running Forward KL  | 15.3     |
| Running Reverse KL  | 9.15     |
| Running Update Time | 183      |
----------------------------------
--2024-08-11 16:46:41.491975 UTC---
| Itration            | 184       |
| PAGAR Loss          | -2.21e+03 |
| Real Det Return     | 2.58e+03  |
| Real Sto Return     | 2.02e+03  |
| Reward Loss         | -4.75e+06 |
| Running Env Steps   | 920000    |
| Running Forward KL  | 16.1      |
| Running Reverse KL  | 113       |
| Running Update Time | 184       |
-----------------------------------
--2024-08-11 16:49:38.970171 UTC---
| Itration            | 185       |
| PAGAR Loss          | -6.23e+03 |
| Real Det Return     | 3.03e+03  |
| Real Sto Return     | 2.03e+03  |
| Reward Loss         | -3.66e+06 |
| Running Env Steps   | 925000    |
| Running Forward KL  | 15.2      |
| Running Reverse KL  | 66        |
| Running Update Time | 185       |
-----------------------------------
--2024-08-11 16:52:42.514876 UTC---
| Itration            | 186       |
| PAGAR Loss          | -5.78e+03 |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 2.99e+03  |
| Reward Loss         | -4.25e+06 |
| Running Env Steps   | 930000    |
| Running Forward KL  | 14.9      |
| Running Reverse KL  | 8.58      |
| Running Update Time | 186       |
-----------------------------------
--2024-08-11 16:55:44.219145 UTC---
| Itration            | 187       |
| PAGAR Loss          | -1.52e+04 |
| Real Det Return     | 3.34e+03  |
| Real Sto Return     | 3.13e+03  |
| Reward Loss         | -3.65e+06 |
| Running Env Steps   | 935000    |
| Running Forward KL  | 14.9      |
| Running Reverse KL  | 9.51      |
| Running Update Time | 187       |
-----------------------------------
--2024-08-11 16:58:46.073579 UTC---
| Itration            | 188       |
| PAGAR Loss          | 5.56e+03  |
| Real Det Return     | 3e+03     |
| Real Sto Return     | 2.86e+03  |
| Reward Loss         | -4.16e+06 |
| Running Env Steps   | 940000    |
| Running Forward KL  | 15.1      |
| Running Reverse KL  | 9.67      |
| Running Update Time | 188       |
-----------------------------------
--2024-08-11 17:01:48.988305 UTC---
| Itration            | 189       |
| PAGAR Loss          | -3.16e+03 |
| Real Det Return     | 3.13e+03  |
| Real Sto Return     | 2.9e+03   |
| Reward Loss         | -4.16e+06 |
| Running Env Steps   | 945000    |
| Running Forward KL  | 15.3      |
| Running Reverse KL  | 9.3       |
| Running Update Time | 189       |
-----------------------------------
--2024-08-11 17:04:50.582635 UTC---
| Itration            | 190       |
| PAGAR Loss          | -1.28e+03 |
| Real Det Return     | 3.02e+03  |
| Real Sto Return     | 2.85e+03  |
| Reward Loss         | -4.41e+06 |
| Running Env Steps   | 950000    |
| Running Forward KL  | 15.1      |
| Running Reverse KL  | 9.06      |
| Running Update Time | 190       |
-----------------------------------
--2024-08-11 17:07:51.732822 UTC---
| Itration            | 191       |
| PAGAR Loss          | -1.63e+04 |
| Real Det Return     | 2.79e+03  |
| Real Sto Return     | 2.86e+03  |
| Reward Loss         | -4.14e+06 |
| Running Env Steps   | 955000    |
| Running Forward KL  | 15.2      |
| Running Reverse KL  | 8.68      |
| Running Update Time | 191       |
-----------------------------------
--2024-08-11 17:10:54.710795 UTC---
| Itration            | 192       |
| PAGAR Loss          | 52.3      |
| Real Det Return     | 2.85e+03  |
| Real Sto Return     | 2.97e+03  |
| Reward Loss         | -4.13e+06 |
| Running Env Steps   | 960000    |
| Running Forward KL  | 14.8      |
| Running Reverse KL  | 9.24      |
| Running Update Time | 192       |
-----------------------------------
--2024-08-11 17:13:55.697710 UTC---
| Itration            | 193       |
| PAGAR Loss          | 3.94e+03  |
| Real Det Return     | 2.87e+03  |
| Real Sto Return     | 2.92e+03  |
| Reward Loss         | -4.39e+06 |
| Running Env Steps   | 965000    |
| Running Forward KL  | 15.3      |
| Running Reverse KL  | 9.49      |
| Running Update Time | 193       |
-----------------------------------
--2024-08-11 17:16:43.295735 UTC---
| Itration            | 194       |
| PAGAR Loss          | 379       |
| Real Det Return     | 3.18e+03  |
| Real Sto Return     | 2.87e+03  |
| Reward Loss         | -4.28e+06 |
| Running Env Steps   | 970000    |
| Running Forward KL  | 15.2      |
| Running Reverse KL  | 8.71      |
| Running Update Time | 194       |
-----------------------------------
--2024-08-11 17:19:10.211021 UTC---
| Itration            | 195       |
| PAGAR Loss          | -3.62e+03 |
| Real Det Return     | 3.23e+03  |
| Real Sto Return     | 3.06e+03  |
| Reward Loss         | -3.78e+06 |
| Running Env Steps   | 975000    |
| Running Forward KL  | 14        |
| Running Reverse KL  | 8.69      |
| Running Update Time | 195       |
-----------------------------------
--2024-08-11 17:21:36.478286 UTC--
| Itration            | 196      |
| PAGAR Loss          | -7.8e+03 |
| Real Det Return     | 3.01e+03 |
| Real Sto Return     | 2.91e+03 |
| Reward Loss         | -4.4e+06 |
| Running Env Steps   | 980000   |
| Running Forward KL  | 15.5     |
| Running Reverse KL  | 9.22     |
| Running Update Time | 196      |
----------------------------------
--2024-08-11 17:24:04.615360 UTC---
| Itration            | 197       |
| PAGAR Loss          | -1.47e+03 |
| Real Det Return     | 3e+03     |
| Real Sto Return     | 3e+03     |
| Reward Loss         | -4.4e+06  |
| Running Env Steps   | 985000    |
| Running Forward KL  | 14.6      |
| Running Reverse KL  | 8.14      |
| Running Update Time | 197       |
-----------------------------------
--2024-08-11 17:26:32.313649 UTC---
| Itration            | 198       |
| PAGAR Loss          | -4.84e+03 |
| Real Det Return     | 3.55e+03  |
| Real Sto Return     | 3.21e+03  |
| Reward Loss         | -3.55e+06 |
| Running Env Steps   | 990000    |
| Running Forward KL  | 14.7      |
| Running Reverse KL  | 9.45      |
| Running Update Time | 198       |
-----------------------------------
--2024-08-11 17:29:01.995483 UTC---
| Itration            | 199       |
| PAGAR Loss          | 5.48e+03  |
| Real Det Return     | 3.17e+03  |
| Real Sto Return     | 3.16e+03  |
| Reward Loss         | -3.94e+06 |
| Running Env Steps   | 995000    |
| Running Forward KL  | 14.6      |
| Running Reverse KL  | 8.9       |
| Running Update Time | 199       |
-----------------------------------
--2024-08-11 17:31:27.222026 UTC---
| Itration            | 200       |
| PAGAR Loss          | -2.46e+04 |
| Real Det Return     | 3.31e+03  |
| Real Sto Return     | 2.71e+03  |
| Reward Loss         | -5.23e+06 |
| Running Env Steps   | 1000000   |
| Running Forward KL  | 14.6      |
| Running Reverse KL  | 106       |
| Running Update Time | 200       |
-----------------------------------
--2024-08-11 17:33:55.464347 UTC---
| Itration            | 201       |
| PAGAR Loss          | -634      |
| Real Det Return     | 3.2e+03   |
| Real Sto Return     | 3.2e+03   |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 1005000   |
| Running Forward KL  | 14.8      |
| Running Reverse KL  | 9.48      |
| Running Update Time | 201       |
-----------------------------------
--2024-08-11 17:36:21.238437 UTC---
| Itration            | 202       |
| PAGAR Loss          | -1.71e+03 |
| Real Det Return     | 3.34e+03  |
| Real Sto Return     | 3.22e+03  |
| Reward Loss         | -4.03e+06 |
| Running Env Steps   | 1010000   |
| Running Forward KL  | 14.3      |
| Running Reverse KL  | 8.21      |
| Running Update Time | 202       |
-----------------------------------
--2024-08-11 17:38:48.536501 UTC---
| Itration            | 203       |
| PAGAR Loss          | 475       |
| Real Det Return     | 3.38e+03  |
| Real Sto Return     | 3.36e+03  |
| Reward Loss         | -3.81e+06 |
| Running Env Steps   | 1015000   |
| Running Forward KL  | 13.8      |
| Running Reverse KL  | 8.04      |
| Running Update Time | 203       |
-----------------------------------
--2024-08-11 17:41:16.493184 UTC---
| Itration            | 204       |
| PAGAR Loss          | -910      |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 3.35e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 1020000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 8.11      |
| Running Update Time | 204       |
-----------------------------------
--2024-08-11 17:43:41.737023 UTC---
| Itration            | 205       |
| PAGAR Loss          | -5.87e+04 |
| Real Det Return     | 3.48e+03  |
| Real Sto Return     | 2.88e+03  |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 1025000   |
| Running Forward KL  | 14.1      |
| Running Reverse KL  | 55.8      |
| Running Update Time | 205       |
-----------------------------------
--2024-08-11 17:46:10.140540 UTC---
| Itration            | 206       |
| PAGAR Loss          | -7.61e+03 |
| Real Det Return     | 3.48e+03  |
| Real Sto Return     | 3.37e+03  |
| Reward Loss         | -3.72e+06 |
| Running Env Steps   | 1030000   |
| Running Forward KL  | 14.3      |
| Running Reverse KL  | 9.15      |
| Running Update Time | 206       |
-----------------------------------
--2024-08-11 17:48:35.433999 UTC---
| Itration            | 207       |
| PAGAR Loss          | -1.55e+04 |
| Real Det Return     | 3.47e+03  |
| Real Sto Return     | 3.1e+03   |
| Reward Loss         | -4.09e+06 |
| Running Env Steps   | 1035000   |
| Running Forward KL  | 14.3      |
| Running Reverse KL  | 44        |
| Running Update Time | 207       |
-----------------------------------
--2024-08-11 17:51:03.434038 UTC---
| Itration            | 208       |
| PAGAR Loss          | -2.5e+03  |
| Real Det Return     | 3.25e+03  |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -3.92e+06 |
| Running Env Steps   | 1040000   |
| Running Forward KL  | 14        |
| Running Reverse KL  | 8.38      |
| Running Update Time | 208       |
-----------------------------------
--2024-08-11 17:53:28.015297 UTC---
| Itration            | 209       |
| PAGAR Loss          | -1.85e+03 |
| Real Det Return     | 3.6e+03   |
| Real Sto Return     | 3.39e+03  |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1045000   |
| Running Forward KL  | 14.6      |
| Running Reverse KL  | 9.31      |
| Running Update Time | 209       |
-----------------------------------
--2024-08-11 17:55:55.645827 UTC---
| Itration            | 210       |
| PAGAR Loss          | -852      |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 3.2e+03   |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1050000   |
| Running Forward KL  | 14        |
| Running Reverse KL  | 8.45      |
| Running Update Time | 210       |
-----------------------------------
--2024-08-11 17:58:21.805086 UTC---
| Itration            | 211       |
| PAGAR Loss          | 1.1e+03   |
| Real Det Return     | 3.13e+03  |
| Real Sto Return     | 3.14e+03  |
| Reward Loss         | -4.34e+06 |
| Running Env Steps   | 1055000   |
| Running Forward KL  | 14.1      |
| Running Reverse KL  | 8.61      |
| Running Update Time | 211       |
-----------------------------------
--2024-08-11 18:00:50.657021 UTC---
| Itration            | 212       |
| PAGAR Loss          | 4.1e+03   |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 3.24e+03  |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 1060000   |
| Running Forward KL  | 13.9      |
| Running Reverse KL  | 8.76      |
| Running Update Time | 212       |
-----------------------------------
--2024-08-11 18:03:18.441382 UTC---
| Itration            | 213       |
| PAGAR Loss          | 61.9      |
| Real Det Return     | 3.32e+03  |
| Real Sto Return     | 3.22e+03  |
| Reward Loss         | -4.16e+06 |
| Running Env Steps   | 1065000   |
| Running Forward KL  | 14.2      |
| Running Reverse KL  | 8.4       |
| Running Update Time | 213       |
-----------------------------------
--2024-08-11 18:05:44.803552 UTC---
| Itration            | 214       |
| PAGAR Loss          | 2.41e+03  |
| Real Det Return     | 3.22e+03  |
| Real Sto Return     | 3.16e+03  |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 1070000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.81      |
| Running Update Time | 214       |
-----------------------------------
--2024-08-11 18:08:12.757915 UTC---
| Itration            | 215       |
| PAGAR Loss          | 5.4e+03   |
| Real Det Return     | 3.36e+03  |
| Real Sto Return     | 3.19e+03  |
| Reward Loss         | -4.13e+06 |
| Running Env Steps   | 1075000   |
| Running Forward KL  | 13.4      |
| Running Reverse KL  | 23.5      |
| Running Update Time | 215       |
-----------------------------------
--2024-08-11 18:10:37.954576 UTC---
| Itration            | 216       |
| PAGAR Loss          | 5.82e+03  |
| Real Det Return     | 3.11e+03  |
| Real Sto Return     | 3.14e+03  |
| Reward Loss         | -4.14e+06 |
| Running Env Steps   | 1080000   |
| Running Forward KL  | 14.3      |
| Running Reverse KL  | 8.88      |
| Running Update Time | 216       |
-----------------------------------
--2024-08-11 18:13:07.090152 UTC---
| Itration            | 217       |
| PAGAR Loss          | 7.3e+03   |
| Real Det Return     | 3.39e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -4.19e+06 |
| Running Env Steps   | 1085000   |
| Running Forward KL  | 14.1      |
| Running Reverse KL  | 9.67      |
| Running Update Time | 217       |
-----------------------------------
--2024-08-11 18:15:33.901565 UTC---
| Itration            | 218       |
| PAGAR Loss          | -4.37e+03 |
| Real Det Return     | 3.16e+03  |
| Real Sto Return     | 3.15e+03  |
| Reward Loss         | -4.42e+06 |
| Running Env Steps   | 1090000   |
| Running Forward KL  | 14.5      |
| Running Reverse KL  | 8.51      |
| Running Update Time | 218       |
-----------------------------------
--2024-08-11 18:18:00.206404 UTC---
| Itration            | 219       |
| PAGAR Loss          | -1.04e+04 |
| Real Det Return     | 3.23e+03  |
| Real Sto Return     | 3.36e+03  |
| Reward Loss         | -3.84e+06 |
| Running Env Steps   | 1095000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 8.41      |
| Running Update Time | 219       |
-----------------------------------
--2024-08-11 18:20:27.141807 UTC---
| Itration            | 220       |
| PAGAR Loss          | 7.02e+03  |
| Real Det Return     | 3.35e+03  |
| Real Sto Return     | 3.09e+03  |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 1100000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.31      |
| Running Update Time | 220       |
-----------------------------------
--2024-08-11 18:22:51.387516 UTC---
| Itration            | 221       |
| PAGAR Loss          | -103      |
| Real Det Return     | 3.34e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1105000   |
| Running Forward KL  | 14        |
| Running Reverse KL  | 9.76      |
| Running Update Time | 221       |
-----------------------------------
--2024-08-11 18:25:21.060296 UTC---
| Itration            | 222       |
| PAGAR Loss          | -371      |
| Real Det Return     | 3.45e+03  |
| Real Sto Return     | 3.42e+03  |
| Reward Loss         | -3.97e+06 |
| Running Env Steps   | 1110000   |
| Running Forward KL  | 13.4      |
| Running Reverse KL  | 8.61      |
| Running Update Time | 222       |
-----------------------------------
--2024-08-11 18:27:45.796759 UTC---
| Itration            | 223       |
| PAGAR Loss          | -1.66e+04 |
| Real Det Return     | 3.4e+03   |
| Real Sto Return     | 3e+03     |
| Reward Loss         | -4.26e+06 |
| Running Env Steps   | 1115000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 223       |
-----------------------------------
--2024-08-11 18:30:14.417359 UTC---
| Itration            | 224       |
| PAGAR Loss          | -1.12e+04 |
| Real Det Return     | 3.53e+03  |
| Real Sto Return     | 3.32e+03  |
| Reward Loss         | -3.82e+06 |
| Running Env Steps   | 1120000   |
| Running Forward KL  | 13.4      |
| Running Reverse KL  | 8         |
| Running Update Time | 224       |
-----------------------------------
--2024-08-11 18:32:39.251373 UTC---
| Itration            | 225       |
| PAGAR Loss          | 479       |
| Real Det Return     | 3.53e+03  |
| Real Sto Return     | 3.31e+03  |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 1125000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.21      |
| Running Update Time | 225       |
-----------------------------------
--2024-08-11 18:35:06.759268 UTC---
| Itration            | 226       |
| PAGAR Loss          | -5.45e+03 |
| Real Det Return     | 3.57e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -3.81e+06 |
| Running Env Steps   | 1130000   |
| Running Forward KL  | 13.2      |
| Running Reverse KL  | 8.21      |
| Running Update Time | 226       |
-----------------------------------
--2024-08-11 18:37:32.846399 UTC---
| Itration            | 227       |
| PAGAR Loss          | -3.55e+04 |
| Real Det Return     | 3.27e+03  |
| Real Sto Return     | 3.2e+03   |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 1135000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.42      |
| Running Update Time | 227       |
-----------------------------------
--2024-08-11 18:39:59.570294 UTC---
| Itration            | 228       |
| PAGAR Loss          | -1.46e+04 |
| Real Det Return     | 3.22e+03  |
| Real Sto Return     | 3.12e+03  |
| Reward Loss         | -4.51e+06 |
| Running Env Steps   | 1140000   |
| Running Forward KL  | 13.8      |
| Running Reverse KL  | 9         |
| Running Update Time | 228       |
-----------------------------------
--2024-08-11 18:42:28.412260 UTC---
| Itration            | 229       |
| PAGAR Loss          | -739      |
| Real Det Return     | 3.44e+03  |
| Real Sto Return     | 3.28e+03  |
| Reward Loss         | -4.29e+06 |
| Running Env Steps   | 1145000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 29.9      |
| Running Update Time | 229       |
-----------------------------------
--2024-08-11 18:44:54.624243 UTC---
| Itration            | 230       |
| PAGAR Loss          | -8.13e+03 |
| Real Det Return     | 3.57e+03  |
| Real Sto Return     | 3.31e+03  |
| Reward Loss         | -3.81e+06 |
| Running Env Steps   | 1150000   |
| Running Forward KL  | 13.1      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 230       |
-----------------------------------
--2024-08-11 18:47:23.300076 UTC--
| Itration            | 231      |
| PAGAR Loss          | 4.07e+03 |
| Real Det Return     | 3.48e+03 |
| Real Sto Return     | 3.42e+03 |
| Reward Loss         | -3.8e+06 |
| Running Env Steps   | 1155000  |
| Running Forward KL  | 13.1     |
| Running Reverse KL  | 7.31     |
| Running Update Time | 231      |
----------------------------------
--2024-08-11 18:49:49.136247 UTC---
| Itration            | 232       |
| PAGAR Loss          | 4.15e+03  |
| Real Det Return     | 3.59e+03  |
| Real Sto Return     | 3.24e+03  |
| Reward Loss         | -4.01e+06 |
| Running Env Steps   | 1160000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 7.24      |
| Running Update Time | 232       |
-----------------------------------
--2024-08-11 18:52:17.881302 UTC---
| Itration            | 233       |
| PAGAR Loss          | 1.76e+04  |
| Real Det Return     | 3.41e+03  |
| Real Sto Return     | 3.27e+03  |
| Reward Loss         | -4.43e+06 |
| Running Env Steps   | 1165000   |
| Running Forward KL  | 13.7      |
| Running Reverse KL  | 8.36      |
| Running Update Time | 233       |
-----------------------------------
--2024-08-11 18:54:44.701746 UTC---
| Itration            | 234       |
| PAGAR Loss          | -984      |
| Real Det Return     | 3.55e+03  |
| Real Sto Return     | 3.54e+03  |
| Reward Loss         | -3.52e+06 |
| Running Env Steps   | 1170000   |
| Running Forward KL  | 13.6      |
| Running Reverse KL  | 8.2       |
| Running Update Time | 234       |
-----------------------------------
--2024-08-11 18:57:12.417495 UTC---
| Itration            | 235       |
| PAGAR Loss          | -7.69e+03 |
| Real Det Return     | 3.59e+03  |
| Real Sto Return     | 3.47e+03  |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 1175000   |
| Running Forward KL  | 13.3      |
| Running Reverse KL  | 8.25      |
| Running Update Time | 235       |
-----------------------------------
--2024-08-11 18:59:40.733226 UTC--
| Itration            | 236      |
| PAGAR Loss          | 1.77e+04 |
| Real Det Return     | 3.39e+03 |
| Real Sto Return     | 3.35e+03 |
| Reward Loss         | -4.4e+06 |
| Running Env Steps   | 1180000  |
| Running Forward KL  | 13.7     |
| Running Reverse KL  | 9.02     |
| Running Update Time | 236      |
----------------------------------
--2024-08-11 19:02:07.963580 UTC---
| Itration            | 237       |
| PAGAR Loss          | -3.66e+03 |
| Real Det Return     | 3.62e+03  |
| Real Sto Return     | 3.61e+03  |
| Reward Loss         | -3.7e+06  |
| Running Env Steps   | 1185000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 6.81      |
| Running Update Time | 237       |
-----------------------------------
--2024-08-11 19:04:35.394554 UTC---
| Itration            | 238       |
| PAGAR Loss          | 8.47      |
| Real Det Return     | 3.61e+03  |
| Real Sto Return     | 3.49e+03  |
| Reward Loss         | -4.15e+06 |
| Running Env Steps   | 1190000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 7.89      |
| Running Update Time | 238       |
-----------------------------------
--2024-08-11 19:07:02.750649 UTC---
| Itration            | 239       |
| PAGAR Loss          | 9.78e+03  |
| Real Det Return     | 3.5e+03   |
| Real Sto Return     | 3.42e+03  |
| Reward Loss         | -3.72e+06 |
| Running Env Steps   | 1195000   |
| Running Forward KL  | 13        |
| Running Reverse KL  | 30.9      |
| Running Update Time | 239       |
-----------------------------------
--2024-08-11 19:09:30.101861 UTC--
| Itration            | 240      |
| PAGAR Loss          | 225      |
| Real Det Return     | 3.66e+03 |
| Real Sto Return     | 3.45e+03 |
| Reward Loss         | -4.3e+06 |
| Running Env Steps   | 1200000  |
| Running Forward KL  | 13.1     |
| Running Reverse KL  | 7.27     |
| Running Update Time | 240      |
----------------------------------
--2024-08-11 19:11:57.589723 UTC---
| Itration            | 241       |
| PAGAR Loss          | -3.13e+03 |
| Real Det Return     | 3.58e+03  |
| Real Sto Return     | 3.25e+03  |
| Reward Loss         | -4.39e+06 |
| Running Env Steps   | 1205000   |
| Running Forward KL  | 13.1      |
| Running Reverse KL  | 7.62      |
| Running Update Time | 241       |
-----------------------------------
--2024-08-11 19:14:22.764702 UTC--
| Itration            | 242      |
| PAGAR Loss          | 4.72e+03 |
| Real Det Return     | 3.52e+03 |
| Real Sto Return     | 3.4e+03  |
| Reward Loss         | -4.2e+06 |
| Running Env Steps   | 1210000  |
| Running Forward KL  | 13       |
| Running Reverse KL  | 7.47     |
| Running Update Time | 242      |
----------------------------------
--2024-08-11 19:16:51.378628 UTC---
| Itration            | 243       |
| PAGAR Loss          | 2.04e+04  |
| Real Det Return     | 3.69e+03  |
| Real Sto Return     | 3.51e+03  |
| Reward Loss         | -4.09e+06 |
| Running Env Steps   | 1215000   |
| Running Forward KL  | 13.1      |
| Running Reverse KL  | 7.19      |
| Running Update Time | 243       |
-----------------------------------
--2024-08-11 19:19:15.126618 UTC---
| Itration            | 244       |
| PAGAR Loss          | -1.09e+04 |
| Real Det Return     | 3.91e+03  |
| Real Sto Return     | 3.57e+03  |
| Reward Loss         | -3.87e+06 |
| Running Env Steps   | 1220000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 7.4       |
| Running Update Time | 244       |
-----------------------------------
--2024-08-11 19:21:43.471767 UTC---
| Itration            | 245       |
| PAGAR Loss          | 1.34e+04  |
| Real Det Return     | 3.69e+03  |
| Real Sto Return     | 3.53e+03  |
| Reward Loss         | -4.16e+06 |
| Running Env Steps   | 1225000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 7.2       |
| Running Update Time | 245       |
-----------------------------------
--2024-08-11 19:24:10.615530 UTC---
| Itration            | 246       |
| PAGAR Loss          | -2.73e+03 |
| Real Det Return     | 3.51e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -4.43e+06 |
| Running Env Steps   | 1230000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 6.84      |
| Running Update Time | 246       |
-----------------------------------
--2024-08-11 19:26:39.275325 UTC---
| Itration            | 247       |
| PAGAR Loss          | -2.77e+04 |
| Real Det Return     | 3.93e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -3.91e+06 |
| Running Env Steps   | 1235000   |
| Running Forward KL  | 12.5      |
| Running Reverse KL  | 7.65      |
| Running Update Time | 247       |
-----------------------------------
--2024-08-11 19:29:07.009438 UTC---
| Itration            | 248       |
| PAGAR Loss          | -5.99e+03 |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 3.68e+03  |
| Reward Loss         | -4e+06    |
| Running Env Steps   | 1240000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 7.78      |
| Running Update Time | 248       |
-----------------------------------
--2024-08-11 19:31:32.852808 UTC---
| Itration            | 249       |
| PAGAR Loss          | -27.1     |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 3.72e+03  |
| Reward Loss         | -3.56e+06 |
| Running Env Steps   | 1245000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 7.48      |
| Running Update Time | 249       |
-----------------------------------
--2024-08-11 19:34:02.016035 UTC---
| Itration            | 250       |
| PAGAR Loss          | 4.6e+03   |
| Real Det Return     | 3.84e+03  |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -3.49e+06 |
| Running Env Steps   | 1250000   |
| Running Forward KL  | 12        |
| Running Reverse KL  | 7.19      |
| Running Update Time | 250       |
-----------------------------------
--2024-08-11 19:36:30.182339 UTC---
| Itration            | 251       |
| PAGAR Loss          | -8.89e+03 |
| Real Det Return     | 3.54e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -4e+06    |
| Running Env Steps   | 1255000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 7.3       |
| Running Update Time | 251       |
-----------------------------------
--2024-08-11 19:38:55.714552 UTC---
| Itration            | 252       |
| PAGAR Loss          | 4.25e+03  |
| Real Det Return     | 3.45e+03  |
| Real Sto Return     | 3.41e+03  |
| Reward Loss         | -4.57e+06 |
| Running Env Steps   | 1260000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 8.11      |
| Running Update Time | 252       |
-----------------------------------
--2024-08-11 19:41:23.673803 UTC---
| Itration            | 253       |
| PAGAR Loss          | -5.48e+03 |
| Real Det Return     | 3.61e+03  |
| Real Sto Return     | 3.6e+03   |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1265000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 7.66      |
| Running Update Time | 253       |
-----------------------------------
--2024-08-11 19:43:49.572296 UTC--
| Itration            | 254      |
| PAGAR Loss          | 3.87e+03 |
| Real Det Return     | 3.65e+03 |
| Real Sto Return     | 3.72e+03 |
| Reward Loss         | -3.8e+06 |
| Running Env Steps   | 1270000  |
| Running Forward KL  | 12.2     |
| Running Reverse KL  | 7.73     |
| Running Update Time | 254      |
----------------------------------
--2024-08-11 19:46:18.096996 UTC---
| Itration            | 255       |
| PAGAR Loss          | -3.83e+03 |
| Real Det Return     | 3.6e+03   |
| Real Sto Return     | 3.48e+03  |
| Reward Loss         | -4.52e+06 |
| Running Env Steps   | 1275000   |
| Running Forward KL  | 13.2      |
| Running Reverse KL  | 7.99      |
| Running Update Time | 255       |
-----------------------------------
--2024-08-11 19:48:44.281734 UTC---
| Itration            | 256       |
| PAGAR Loss          | -1.62e+04 |
| Real Det Return     | 3.69e+03  |
| Real Sto Return     | 3.66e+03  |
| Reward Loss         | -4.44e+06 |
| Running Env Steps   | 1280000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 6.96      |
| Running Update Time | 256       |
-----------------------------------
--2024-08-11 19:51:10.438726 UTC---
| Itration            | 257       |
| PAGAR Loss          | 5.25e+03  |
| Real Det Return     | 3.81e+03  |
| Real Sto Return     | 3.48e+03  |
| Reward Loss         | -4.15e+06 |
| Running Env Steps   | 1285000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 10.9      |
| Running Update Time | 257       |
-----------------------------------
--2024-08-11 19:53:37.145371 UTC---
| Itration            | 258       |
| PAGAR Loss          | -1.4e+04  |
| Real Det Return     | 3.62e+03  |
| Real Sto Return     | 3.57e+03  |
| Reward Loss         | -4.49e+06 |
| Running Env Steps   | 1290000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 7.1       |
| Running Update Time | 258       |
-----------------------------------
--2024-08-11 19:56:00.208010 UTC---
| Itration            | 259       |
| PAGAR Loss          | -4.47e+03 |
| Real Det Return     | 3.73e+03  |
| Real Sto Return     | 3.76e+03  |
| Reward Loss         | -4.01e+06 |
| Running Env Steps   | 1295000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 7.55      |
| Running Update Time | 259       |
-----------------------------------
--2024-08-11 19:58:28.500132 UTC---
| Itration            | 260       |
| PAGAR Loss          | -2.04e+04 |
| Real Det Return     | 3.72e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -4.04e+06 |
| Running Env Steps   | 1300000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 7.8       |
| Running Update Time | 260       |
-----------------------------------
--2024-08-11 20:00:52.492437 UTC---
| Itration            | 261       |
| PAGAR Loss          | -6.66e+03 |
| Real Det Return     | 3.98e+03  |
| Real Sto Return     | 3.52e+03  |
| Reward Loss         | -4.78e+06 |
| Running Env Steps   | 1305000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 29.9      |
| Running Update Time | 261       |
-----------------------------------
--2024-08-11 20:03:20.528857 UTC---
| Itration            | 262       |
| PAGAR Loss          | 4.95e+03  |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.56e+03  |
| Reward Loss         | -4.48e+06 |
| Running Env Steps   | 1310000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 7.8       |
| Running Update Time | 262       |
-----------------------------------
--2024-08-11 20:05:48.489984 UTC---
| Itration            | 263       |
| PAGAR Loss          | 6.03e+03  |
| Real Det Return     | 3.83e+03  |
| Real Sto Return     | 3.59e+03  |
| Reward Loss         | -4.61e+06 |
| Running Env Steps   | 1315000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 7.25      |
| Running Update Time | 263       |
-----------------------------------
--2024-08-11 20:08:13.995989 UTC---
| Itration            | 264       |
| PAGAR Loss          | 1.69e+04  |
| Real Det Return     | 3.52e+03  |
| Real Sto Return     | 3.69e+03  |
| Reward Loss         | -4.25e+06 |
| Running Env Steps   | 1320000   |
| Running Forward KL  | 12.8      |
| Running Reverse KL  | 7.72      |
| Running Update Time | 264       |
-----------------------------------
--2024-08-11 20:10:41.783228 UTC---
| Itration            | 265       |
| PAGAR Loss          | 2.21e+03  |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.51e+03  |
| Reward Loss         | -4.63e+06 |
| Running Env Steps   | 1325000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 46.1      |
| Running Update Time | 265       |
-----------------------------------
--2024-08-11 20:13:06.869369 UTC---
| Itration            | 266       |
| PAGAR Loss          | -2.69e+04 |
| Real Det Return     | 3.75e+03  |
| Real Sto Return     | 3.62e+03  |
| Reward Loss         | -4.82e+06 |
| Running Env Steps   | 1330000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 6.7       |
| Running Update Time | 266       |
-----------------------------------
--2024-08-11 20:15:35.512146 UTC---
| Itration            | 267       |
| PAGAR Loss          | -5.98e+03 |
| Real Det Return     | 3.62e+03  |
| Real Sto Return     | 3.75e+03  |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 1335000   |
| Running Forward KL  | 12.9      |
| Running Reverse KL  | 7.07      |
| Running Update Time | 267       |
-----------------------------------
--2024-08-11 20:18:03.016389 UTC---
| Itration            | 268       |
| PAGAR Loss          | -2.65e+03 |
| Real Det Return     | 3.89e+03  |
| Real Sto Return     | 3.74e+03  |
| Reward Loss         | -4.44e+06 |
| Running Env Steps   | 1340000   |
| Running Forward KL  | 12.5      |
| Running Reverse KL  | 6.93      |
| Running Update Time | 268       |
-----------------------------------
--2024-08-11 20:20:31.775393 UTC---
| Itration            | 269       |
| PAGAR Loss          | -514      |
| Real Det Return     | 3.76e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 1345000   |
| Running Forward KL  | 12.4      |
| Running Reverse KL  | 7.25      |
| Running Update Time | 269       |
-----------------------------------
--2024-08-11 20:23:00.270664 UTC---
| Itration            | 270       |
| PAGAR Loss          | 5e+03     |
| Real Det Return     | 3.54e+03  |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -4.36e+06 |
| Running Env Steps   | 1350000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 7.7       |
| Running Update Time | 270       |
-----------------------------------
--2024-08-11 20:25:27.541377 UTC---
| Itration            | 271       |
| PAGAR Loss          | -4.03e+04 |
| Real Det Return     | 3.8e+03   |
| Real Sto Return     | 3.85e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 1355000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 10.8      |
| Running Update Time | 271       |
-----------------------------------
--2024-08-11 20:27:56.002594 UTC---
| Itration            | 272       |
| PAGAR Loss          | -8.36e+03 |
| Real Det Return     | 4.19e+03  |
| Real Sto Return     | 3.9e+03   |
| Reward Loss         | -3.96e+06 |
| Running Env Steps   | 1360000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 6.7       |
| Running Update Time | 272       |
-----------------------------------
--2024-08-11 20:30:22.266264 UTC---
| Itration            | 273       |
| PAGAR Loss          | 2.26e+04  |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.76e+03  |
| Reward Loss         | -4.07e+06 |
| Running Env Steps   | 1365000   |
| Running Forward KL  | 12.1      |
| Running Reverse KL  | 6.77      |
| Running Update Time | 273       |
-----------------------------------
--2024-08-11 20:32:50.104758 UTC---
| Itration            | 274       |
| PAGAR Loss          | -1.24e+03 |
| Real Det Return     | 3.74e+03  |
| Real Sto Return     | 3.69e+03  |
| Reward Loss         | -4.46e+06 |
| Running Env Steps   | 1370000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 7.4       |
| Running Update Time | 274       |
-----------------------------------
--2024-08-11 20:35:16.467950 UTC---
| Itration            | 275       |
| PAGAR Loss          | 2.12e+03  |
| Real Det Return     | 3.84e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -4.41e+06 |
| Running Env Steps   | 1375000   |
| Running Forward KL  | 12.1      |
| Running Reverse KL  | 7.16      |
| Running Update Time | 275       |
-----------------------------------
--2024-08-11 20:37:43.361847 UTC---
| Itration            | 276       |
| PAGAR Loss          | -8.98e+03 |
| Real Det Return     | 4e+03     |
| Real Sto Return     | 3.58e+03  |
| Reward Loss         | -3.83e+06 |
| Running Env Steps   | 1380000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 44.6      |
| Running Update Time | 276       |
-----------------------------------
--2024-08-11 20:40:10.121182 UTC---
| Itration            | 277       |
| PAGAR Loss          | 342       |
| Real Det Return     | 3.75e+03  |
| Real Sto Return     | 3.65e+03  |
| Reward Loss         | -4.75e+06 |
| Running Env Steps   | 1385000   |
| Running Forward KL  | 12.1      |
| Running Reverse KL  | 6.67      |
| Running Update Time | 277       |
-----------------------------------
--2024-08-11 20:42:35.209315 UTC---
| Itration            | 278       |
| PAGAR Loss          | -2.52e+03 |
| Real Det Return     | 3.54e+03  |
| Real Sto Return     | 3.49e+03  |
| Reward Loss         | -5.43e+06 |
| Running Env Steps   | 1390000   |
| Running Forward KL  | 13        |
| Running Reverse KL  | 46.1      |
| Running Update Time | 278       |
-----------------------------------
--2024-08-11 20:45:04.070189 UTC--
| Itration            | 279      |
| PAGAR Loss          | 9.81e+04 |
| Real Det Return     | 3.79e+03 |
| Real Sto Return     | 3.82e+03 |
| Reward Loss         | -4.4e+06 |
| Running Env Steps   | 1395000  |
| Running Forward KL  | 12.5     |
| Running Reverse KL  | 7.12     |
| Running Update Time | 279      |
----------------------------------
--2024-08-11 20:47:26.010115 UTC---
| Itration            | 280       |
| PAGAR Loss          | -1.47e+05 |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 3.61e+03  |
| Reward Loss         | -5.49e+06 |
| Running Env Steps   | 1400000   |
| Running Forward KL  | 12.6      |
| Running Reverse KL  | 47.1      |
| Running Update Time | 280       |
-----------------------------------
--2024-08-11 20:49:53.060820 UTC---
| Itration            | 281       |
| PAGAR Loss          | 1.34e+04  |
| Real Det Return     | 3.95e+03  |
| Real Sto Return     | 3.46e+03  |
| Reward Loss         | -4.43e+06 |
| Running Env Steps   | 1405000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 10.8      |
| Running Update Time | 281       |
-----------------------------------
--2024-08-11 20:52:17.748665 UTC---
| Itration            | 282       |
| PAGAR Loss          | 1.24e+04  |
| Real Det Return     | 3.83e+03  |
| Real Sto Return     | 3.63e+03  |
| Reward Loss         | -4.38e+06 |
| Running Env Steps   | 1410000   |
| Running Forward KL  | 12.4      |
| Running Reverse KL  | 6.94      |
| Running Update Time | 282       |
-----------------------------------
--2024-08-11 20:54:44.624622 UTC---
| Itration            | 283       |
| PAGAR Loss          | 7.09e+03  |
| Real Det Return     | 3.82e+03  |
| Real Sto Return     | 3.98e+03  |
| Reward Loss         | -3.91e+06 |
| Running Env Steps   | 1415000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 20.6      |
| Running Update Time | 283       |
-----------------------------------
--2024-08-11 20:57:11.658696 UTC---
| Itration            | 284       |
| PAGAR Loss          | -1.56e+04 |
| Real Det Return     | 3.76e+03  |
| Real Sto Return     | 3.51e+03  |
| Reward Loss         | -4.62e+06 |
| Running Env Steps   | 1420000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 7.07      |
| Running Update Time | 284       |
-----------------------------------
--2024-08-11 20:59:37.496310 UTC---
| Itration            | 285       |
| PAGAR Loss          | -4.03e+03 |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.43e+03  |
| Reward Loss         | -4.08e+06 |
| Running Env Steps   | 1425000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 19.4      |
| Running Update Time | 285       |
-----------------------------------
--2024-08-11 21:02:05.939243 UTC---
| Itration            | 286       |
| PAGAR Loss          | -6.94e+03 |
| Real Det Return     | 4.11e+03  |
| Real Sto Return     | 3.83e+03  |
| Reward Loss         | -5.09e+06 |
| Running Env Steps   | 1430000   |
| Running Forward KL  | 12.7      |
| Running Reverse KL  | 44.6      |
| Running Update Time | 286       |
-----------------------------------
--2024-08-11 21:04:30.141909 UTC---
| Itration            | 287       |
| PAGAR Loss          | 5.7e+03   |
| Real Det Return     | 4.05e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -4.04e+06 |
| Running Env Steps   | 1435000   |
| Running Forward KL  | 11.9      |
| Running Reverse KL  | 7.1       |
| Running Update Time | 287       |
-----------------------------------
--2024-08-11 21:06:55.585911 UTC--
| Itration            | 288      |
| PAGAR Loss          | 7.29e+03 |
| Real Det Return     | 3.93e+03 |
| Real Sto Return     | 3.47e+03 |
| Reward Loss         | -4.6e+06 |
| Running Env Steps   | 1440000  |
| Running Forward KL  | 12.3     |
| Running Reverse KL  | 7.23     |
| Running Update Time | 288      |
----------------------------------
--2024-08-11 21:09:21.873983 UTC---
| Itration            | 289       |
| PAGAR Loss          | 2.88e+03  |
| Real Det Return     | 3.78e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -4.28e+06 |
| Running Env Steps   | 1445000   |
| Running Forward KL  | 12        |
| Running Reverse KL  | 7.24      |
| Running Update Time | 289       |
-----------------------------------
--2024-08-11 21:11:48.852604 UTC---
| Itration            | 290       |
| PAGAR Loss          | -63.7     |
| Real Det Return     | 4.09e+03  |
| Real Sto Return     | 3.99e+03  |
| Reward Loss         | -4.14e+06 |
| Running Env Steps   | 1450000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 6.95      |
| Running Update Time | 290       |
-----------------------------------
--2024-08-11 21:14:15.666678 UTC---
| Itration            | 291       |
| PAGAR Loss          | 1.81e+04  |
| Real Det Return     | 3.94e+03  |
| Real Sto Return     | 3.96e+03  |
| Reward Loss         | -4.19e+06 |
| Running Env Steps   | 1455000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 6.72      |
| Running Update Time | 291       |
-----------------------------------
--2024-08-11 21:16:40.015928 UTC---
| Itration            | 292       |
| PAGAR Loss          | nan       |
| Real Det Return     | 3.89e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1460000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 22.7      |
| Running Update Time | 292       |
-----------------------------------
--2024-08-11 21:19:07.843064 UTC---
| Itration            | 293       |
| PAGAR Loss          | -4.76e+03 |
| Real Det Return     | 3.94e+03  |
| Real Sto Return     | 3.83e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1465000   |
| Running Forward KL  | 12.2      |
| Running Reverse KL  | 7.3       |
| Running Update Time | 293       |
-----------------------------------
--2024-08-11 21:21:30.923865 UTC---
| Itration            | 294       |
| PAGAR Loss          | -2.1e+05  |
| Real Det Return     | 4.01e+03  |
| Real Sto Return     | 3.96e+03  |
| Reward Loss         | -4.78e+06 |
| Running Env Steps   | 1470000   |
| Running Forward KL  | 12.1      |
| Running Reverse KL  | 46.3      |
| Running Update Time | 294       |
-----------------------------------
--2024-08-11 21:23:56.693181 UTC---
| Itration            | 295       |
| PAGAR Loss          | -4.29e+04 |
| Real Det Return     | 4.15e+03  |
| Real Sto Return     | 3.5e+03   |
| Reward Loss         | -4.44e+06 |
| Running Env Steps   | 1475000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 57.4      |
| Running Update Time | 295       |
-----------------------------------
--2024-08-11 21:26:23.300901 UTC---
| Itration            | 296       |
| PAGAR Loss          | 8.79e+03  |
| Real Det Return     | 4.22e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 1480000   |
| Running Forward KL  | 12.3      |
| Running Reverse KL  | 45.1      |
| Running Update Time | 296       |
-----------------------------------
--2024-08-11 21:28:48.885748 UTC---
| Itration            | 297       |
| PAGAR Loss          | -3.24e+03 |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -4.45e+06 |
| Running Env Steps   | 1485000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 6.52      |
| Running Update Time | 297       |
-----------------------------------
--2024-08-11 21:31:15.735345 UTC---
| Itration            | 298       |
| PAGAR Loss          | 6.88e+04  |
| Real Det Return     | 3.99e+03  |
| Real Sto Return     | 3.91e+03  |
| Reward Loss         | -4.06e+06 |
| Running Env Steps   | 1490000   |
| Running Forward KL  | 11.8      |
| Running Reverse KL  | 6.91      |
| Running Update Time | 298       |
-----------------------------------
--2024-08-11 21:33:39.886994 UTC---
| Itration            | 299       |
| PAGAR Loss          | -7.72e+03 |
| Real Det Return     | 3.98e+03  |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 1495000   |
| Running Forward KL  | 11.1      |
| Running Reverse KL  | 15.6      |
| Running Update Time | 299       |
-----------------------------------
--2024-08-11 21:36:05.669399 UTC---
| Itration            | 300       |
| PAGAR Loss          | 9.52e+03  |
| Real Det Return     | 4.06e+03  |
| Real Sto Return     | 4.1e+03   |
| Reward Loss         | -3.76e+06 |
| Running Env Steps   | 1500000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 6.8       |
| Running Update Time | 300       |
-----------------------------------
--2024-08-11 21:38:31.956946 UTC---
| Itration            | 301       |
| PAGAR Loss          | 6.32e+04  |
| Real Det Return     | 4.21e+03  |
| Real Sto Return     | 4.06e+03  |
| Reward Loss         | -4.11e+06 |
| Running Env Steps   | 1505000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 6.87      |
| Running Update Time | 301       |
-----------------------------------
--2024-08-11 21:40:56.799323 UTC---
| Itration            | 302       |
| PAGAR Loss          | 1.2e+04   |
| Real Det Return     | 4.01e+03  |
| Real Sto Return     | 3.99e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1510000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 73.2      |
| Running Update Time | 302       |
-----------------------------------
--2024-08-11 21:43:24.451327 UTC---
| Itration            | 303       |
| PAGAR Loss          | -4.11e+03 |
| Real Det Return     | 3.93e+03  |
| Real Sto Return     | 4.02e+03  |
| Reward Loss         | -4.17e+06 |
| Running Env Steps   | 1515000   |
| Running Forward KL  | 11.4      |
| Running Reverse KL  | 6.45      |
| Running Update Time | 303       |
-----------------------------------
--2024-08-11 21:45:41.603269 UTC---
| Itration            | 304       |
| PAGAR Loss          | -9.92e+04 |
| Real Det Return     | 2.51e+03  |
| Real Sto Return     | 3.68e+03  |
| Reward Loss         | -3.96e+06 |
| Running Env Steps   | 1520000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 43.6      |
| Running Update Time | 304       |
-----------------------------------
--2024-08-11 21:48:07.786010 UTC---
| Itration            | 305       |
| PAGAR Loss          | -1.85e+04 |
| Real Det Return     | 3.95e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -4.33e+06 |
| Running Env Steps   | 1525000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 10.2      |
| Running Update Time | 305       |
-----------------------------------
--2024-08-11 21:50:34.259474 UTC---
| Itration            | 306       |
| PAGAR Loss          | 8.34e+03  |
| Real Det Return     | 4.07e+03  |
| Real Sto Return     | 3.9e+03   |
| Reward Loss         | -4.31e+06 |
| Running Env Steps   | 1530000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 44.8      |
| Running Update Time | 306       |
-----------------------------------
--2024-08-11 21:52:59.971272 UTC---
| Itration            | 307       |
| PAGAR Loss          | -4.09e+04 |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 3.62e+03  |
| Reward Loss         | -4.4e+06  |
| Running Env Steps   | 1535000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 50.9      |
| Running Update Time | 307       |
-----------------------------------
--2024-08-11 21:55:25.198157 UTC---
| Itration            | 308       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.11e+03  |
| Real Sto Return     | 3.57e+03  |
| Reward Loss         | -4.46e+06 |
| Running Env Steps   | 1540000   |
| Running Forward KL  | 11.1      |
| Running Reverse KL  | 27.9      |
| Running Update Time | 308       |
-----------------------------------
--2024-08-11 21:57:47.676809 UTC---
| Itration            | 309       |
| PAGAR Loss          | -2.1e+04  |
| Real Det Return     | 3.06e+03  |
| Real Sto Return     | 3.83e+03  |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1545000   |
| Running Forward KL  | 11.5      |
| Running Reverse KL  | 34.7      |
| Running Update Time | 309       |
-----------------------------------
--2024-08-11 22:00:14.276883 UTC---
| Itration            | 310       |
| PAGAR Loss          | 1.99e+04  |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 4.18e+03  |
| Reward Loss         | -3.94e+06 |
| Running Env Steps   | 1550000   |
| Running Forward KL  | 11.6      |
| Running Reverse KL  | 8.11      |
| Running Update Time | 310       |
-----------------------------------
--2024-08-11 22:02:36.738526 UTC---
| Itration            | 311       |
| PAGAR Loss          | -5.86e+04 |
| Real Det Return     | 3.58e+03  |
| Real Sto Return     | 3.74e+03  |
| Reward Loss         | -4.58e+06 |
| Running Env Steps   | 1555000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 23.9      |
| Running Update Time | 311       |
-----------------------------------
--2024-08-11 22:04:59.377759 UTC---
| Itration            | 312       |
| PAGAR Loss          | -1.39e+04 |
| Real Det Return     | 4.17e+03  |
| Real Sto Return     | 3.06e+03  |
| Reward Loss         | -4.49e+06 |
| Running Env Steps   | 1560000   |
| Running Forward KL  | 11.4      |
| Running Reverse KL  | 73.5      |
| Running Update Time | 312       |
-----------------------------------
--2024-08-11 22:07:21.409422 UTC---
| Itration            | 313       |
| PAGAR Loss          | 6.47e+03  |
| Real Det Return     | 3.7e+03   |
| Real Sto Return     | 3.64e+03  |
| Reward Loss         | -4.45e+06 |
| Running Env Steps   | 1565000   |
| Running Forward KL  | 11.1      |
| Running Reverse KL  | 26.3      |
| Running Update Time | 313       |
-----------------------------------
--2024-08-11 22:09:48.798189 UTC---
| Itration            | 314       |
| PAGAR Loss          | 4.54e+03  |
| Real Det Return     | 4.05e+03  |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -4.48e+06 |
| Running Env Steps   | 1570000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 14.4      |
| Running Update Time | 314       |
-----------------------------------
--2024-08-11 22:12:12.285206 UTC---
| Itration            | 315       |
| PAGAR Loss          | -9.14e+03 |
| Real Det Return     | 3.69e+03  |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1575000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 40.2      |
| Running Update Time | 315       |
-----------------------------------
--2024-08-11 22:14:58.535320 UTC---
| Itration            | 316       |
| PAGAR Loss          | 1.83e+04  |
| Real Det Return     | 4.24e+03  |
| Real Sto Return     | 4.09e+03  |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 1580000   |
| Running Forward KL  | 10.9      |
| Running Reverse KL  | 6.26      |
| Running Update Time | 316       |
-----------------------------------
--2024-08-11 22:17:37.822315 UTC---
| Itration            | 317       |
| PAGAR Loss          | -4.84e+04 |
| Real Det Return     | 3.15e+03  |
| Real Sto Return     | 3.18e+03  |
| Reward Loss         | -5.86e+06 |
| Running Env Steps   | 1585000   |
| Running Forward KL  | 11.9      |
| Running Reverse KL  | 74.7      |
| Running Update Time | 317       |
-----------------------------------
--2024-08-11 22:20:22.651188 UTC---
| Itration            | 318       |
| PAGAR Loss          | -3.84e+04 |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 2.95e+03  |
| Reward Loss         | -4.53e+06 |
| Running Env Steps   | 1590000   |
| Running Forward KL  | 11.3      |
| Running Reverse KL  | 66.1      |
| Running Update Time | 318       |
-----------------------------------
--2024-08-11 22:23:09.003316 UTC---
| Itration            | 319       |
| PAGAR Loss          | 3.02e+05  |
| Real Det Return     | 4.3e+03   |
| Real Sto Return     | 3.92e+03  |
| Reward Loss         | -4.27e+06 |
| Running Env Steps   | 1595000   |
| Running Forward KL  | 11.4      |
| Running Reverse KL  | 94.9      |
| Running Update Time | 319       |
-----------------------------------
--2024-08-11 22:25:58.885398 UTC---
| Itration            | 320       |
| PAGAR Loss          | 1.43e+04  |
| Real Det Return     | 4.32e+03  |
| Real Sto Return     | 3.84e+03  |
| Reward Loss         | -4.24e+06 |
| Running Env Steps   | 1600000   |
| Running Forward KL  | 11.2      |
| Running Reverse KL  | 6.17      |
| Running Update Time | 320       |
-----------------------------------
--2024-08-11 22:28:48.314028 UTC---
| Itration            | 321       |
| PAGAR Loss          | 1.05e+04  |
| Real Det Return     | 3.98e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -4.02e+06 |
| Running Env Steps   | 1605000   |
| Running Forward KL  | 10.9      |
| Running Reverse KL  | 5.43      |
| Running Update Time | 321       |
-----------------------------------
--2024-08-11 22:31:34.010395 UTC---
| Itration            | 322       |
| PAGAR Loss          | -1.4e+05  |
| Real Det Return     | 3.71e+03  |
| Real Sto Return     | 3e+03     |
| Reward Loss         | -5.79e+06 |
| Running Env Steps   | 1610000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 146       |
| Running Update Time | 322       |
-----------------------------------
--2024-08-11 22:34:26.792321 UTC---
| Itration            | 323       |
| PAGAR Loss          | 4.81e+03  |
| Real Det Return     | 4.3e+03   |
| Real Sto Return     | 4.09e+03  |
| Reward Loss         | -4.22e+06 |
| Running Env Steps   | 1615000   |
| Running Forward KL  | 11.1      |
| Running Reverse KL  | 6         |
| Running Update Time | 323       |
-----------------------------------
--2024-08-11 22:37:16.463309 UTC---
| Itration            | 324       |
| PAGAR Loss          | 3.67e+04  |
| Real Det Return     | 4.09e+03  |
| Real Sto Return     | 3.82e+03  |
| Reward Loss         | -5.15e+06 |
| Running Env Steps   | 1620000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 44.5      |
| Running Update Time | 324       |
-----------------------------------
--2024-08-11 22:40:10.330688 UTC---
| Itration            | 325       |
| PAGAR Loss          | -2.54e+04 |
| Real Det Return     | 4.23e+03  |
| Real Sto Return     | 4.21e+03  |
| Reward Loss         | -4.42e+06 |
| Running Env Steps   | 1625000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 44.2      |
| Running Update Time | 325       |
-----------------------------------
--2024-08-11 22:42:59.993930 UTC---
| Itration            | 326       |
| PAGAR Loss          | 1.14e+04  |
| Real Det Return     | 4.08e+03  |
| Real Sto Return     | 4.03e+03  |
| Reward Loss         | -3.55e+06 |
| Running Env Steps   | 1630000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 39.3      |
| Running Update Time | 326       |
-----------------------------------
--2024-08-11 22:45:54.491593 UTC---
| Itration            | 327       |
| PAGAR Loss          | 8.68e+03  |
| Real Det Return     | 4.11e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -4.54e+06 |
| Running Env Steps   | 1635000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 5.7       |
| Running Update Time | 327       |
-----------------------------------
--2024-08-11 22:48:45.767118 UTC---
| Itration            | 328       |
| PAGAR Loss          | -2.38e+04 |
| Real Det Return     | 4.3e+03   |
| Real Sto Return     | 3.97e+03  |
| Reward Loss         | -3.59e+06 |
| Running Env Steps   | 1640000   |
| Running Forward KL  | 10.6      |
| Running Reverse KL  | 5.05      |
| Running Update Time | 328       |
-----------------------------------
--2024-08-11 22:51:37.918818 UTC---
| Itration            | 329       |
| PAGAR Loss          | -5.94e+03 |
| Real Det Return     | 4.2e+03   |
| Real Sto Return     | 3.82e+03  |
| Reward Loss         | -4.1e+06  |
| Running Env Steps   | 1645000   |
| Running Forward KL  | 10.2      |
| Running Reverse KL  | 5.63      |
| Running Update Time | 329       |
-----------------------------------
--2024-08-11 22:54:30.641742 UTC---
| Itration            | 330       |
| PAGAR Loss          | -3.97e+03 |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 3.7e+03   |
| Reward Loss         | -3.58e+06 |
| Running Env Steps   | 1650000   |
| Running Forward KL  | 9.7       |
| Running Reverse KL  | 4.31      |
| Running Update Time | 330       |
-----------------------------------
--2024-08-11 22:57:18.653380 UTC---
| Itration            | 331       |
| PAGAR Loss          | -2.98e+04 |
| Real Det Return     | 4.37e+03  |
| Real Sto Return     | 2.95e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 1655000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 32.3      |
| Running Update Time | 331       |
-----------------------------------
--2024-08-11 23:00:11.065085 UTC---
| Itration            | 332       |
| PAGAR Loss          | 2.51e+04  |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -3.67e+06 |
| Running Env Steps   | 1660000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 5.85      |
| Running Update Time | 332       |
-----------------------------------
--2024-08-11 23:03:00.624747 UTC---
| Itration            | 333       |
| PAGAR Loss          | -1.25e+05 |
| Real Det Return     | 4.31e+03  |
| Real Sto Return     | 3.89e+03  |
| Reward Loss         | -3.71e+06 |
| Running Env Steps   | 1665000   |
| Running Forward KL  | 9.91      |
| Running Reverse KL  | 7.66      |
| Running Update Time | 333       |
-----------------------------------
--2024-08-11 23:05:54.301697 UTC---
| Itration            | 334       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 4e+03     |
| Reward Loss         | -4.94e+06 |
| Running Env Steps   | 1670000   |
| Running Forward KL  | 10.8      |
| Running Reverse KL  | 11        |
| Running Update Time | 334       |
-----------------------------------
--2024-08-11 23:08:46.722065 UTC---
| Itration            | 335       |
| PAGAR Loss          | 4.06e+04  |
| Real Det Return     | 4.29e+03  |
| Real Sto Return     | 3.64e+03  |
| Reward Loss         | -3.99e+06 |
| Running Env Steps   | 1675000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 5.99      |
| Running Update Time | 335       |
-----------------------------------
--2024-08-11 23:11:39.707270 UTC---
| Itration            | 336       |
| PAGAR Loss          | -4.42e+04 |
| Real Det Return     | 4.41e+03  |
| Real Sto Return     | 3.6e+03   |
| Reward Loss         | -3.92e+06 |
| Running Env Steps   | 1680000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 54.9      |
| Running Update Time | 336       |
-----------------------------------
--2024-08-11 23:14:34.963387 UTC---
| Itration            | 337       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.39e+03  |
| Real Sto Return     | 4.32e+03  |
| Reward Loss         | -5.11e+06 |
| Running Env Steps   | 1685000   |
| Running Forward KL  | 10.4      |
| Running Reverse KL  | 54.6      |
| Running Update Time | 337       |
-----------------------------------
--2024-08-11 23:17:15.704235 UTC---
| Itration            | 338       |
| PAGAR Loss          | -4.5e+04  |
| Real Det Return     | 4.28e+03  |
| Real Sto Return     | 3.81e+03  |
| Reward Loss         | -3.46e+06 |
| Running Env Steps   | 1690000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 36.2      |
| Running Update Time | 338       |
-----------------------------------
--2024-08-11 23:19:40.171159 UTC---
| Itration            | 339       |
| PAGAR Loss          | -1.28e+04 |
| Real Det Return     | 4.27e+03  |
| Real Sto Return     | 4.02e+03  |
| Reward Loss         | -3.61e+06 |
| Running Env Steps   | 1695000   |
| Running Forward KL  | 10.7      |
| Running Reverse KL  | 70.9      |
| Running Update Time | 339       |
-----------------------------------
--2024-08-11 23:22:05.345184 UTC---
| Itration            | 340       |
| PAGAR Loss          | 3.73e+04  |
| Real Det Return     | 4.36e+03  |
| Real Sto Return     | 4.12e+03  |
| Reward Loss         | -3.27e+06 |
| Running Env Steps   | 1700000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 6.16      |
| Running Update Time | 340       |
-----------------------------------
--2024-08-11 23:24:32.707361 UTC---
| Itration            | 341       |
| PAGAR Loss          | -8.61e+03 |
| Real Det Return     | 4.45e+03  |
| Real Sto Return     | 4.24e+03  |
| Reward Loss         | -2.86e+06 |
| Running Env Steps   | 1705000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 5.49      |
| Running Update Time | 341       |
-----------------------------------
--2024-08-11 23:26:53.711054 UTC--
| Itration            | 342      |
| PAGAR Loss          | 6.05e+03 |
| Real Det Return     | 4.43e+03 |
| Real Sto Return     | 3.68e+03 |
| Reward Loss         | -3.2e+06 |
| Running Env Steps   | 1710000  |
| Running Forward KL  | 9.72     |
| Running Reverse KL  | 35       |
| Running Update Time | 342      |
----------------------------------
--2024-08-11 23:29:21.424469 UTC---
| Itration            | 343       |
| PAGAR Loss          | -3.64e+03 |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -3.14e+06 |
| Running Env Steps   | 1715000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 5.46      |
| Running Update Time | 343       |
-----------------------------------
--2024-08-11 23:31:45.684124 UTC---
| Itration            | 344       |
| PAGAR Loss          | -7.19e+03 |
| Real Det Return     | 4.48e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -2.95e+06 |
| Running Env Steps   | 1720000   |
| Running Forward KL  | 9.89      |
| Running Reverse KL  | 5.86      |
| Running Update Time | 344       |
-----------------------------------
--2024-08-11 23:34:13.095368 UTC---
| Itration            | 345       |
| PAGAR Loss          | 6.39e+04  |
| Real Det Return     | 4.73e+03  |
| Real Sto Return     | 4.48e+03  |
| Reward Loss         | -2.49e+06 |
| Running Env Steps   | 1725000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 5.92      |
| Running Update Time | 345       |
-----------------------------------
--2024-08-11 23:36:40.069964 UTC---
| Itration            | 346       |
| PAGAR Loss          | -1.56e+03 |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -2.47e+06 |
| Running Env Steps   | 1730000   |
| Running Forward KL  | 9.41      |
| Running Reverse KL  | 5.39      |
| Running Update Time | 346       |
-----------------------------------
--2024-08-11 23:39:06.469401 UTC---
| Itration            | 347       |
| PAGAR Loss          | 8.5e+04   |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -2.42e+06 |
| Running Env Steps   | 1735000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 5.61      |
| Running Update Time | 347       |
-----------------------------------
--2024-08-11 23:41:31.342252 UTC---
| Itration            | 348       |
| PAGAR Loss          | 3.97e+05  |
| Real Det Return     | 4.86e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -2.31e+06 |
| Running Env Steps   | 1740000   |
| Running Forward KL  | 9.85      |
| Running Reverse KL  | 5.47      |
| Running Update Time | 348       |
-----------------------------------
--2024-08-11 23:43:56.801963 UTC--
| Itration            | 349      |
| PAGAR Loss          | 6.13e+04 |
| Real Det Return     | 4.43e+03 |
| Real Sto Return     | 4.24e+03 |
| Reward Loss         | -3.4e+06 |
| Running Env Steps   | 1745000  |
| Running Forward KL  | 10.5     |
| Running Reverse KL  | 6.56     |
| Running Update Time | 349      |
----------------------------------
--2024-08-11 23:46:24.736022 UTC--
| Itration            | 350      |
| PAGAR Loss          | 6.73e+03 |
| Real Det Return     | 4.72e+03 |
| Real Sto Return     | 4.6e+03  |
| Reward Loss         | -2.3e+06 |
| Running Env Steps   | 1750000  |
| Running Forward KL  | 9.36     |
| Running Reverse KL  | 44.6     |
| Running Update Time | 350      |
----------------------------------
--2024-08-11 23:48:50.080361 UTC---
| Itration            | 351       |
| PAGAR Loss          | -1.21e+03 |
| Real Det Return     | 4.86e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 1755000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 5.8       |
| Running Update Time | 351       |
-----------------------------------
--2024-08-11 23:51:15.346501 UTC---
| Itration            | 352       |
| PAGAR Loss          | -1.42e+03 |
| Real Det Return     | 4.81e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -1.85e+06 |
| Running Env Steps   | 1760000   |
| Running Forward KL  | 9.56      |
| Running Reverse KL  | 5.63      |
| Running Update Time | 352       |
-----------------------------------
--2024-08-11 23:53:41.769495 UTC---
| Itration            | 353       |
| PAGAR Loss          | -2.56e+04 |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -1.96e+06 |
| Running Env Steps   | 1765000   |
| Running Forward KL  | 9.85      |
| Running Reverse KL  | 34        |
| Running Update Time | 353       |
-----------------------------------
--2024-08-11 23:56:08.222621 UTC--
| Itration            | 354      |
| PAGAR Loss          | -2.5e+03 |
| Real Det Return     | 4.64e+03 |
| Real Sto Return     | 4.45e+03 |
| Reward Loss         | -3.2e+06 |
| Running Env Steps   | 1770000  |
| Running Forward KL  | 9.79     |
| Running Reverse KL  | 47.8     |
| Running Update Time | 354      |
----------------------------------
--2024-08-11 23:58:35.345337 UTC---
| Itration            | 355       |
| PAGAR Loss          | 8.55e+03  |
| Real Det Return     | 4.87e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 1775000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 355       |
-----------------------------------
--2024-08-12 00:00:57.237288 UTC---
| Itration            | 356       |
| PAGAR Loss          | 1.44e+04  |
| Real Det Return     | 4.66e+03  |
| Real Sto Return     | 4.03e+03  |
| Reward Loss         | -2.19e+06 |
| Running Env Steps   | 1780000   |
| Running Forward KL  | 9.02      |
| Running Reverse KL  | 5.27      |
| Running Update Time | 356       |
-----------------------------------
--2024-08-12 00:03:23.937340 UTC---
| Itration            | 357       |
| PAGAR Loss          | -9.25e+04 |
| Real Det Return     | 4.79e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -3.76e+06 |
| Running Env Steps   | 1785000   |
| Running Forward KL  | 10        |
| Running Reverse KL  | 110       |
| Running Update Time | 357       |
-----------------------------------
--2024-08-12 00:05:50.259049 UTC---
| Itration            | 358       |
| PAGAR Loss          | 5.07e+04  |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 1790000   |
| Running Forward KL  | 9.41      |
| Running Reverse KL  | 5.62      |
| Running Update Time | 358       |
-----------------------------------
--2024-08-12 00:08:16.390839 UTC---
| Itration            | 359       |
| PAGAR Loss          | -328      |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 1795000   |
| Running Forward KL  | 9.64      |
| Running Reverse KL  | 5.92      |
| Running Update Time | 359       |
-----------------------------------
--2024-08-12 00:10:44.806409 UTC---
| Itration            | 360       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5e+03     |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.53e+06 |
| Running Env Steps   | 1800000   |
| Running Forward KL  | 8.71      |
| Running Reverse KL  | 9.53      |
| Running Update Time | 360       |
-----------------------------------
--2024-08-12 00:13:09.662946 UTC---
| Itration            | 361       |
| PAGAR Loss          | 2.74e+04  |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 1805000   |
| Running Forward KL  | 9.19      |
| Running Reverse KL  | 5.36      |
| Running Update Time | 361       |
-----------------------------------
--2024-08-12 00:15:37.947687 UTC---
| Itration            | 362       |
| PAGAR Loss          | 2.93e+04  |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -9.85e+05 |
| Running Env Steps   | 1810000   |
| Running Forward KL  | 9.18      |
| Running Reverse KL  | 5         |
| Running Update Time | 362       |
-----------------------------------
--2024-08-12 00:18:02.139779 UTC---
| Itration            | 363       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 1815000   |
| Running Forward KL  | 8.11      |
| Running Reverse KL  | 4.04      |
| Running Update Time | 363       |
-----------------------------------
--2024-08-12 00:20:28.689406 UTC--
| Itration            | 364      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.14e+03 |
| Real Sto Return     | 4.51e+03 |
| Reward Loss         | -1.4e+06 |
| Running Env Steps   | 1820000  |
| Running Forward KL  | 9.42     |
| Running Reverse KL  | 52.5     |
| Running Update Time | 364      |
----------------------------------
--2024-08-12 00:22:51.575412 UTC---
| Itration            | 365       |
| PAGAR Loss          | -6.41e+03 |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.35e+03  |
| Reward Loss         | -2.94e+06 |
| Running Env Steps   | 1825000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 121       |
| Running Update Time | 365       |
-----------------------------------
--2024-08-12 00:25:18.418984 UTC---
| Itration            | 366       |
| PAGAR Loss          | 5.6e+04   |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -8.31e+05 |
| Running Env Steps   | 1830000   |
| Running Forward KL  | 9.21      |
| Running Reverse KL  | 6.09      |
| Running Update Time | 366       |
-----------------------------------
--2024-08-12 00:27:45.547415 UTC---
| Itration            | 367       |
| PAGAR Loss          | 1.01e+04  |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 1835000   |
| Running Forward KL  | 9.5       |
| Running Reverse KL  | 5.62      |
| Running Update Time | 367       |
-----------------------------------
--2024-08-12 00:30:11.124079 UTC---
| Itration            | 368       |
| PAGAR Loss          | -1.27e+04 |
| Real Det Return     | 4.86e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 1840000   |
| Running Forward KL  | 11        |
| Running Reverse KL  | 6.81      |
| Running Update Time | 368       |
-----------------------------------
--2024-08-12 00:32:37.900622 UTC---
| Itration            | 369       |
| PAGAR Loss          | -587      |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -1.73e+06 |
| Running Env Steps   | 1845000   |
| Running Forward KL  | 9.48      |
| Running Reverse KL  | 5.31      |
| Running Update Time | 369       |
-----------------------------------
--2024-08-12 00:35:03.968634 UTC---
| Itration            | 370       |
| PAGAR Loss          | 4.83e+03  |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 1850000   |
| Running Forward KL  | 8.99      |
| Running Reverse KL  | 5.54      |
| Running Update Time | 370       |
-----------------------------------
--2024-08-12 00:37:29.954217 UTC---
| Itration            | 371       |
| PAGAR Loss          | -1.23e+05 |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.29e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 1855000   |
| Running Forward KL  | 8.88      |
| Running Reverse KL  | 5.47      |
| Running Update Time | 371       |
-----------------------------------
--2024-08-12 00:39:51.850131 UTC---
| Itration            | 372       |
| PAGAR Loss          | -2.87e+04 |
| Real Det Return     | 4.6e+03   |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -9.54e+05 |
| Running Env Steps   | 1860000   |
| Running Forward KL  | 8.68      |
| Running Reverse KL  | 4.95      |
| Running Update Time | 372       |
-----------------------------------
--2024-08-12 00:42:17.380584 UTC---
| Itration            | 373       |
| PAGAR Loss          | -4.18e+04 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 1865000   |
| Running Forward KL  | 9.32      |
| Running Reverse KL  | 5.36      |
| Running Update Time | 373       |
-----------------------------------
--2024-08-12 00:44:42.546854 UTC---
| Itration            | 374       |
| PAGAR Loss          | -6.57e+04 |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -2.1e+06  |
| Running Env Steps   | 1870000   |
| Running Forward KL  | 9.08      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 374       |
-----------------------------------
--2024-08-12 00:47:05.292996 UTC---
| Itration            | 375       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.72e+03  |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -3.92e+06 |
| Running Env Steps   | 1875000   |
| Running Forward KL  | 9.45      |
| Running Reverse KL  | 81.8      |
| Running Update Time | 375       |
-----------------------------------
--2024-08-12 00:49:32.614684 UTC---
| Itration            | 376       |
| PAGAR Loss          | -1.09e+04 |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 4.51e+03  |
| Reward Loss         | -2.63e+06 |
| Running Env Steps   | 1880000   |
| Running Forward KL  | 9.47      |
| Running Reverse KL  | 5.62      |
| Running Update Time | 376       |
-----------------------------------
--2024-08-12 00:51:56.727106 UTC---
| Itration            | 377       |
| PAGAR Loss          | -1.52e+04 |
| Real Det Return     | 4.77e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 1885000   |
| Running Forward KL  | 9.03      |
| Running Reverse KL  | 43.6      |
| Running Update Time | 377       |
-----------------------------------
--2024-08-12 00:54:17.311460 UTC---
| Itration            | 378       |
| PAGAR Loss          | -9.49e+07 |
| Real Det Return     | 3.82e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -2.36e+06 |
| Running Env Steps   | 1890000   |
| Running Forward KL  | 10.1      |
| Running Reverse KL  | 87.6      |
| Running Update Time | 378       |
-----------------------------------
--2024-08-12 00:56:42.495211 UTC---
| Itration            | 379       |
| PAGAR Loss          | -2.81e+04 |
| Real Det Return     | 4.89e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -2.49e+06 |
| Running Env Steps   | 1895000   |
| Running Forward KL  | 8.36      |
| Running Reverse KL  | 81.7      |
| Running Update Time | 379       |
-----------------------------------
--2024-08-12 00:59:07.872025 UTC---
| Itration            | 380       |
| PAGAR Loss          | -5.31e+04 |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 3.98e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 1900000   |
| Running Forward KL  | 8.44      |
| Running Reverse KL  | 4.52      |
| Running Update Time | 380       |
-----------------------------------
--2024-08-12 01:01:30.925091 UTC---
| Itration            | 381       |
| PAGAR Loss          | 2.11e+05  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 1905000   |
| Running Forward KL  | 8.38      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 381       |
-----------------------------------
--2024-08-12 01:03:58.071477 UTC---
| Itration            | 382       |
| PAGAR Loss          | 3.59e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 1910000   |
| Running Forward KL  | 7.85      |
| Running Reverse KL  | 4.69      |
| Running Update Time | 382       |
-----------------------------------
--2024-08-12 01:06:22.851472 UTC---
| Itration            | 383       |
| PAGAR Loss          | -3.84e+04 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -7.18e+05 |
| Running Env Steps   | 1915000   |
| Running Forward KL  | 8.25      |
| Running Reverse KL  | 4.53      |
| Running Update Time | 383       |
-----------------------------------
--2024-08-12 01:08:46.867251 UTC---
| Itration            | 384       |
| PAGAR Loss          | -1.15e+05 |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.03e+03  |
| Reward Loss         | -2.34e+06 |
| Running Env Steps   | 1920000   |
| Running Forward KL  | 9.02      |
| Running Reverse KL  | 51.5      |
| Running Update Time | 384       |
-----------------------------------
--2024-08-12 01:11:12.968842 UTC---
| Itration            | 385       |
| PAGAR Loss          | 3.78e+04  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -7.02e+05 |
| Running Env Steps   | 1925000   |
| Running Forward KL  | 8.25      |
| Running Reverse KL  | 5.18      |
| Running Update Time | 385       |
-----------------------------------
--2024-08-12 01:13:38.333237 UTC---
| Itration            | 386       |
| PAGAR Loss          | 7.8e+04   |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 1930000   |
| Running Forward KL  | 8.37      |
| Running Reverse KL  | 43.3      |
| Running Update Time | 386       |
-----------------------------------
--2024-08-12 01:16:06.154342 UTC---
| Itration            | 387       |
| PAGAR Loss          | -1.59e+05 |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 1935000   |
| Running Forward KL  | 8.15      |
| Running Reverse KL  | 24.7      |
| Running Update Time | 387       |
-----------------------------------
--2024-08-12 01:18:30.759746 UTC---
| Itration            | 388       |
| PAGAR Loss          | 1.45e+07  |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -7.21e+05 |
| Running Env Steps   | 1940000   |
| Running Forward KL  | 7.74      |
| Running Reverse KL  | 4.5       |
| Running Update Time | 388       |
-----------------------------------
--2024-08-12 01:20:57.379589 UTC---
| Itration            | 389       |
| PAGAR Loss          | -2.19e+04 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 1945000   |
| Running Forward KL  | 7.85      |
| Running Reverse KL  | 4.62      |
| Running Update Time | 389       |
-----------------------------------
--2024-08-12 01:23:21.010964 UTC---
| Itration            | 390       |
| PAGAR Loss          | -9.32e+04 |
| Real Det Return     | 4.99e+03  |
| Real Sto Return     | 4.32e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 1950000   |
| Running Forward KL  | 8.61      |
| Running Reverse KL  | 43        |
| Running Update Time | 390       |
-----------------------------------
--2024-08-12 01:25:45.716122 UTC---
| Itration            | 391       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.4e+03   |
| Reward Loss         | -1.47e+06 |
| Running Env Steps   | 1955000   |
| Running Forward KL  | 8.23      |
| Running Reverse KL  | 60.3      |
| Running Update Time | 391       |
-----------------------------------
--2024-08-12 01:28:11.414895 UTC---
| Itration            | 392       |
| PAGAR Loss          | -5.45e+03 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -9.34e+05 |
| Running Env Steps   | 1960000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 4.67      |
| Running Update Time | 392       |
-----------------------------------
--2024-08-12 01:30:38.050982 UTC---
| Itration            | 393       |
| PAGAR Loss          | -1.26e+05 |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.48e+06 |
| Running Env Steps   | 1965000   |
| Running Forward KL  | 8.3       |
| Running Reverse KL  | 5.01      |
| Running Update Time | 393       |
-----------------------------------
--2024-08-12 01:33:05.491140 UTC---
| Itration            | 394       |
| PAGAR Loss          | -9.47e+08 |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 1970000   |
| Running Forward KL  | 8.9       |
| Running Reverse KL  | 6.87      |
| Running Update Time | 394       |
-----------------------------------
--2024-08-12 01:35:29.978627 UTC---
| Itration            | 395       |
| PAGAR Loss          | 2.04e+05  |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 1975000   |
| Running Forward KL  | 7.63      |
| Running Reverse KL  | 4.56      |
| Running Update Time | 395       |
-----------------------------------
--2024-08-12 01:37:56.348278 UTC---
| Itration            | 396       |
| PAGAR Loss          | 1.94e+04  |
| Real Det Return     | 5.04e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 1980000   |
| Running Forward KL  | 7.59      |
| Running Reverse KL  | 4.53      |
| Running Update Time | 396       |
-----------------------------------
--2024-08-12 01:40:18.447303 UTC---
| Itration            | 397       |
| PAGAR Loss          | -1.01e+05 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -8.08e+05 |
| Running Env Steps   | 1985000   |
| Running Forward KL  | 7.43      |
| Running Reverse KL  | 4.45      |
| Running Update Time | 397       |
-----------------------------------
--2024-08-12 01:42:46.416394 UTC---
| Itration            | 398       |
| PAGAR Loss          | 8.48e+03  |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 1990000   |
| Running Forward KL  | 7.74      |
| Running Reverse KL  | 4.47      |
| Running Update Time | 398       |
-----------------------------------
--2024-08-12 01:45:09.847574 UTC---
| Itration            | 399       |
| PAGAR Loss          | -1.28e+04 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -9.21e+05 |
| Running Env Steps   | 1995000   |
| Running Forward KL  | 7.68      |
| Running Reverse KL  | 4.84      |
| Running Update Time | 399       |
-----------------------------------
--2024-08-12 01:47:33.589669 UTC---
| Itration            | 400       |
| PAGAR Loss          | -145      |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -8.16e+05 |
| Running Env Steps   | 2000000   |
| Running Forward KL  | 8.28      |
| Running Reverse KL  | 22.5      |
| Running Update Time | 400       |
-----------------------------------
--2024-08-12 01:49:59.225813 UTC--
| Itration            | 401      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.16e+03 |
| Real Sto Return     | 4.76e+03 |
| Reward Loss         | -1.5e+06 |
| Running Env Steps   | 2005000  |
| Running Forward KL  | 8.94     |
| Running Reverse KL  | 22.4     |
| Running Update Time | 401      |
----------------------------------
--2024-08-12 01:52:24.957727 UTC---
| Itration            | 402       |
| PAGAR Loss          | 2.35e+04  |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 2010000   |
| Running Forward KL  | 7.59      |
| Running Reverse KL  | 4.21      |
| Running Update Time | 402       |
-----------------------------------
--2024-08-12 01:54:50.525829 UTC---
| Itration            | 403       |
| PAGAR Loss          | -1.27e+04 |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 2015000   |
| Running Forward KL  | 7.49      |
| Running Reverse KL  | 4.58      |
| Running Update Time | 403       |
-----------------------------------
--2024-08-12 01:57:17.545845 UTC---
| Itration            | 404       |
| PAGAR Loss          | 1.93e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -9.16e+05 |
| Running Env Steps   | 2020000   |
| Running Forward KL  | 7.92      |
| Running Reverse KL  | 5.22      |
| Running Update Time | 404       |
-----------------------------------
--2024-08-12 01:59:44.912212 UTC---
| Itration            | 405       |
| PAGAR Loss          | 2.12e+04  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -8.95e+05 |
| Running Env Steps   | 2025000   |
| Running Forward KL  | 7.9       |
| Running Reverse KL  | 5.12      |
| Running Update Time | 405       |
-----------------------------------
--2024-08-12 02:02:12.546574 UTC---
| Itration            | 406       |
| PAGAR Loss          | 1.54e+04  |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 2030000   |
| Running Forward KL  | 7.81      |
| Running Reverse KL  | 4.55      |
| Running Update Time | 406       |
-----------------------------------
--2024-08-12 02:04:39.391573 UTC---
| Itration            | 407       |
| PAGAR Loss          | 7.7e+03   |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -8.54e+05 |
| Running Env Steps   | 2035000   |
| Running Forward KL  | 7.69      |
| Running Reverse KL  | 5.15      |
| Running Update Time | 407       |
-----------------------------------
--2024-08-12 02:07:07.825616 UTC---
| Itration            | 408       |
| PAGAR Loss          | 2.8e+03   |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -7.71e+05 |
| Running Env Steps   | 2040000   |
| Running Forward KL  | 8.1       |
| Running Reverse KL  | 6.12      |
| Running Update Time | 408       |
-----------------------------------
--2024-08-12 02:09:28.701933 UTC---
| Itration            | 409       |
| PAGAR Loss          | -3.05e+04 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 3.78e+03  |
| Reward Loss         | -2.1e+06  |
| Running Env Steps   | 2045000   |
| Running Forward KL  | 8.76      |
| Running Reverse KL  | 81.1      |
| Running Update Time | 409       |
-----------------------------------
--2024-08-12 02:11:55.002035 UTC---
| Itration            | 410       |
| PAGAR Loss          | 2.65e+03  |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 2050000   |
| Running Forward KL  | 8.75      |
| Running Reverse KL  | 5.55      |
| Running Update Time | 410       |
-----------------------------------
--2024-08-12 02:14:21.590115 UTC---
| Itration            | 411       |
| PAGAR Loss          | -1.37e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -2.43e+06 |
| Running Env Steps   | 2055000   |
| Running Forward KL  | 8.44      |
| Running Reverse KL  | 110       |
| Running Update Time | 411       |
-----------------------------------
--2024-08-12 02:16:47.269112 UTC---
| Itration            | 412       |
| PAGAR Loss          | -2.95e+04 |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.1e+06  |
| Running Env Steps   | 2060000   |
| Running Forward KL  | 7.67      |
| Running Reverse KL  | 4.87      |
| Running Update Time | 412       |
-----------------------------------
--2024-08-12 02:19:13.890249 UTC---
| Itration            | 413       |
| PAGAR Loss          | -2.17e+04 |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2065000   |
| Running Forward KL  | 8.66      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 413       |
-----------------------------------
--2024-08-12 02:21:36.404224 UTC---
| Itration            | 414       |
| PAGAR Loss          | -1.16e+04 |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -2.35e+06 |
| Running Env Steps   | 2070000   |
| Running Forward KL  | 8.84      |
| Running Reverse KL  | 89.5      |
| Running Update Time | 414       |
-----------------------------------
--2024-08-12 02:24:05.867361 UTC---
| Itration            | 415       |
| PAGAR Loss          | -1.43e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -8.63e+05 |
| Running Env Steps   | 2075000   |
| Running Forward KL  | 8.11      |
| Running Reverse KL  | 5.27      |
| Running Update Time | 415       |
-----------------------------------
--2024-08-12 02:26:31.781119 UTC---
| Itration            | 416       |
| PAGAR Loss          | -1.68e+04 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 2080000   |
| Running Forward KL  | 8.49      |
| Running Reverse KL  | 5.28      |
| Running Update Time | 416       |
-----------------------------------
--2024-08-12 02:28:58.006114 UTC---
| Itration            | 417       |
| PAGAR Loss          | 2.22e+04  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2085000   |
| Running Forward KL  | 8.67      |
| Running Reverse KL  | 5.76      |
| Running Update Time | 417       |
-----------------------------------
--2024-08-12 02:31:21.877587 UTC---
| Itration            | 418       |
| PAGAR Loss          | -1.51e+04 |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.49e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 2090000   |
| Running Forward KL  | 7.92      |
| Running Reverse KL  | 4.56      |
| Running Update Time | 418       |
-----------------------------------
--2024-08-12 02:33:48.420787 UTC--
| Itration            | 419      |
| PAGAR Loss          | 3.18e+04 |
| Real Det Return     | 5.17e+03 |
| Real Sto Return     | 4.91e+03 |
| Reward Loss         | -1e+06   |
| Running Env Steps   | 2095000  |
| Running Forward KL  | 7.64     |
| Running Reverse KL  | 5.48     |
| Running Update Time | 419      |
----------------------------------
--2024-08-12 02:36:14.221535 UTC---
| Itration            | 420       |
| PAGAR Loss          | -1.15e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -9.06e+05 |
| Running Env Steps   | 2100000   |
| Running Forward KL  | 8.3       |
| Running Reverse KL  | 5.64      |
| Running Update Time | 420       |
-----------------------------------
--2024-08-12 02:38:40.628946 UTC---
| Itration            | 421       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -2.87e+06 |
| Running Env Steps   | 2105000   |
| Running Forward KL  | 9.17      |
| Running Reverse KL  | 117       |
| Running Update Time | 421       |
-----------------------------------
--2024-08-12 02:41:06.122553 UTC---
| Itration            | 422       |
| PAGAR Loss          | -6.94e+04 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.33e+03  |
| Reward Loss         | -2.72e+06 |
| Running Env Steps   | 2110000   |
| Running Forward KL  | 9.04      |
| Running Reverse KL  | 94.5      |
| Running Update Time | 422       |
-----------------------------------
--2024-08-12 02:43:31.185813 UTC---
| Itration            | 423       |
| PAGAR Loss          | 1.16e+04  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -5.43e+05 |
| Running Env Steps   | 2115000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 4.9       |
| Running Update Time | 423       |
-----------------------------------
--2024-08-12 02:45:58.620501 UTC---
| Itration            | 424       |
| PAGAR Loss          | -1.04e+04 |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 2120000   |
| Running Forward KL  | 9.13      |
| Running Reverse KL  | 6.65      |
| Running Update Time | 424       |
-----------------------------------
--2024-08-12 02:48:24.199542 UTC---
| Itration            | 425       |
| PAGAR Loss          | -3.77e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 2125000   |
| Running Forward KL  | 7.84      |
| Running Reverse KL  | 5.28      |
| Running Update Time | 425       |
-----------------------------------
--2024-08-12 02:50:49.118379 UTC---
| Itration            | 426       |
| PAGAR Loss          | -1.27e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 2130000   |
| Running Forward KL  | 7.68      |
| Running Reverse KL  | 4.62      |
| Running Update Time | 426       |
-----------------------------------
--2024-08-12 02:53:15.082836 UTC---
| Itration            | 427       |
| PAGAR Loss          | 1.53e+04  |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 2135000   |
| Running Forward KL  | 8.17      |
| Running Reverse KL  | 5.41      |
| Running Update Time | 427       |
-----------------------------------
--2024-08-12 02:55:40.564467 UTC---
| Itration            | 428       |
| PAGAR Loss          | -2.99e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2140000   |
| Running Forward KL  | 8.08      |
| Running Reverse KL  | 16.7      |
| Running Update Time | 428       |
-----------------------------------
--2024-08-12 02:58:06.410906 UTC---
| Itration            | 429       |
| PAGAR Loss          | 3.52e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -9.49e+05 |
| Running Env Steps   | 2145000   |
| Running Forward KL  | 8.4       |
| Running Reverse KL  | 6.07      |
| Running Update Time | 429       |
-----------------------------------
--2024-08-12 03:00:31.759780 UTC---
| Itration            | 430       |
| PAGAR Loss          | -1.09e+03 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -9.32e+05 |
| Running Env Steps   | 2150000   |
| Running Forward KL  | 8.69      |
| Running Reverse KL  | 6.38      |
| Running Update Time | 430       |
-----------------------------------
--2024-08-12 03:03:00.040463 UTC---
| Itration            | 431       |
| PAGAR Loss          | -7.76e+03 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -8.82e+05 |
| Running Env Steps   | 2155000   |
| Running Forward KL  | 8.22      |
| Running Reverse KL  | 25.8      |
| Running Update Time | 431       |
-----------------------------------
--2024-08-12 03:05:19.107359 UTC---
| Itration            | 432       |
| PAGAR Loss          | -3.17e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 3.94e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 2160000   |
| Running Forward KL  | 8.64      |
| Running Reverse KL  | 102       |
| Running Update Time | 432       |
-----------------------------------
--2024-08-12 03:07:46.128185 UTC---
| Itration            | 433       |
| PAGAR Loss          | 8.52e+05  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -9.22e+05 |
| Running Env Steps   | 2165000   |
| Running Forward KL  | 7.24      |
| Running Reverse KL  | 4.56      |
| Running Update Time | 433       |
-----------------------------------
--2024-08-12 03:10:12.481275 UTC---
| Itration            | 434       |
| PAGAR Loss          | 2.06e+04  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 2170000   |
| Running Forward KL  | 8.73      |
| Running Reverse KL  | 6.22      |
| Running Update Time | 434       |
-----------------------------------
--2024-08-12 03:12:37.249423 UTC---
| Itration            | 435       |
| PAGAR Loss          | -6.72e+03 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2175000   |
| Running Forward KL  | 8.07      |
| Running Reverse KL  | 16.5      |
| Running Update Time | 435       |
-----------------------------------
--2024-08-12 03:15:06.258246 UTC---
| Itration            | 436       |
| PAGAR Loss          | -3.24e+04 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -8.62e+05 |
| Running Env Steps   | 2180000   |
| Running Forward KL  | 7.9       |
| Running Reverse KL  | 5.06      |
| Running Update Time | 436       |
-----------------------------------
--2024-08-12 03:17:30.008994 UTC---
| Itration            | 437       |
| PAGAR Loss          | 7.04e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -8.01e+05 |
| Running Env Steps   | 2185000   |
| Running Forward KL  | 8.08      |
| Running Reverse KL  | 4.79      |
| Running Update Time | 437       |
-----------------------------------
--2024-08-12 03:19:56.557111 UTC--
| Itration            | 438      |
| PAGAR Loss          | 5.85e+03 |
| Real Det Return     | 5.35e+03 |
| Real Sto Return     | 5.03e+03 |
| Reward Loss         | -8.7e+05 |
| Running Env Steps   | 2190000  |
| Running Forward KL  | 7.79     |
| Running Reverse KL  | 4.51     |
| Running Update Time | 438      |
----------------------------------
--2024-08-12 03:22:20.463736 UTC---
| Itration            | 439       |
| PAGAR Loss          | -1.72e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -8.51e+05 |
| Running Env Steps   | 2195000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 22.7      |
| Running Update Time | 439       |
-----------------------------------
--2024-08-12 03:24:13.570579 UTC---
| Itration            | 440       |
| PAGAR Loss          | -2.32e+06 |
| Real Det Return     | 218       |
| Real Sto Return     | 224       |
| Reward Loss         | -6.34e+06 |
| Running Env Steps   | 2200000   |
| Running Forward KL  | 22.4      |
| Running Reverse KL  | 394       |
| Running Update Time | 440       |
-----------------------------------
--2024-08-12 03:26:38.293878 UTC---
| Itration            | 441       |
| PAGAR Loss          | 3.37e+04  |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -1.72e+06 |
| Running Env Steps   | 2205000   |
| Running Forward KL  | 8.93      |
| Running Reverse KL  | 5.39      |
| Running Update Time | 441       |
-----------------------------------
--2024-08-12 03:29:04.787311 UTC---
| Itration            | 442       |
| PAGAR Loss          | -1.49e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -9.32e+05 |
| Running Env Steps   | 2210000   |
| Running Forward KL  | 7.81      |
| Running Reverse KL  | 5.02      |
| Running Update Time | 442       |
-----------------------------------
--2024-08-12 03:31:30.823957 UTC---
| Itration            | 443       |
| PAGAR Loss          | 6.05e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -3.83e+05 |
| Running Env Steps   | 2215000   |
| Running Forward KL  | 7.7       |
| Running Reverse KL  | 4.9       |
| Running Update Time | 443       |
-----------------------------------
--2024-08-12 03:33:56.651755 UTC---
| Itration            | 444       |
| PAGAR Loss          | -5.74e+04 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -8.43e+05 |
| Running Env Steps   | 2220000   |
| Running Forward KL  | 7.54      |
| Running Reverse KL  | 42.6      |
| Running Update Time | 444       |
-----------------------------------
--2024-08-12 03:36:23.807974 UTC---
| Itration            | 445       |
| PAGAR Loss          | 3.23e+04  |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 2225000   |
| Running Forward KL  | 7.25      |
| Running Reverse KL  | 4.26      |
| Running Update Time | 445       |
-----------------------------------
--2024-08-12 03:38:51.027222 UTC---
| Itration            | 446       |
| PAGAR Loss          | 5.42e+04  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -3.77e+05 |
| Running Env Steps   | 2230000   |
| Running Forward KL  | 7.43      |
| Running Reverse KL  | 4.48      |
| Running Update Time | 446       |
-----------------------------------
--2024-08-12 03:41:17.877516 UTC---
| Itration            | 447       |
| PAGAR Loss          | 1.12e+04  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 2235000   |
| Running Forward KL  | 8.11      |
| Running Reverse KL  | 4.76      |
| Running Update Time | 447       |
-----------------------------------
--2024-08-12 03:43:44.920476 UTC---
| Itration            | 448       |
| PAGAR Loss          | 6.74e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -9.21e+05 |
| Running Env Steps   | 2240000   |
| Running Forward KL  | 7.19      |
| Running Reverse KL  | 4.11      |
| Running Update Time | 448       |
-----------------------------------
--2024-08-12 03:46:11.380109 UTC---
| Itration            | 449       |
| PAGAR Loss          | 6.26e+04  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -9.03e+05 |
| Running Env Steps   | 2245000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 3.54      |
| Running Update Time | 449       |
-----------------------------------
--2024-08-12 03:48:36.612600 UTC---
| Itration            | 450       |
| PAGAR Loss          | 7.35e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -5.37e+05 |
| Running Env Steps   | 2250000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 3.57      |
| Running Update Time | 450       |
-----------------------------------
--2024-08-12 03:51:03.860292 UTC---
| Itration            | 451       |
| PAGAR Loss          | -2.82e+04 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -8.69e+05 |
| Running Env Steps   | 2255000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 2.79      |
| Running Update Time | 451       |
-----------------------------------
--2024-08-12 03:53:27.688244 UTC---
| Itration            | 452       |
| PAGAR Loss          | -9.99e+03 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -6.78e+05 |
| Running Env Steps   | 2260000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 3.45      |
| Running Update Time | 452       |
-----------------------------------
--2024-08-12 03:55:55.753013 UTC---
| Itration            | 453       |
| PAGAR Loss          | 3.07e+05  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -7.47e+05 |
| Running Env Steps   | 2265000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 3.1       |
| Running Update Time | 453       |
-----------------------------------
--2024-08-12 03:58:18.999896 UTC---
| Itration            | 454       |
| PAGAR Loss          | -1.1e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -7.73e+05 |
| Running Env Steps   | 2270000   |
| Running Forward KL  | 7.21      |
| Running Reverse KL  | 3.75      |
| Running Update Time | 454       |
-----------------------------------
--2024-08-12 04:00:47.907416 UTC---
| Itration            | 455       |
| PAGAR Loss          | 1.48e+04  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -6.79e+05 |
| Running Env Steps   | 2275000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.04      |
| Running Update Time | 455       |
-----------------------------------
--2024-08-12 04:03:11.624100 UTC---
| Itration            | 456       |
| PAGAR Loss          | -2.15e+04 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -2.03e+05 |
| Running Env Steps   | 2280000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 456       |
-----------------------------------
--2024-08-12 04:05:40.709244 UTC---
| Itration            | 457       |
| PAGAR Loss          | 1.44e+04  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.01e+05 |
| Running Env Steps   | 2285000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 8.94      |
| Running Update Time | 457       |
-----------------------------------
--2024-08-12 04:08:10.484894 UTC--
| Itration            | 458      |
| PAGAR Loss          | 2.75e+04 |
| Real Det Return     | 5.24e+03 |
| Real Sto Return     | 5.14e+03 |
| Reward Loss         | -7.2e+05 |
| Running Env Steps   | 2290000  |
| Running Forward KL  | 5.96     |
| Running Reverse KL  | 3.06     |
| Running Update Time | 458      |
----------------------------------
--2024-08-12 04:10:36.449955 UTC---
| Itration            | 459       |
| PAGAR Loss          | 3.76e+04  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -8.05e+05 |
| Running Env Steps   | 2295000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 3.77      |
| Running Update Time | 459       |
-----------------------------------
--2024-08-12 04:13:03.054107 UTC---
| Itration            | 460       |
| PAGAR Loss          | 8.82e+03  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -8.71e+05 |
| Running Env Steps   | 2300000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 460       |
-----------------------------------
--2024-08-12 04:15:28.698727 UTC---
| Itration            | 461       |
| PAGAR Loss          | -1.46e+05 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -6.54e+05 |
| Running Env Steps   | 2305000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 24.1      |
| Running Update Time | 461       |
-----------------------------------
--2024-08-12 04:17:56.642868 UTC---
| Itration            | 462       |
| PAGAR Loss          | -2.92e+04 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -4.8e+05  |
| Running Env Steps   | 2310000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 2.66      |
| Running Update Time | 462       |
-----------------------------------
--2024-08-12 04:20:25.282856 UTC---
| Itration            | 463       |
| PAGAR Loss          | -4.27e+05 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -3.43e+05 |
| Running Env Steps   | 2315000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 2.38      |
| Running Update Time | 463       |
-----------------------------------
--2024-08-12 04:22:52.038645 UTC---
| Itration            | 464       |
| PAGAR Loss          | 5.91e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.57e+05 |
| Running Env Steps   | 2320000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 464       |
-----------------------------------
--2024-08-12 04:25:17.697737 UTC---
| Itration            | 465       |
| PAGAR Loss          | 633       |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -1.87e+05 |
| Running Env Steps   | 2325000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 2.75      |
| Running Update Time | 465       |
-----------------------------------
--2024-08-12 04:27:41.254493 UTC---
| Itration            | 466       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -5.66e+05 |
| Running Env Steps   | 2330000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 19.2      |
| Running Update Time | 466       |
-----------------------------------
--2024-08-12 04:30:09.899819 UTC---
| Itration            | 467       |
| PAGAR Loss          | 3.62e+04  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -3.08e+05 |
| Running Env Steps   | 2335000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.25      |
| Running Update Time | 467       |
-----------------------------------
--2024-08-12 04:32:36.205729 UTC--
| Itration            | 468      |
| PAGAR Loss          | 1.75e+04 |
| Real Det Return     | 5.43e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | -3.6e+05 |
| Running Env Steps   | 2340000  |
| Running Forward KL  | 6.33     |
| Running Reverse KL  | 3.35     |
| Running Update Time | 468      |
----------------------------------
--2024-08-12 04:35:02.475106 UTC---
| Itration            | 469       |
| PAGAR Loss          | 1.73e+06  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -4.04e+05 |
| Running Env Steps   | 2345000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 34.7      |
| Running Update Time | 469       |
-----------------------------------
--2024-08-12 04:37:30.348588 UTC--
| Itration            | 470      |
| PAGAR Loss          | 1.33e+05 |
| Real Det Return     | 5.35e+03 |
| Real Sto Return     | 5.29e+03 |
| Reward Loss         | -5.4e+05 |
| Running Env Steps   | 2350000  |
| Running Forward KL  | 6.39     |
| Running Reverse KL  | 3.81     |
| Running Update Time | 470      |
----------------------------------
--2024-08-12 04:39:56.739908 UTC--
| Itration            | 471      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.21e+03 |
| Real Sto Return     | 5.18e+03 |
| Reward Loss         | -6e+05   |
| Running Env Steps   | 2355000  |
| Running Forward KL  | 5.8      |
| Running Reverse KL  | 2.61     |
| Running Update Time | 471      |
----------------------------------
--2024-08-12 04:42:25.323965 UTC---
| Itration            | 472       |
| PAGAR Loss          | 1.92e+03  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -4.98e+05 |
| Running Env Steps   | 2360000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.24      |
| Running Update Time | 472       |
-----------------------------------
--2024-08-12 04:44:50.041212 UTC---
| Itration            | 473       |
| PAGAR Loss          | -3.52e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -3.61e+05 |
| Running Env Steps   | 2365000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 23.8      |
| Running Update Time | 473       |
-----------------------------------
--2024-08-12 04:47:18.186242 UTC---
| Itration            | 474       |
| PAGAR Loss          | 4.15e+04  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -4.06e+05 |
| Running Env Steps   | 2370000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 27.8      |
| Running Update Time | 474       |
-----------------------------------
--2024-08-12 04:50:04.395727 UTC---
| Itration            | 475       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -6.43e+05 |
| Running Env Steps   | 2375000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 2.37      |
| Running Update Time | 475       |
-----------------------------------
--2024-08-12 04:52:55.135590 UTC---
| Itration            | 476       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -2.38e+05 |
| Running Env Steps   | 2380000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 2.32      |
| Running Update Time | 476       |
-----------------------------------
--2024-08-12 04:55:46.859094 UTC---
| Itration            | 477       |
| PAGAR Loss          | -1.08e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -4.22e+05 |
| Running Env Steps   | 2385000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 2.48      |
| Running Update Time | 477       |
-----------------------------------
--2024-08-12 04:58:37.237284 UTC--
| Itration            | 478      |
| PAGAR Loss          | 3.7e+04  |
| Real Det Return     | 5.42e+03 |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | -2.1e+05 |
| Running Env Steps   | 2390000  |
| Running Forward KL  | 5.85     |
| Running Reverse KL  | 2.8      |
| Running Update Time | 478      |
----------------------------------
--2024-08-12 05:01:30.009909 UTC---
| Itration            | 479       |
| PAGAR Loss          | 5.6e+04   |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -7.93e+05 |
| Running Env Steps   | 2395000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 479       |
-----------------------------------
--2024-08-12 05:04:21.800162 UTC---
| Itration            | 480       |
| PAGAR Loss          | 2.95e+07  |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -9.86e+05 |
| Running Env Steps   | 2400000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 3.56      |
| Running Update Time | 480       |
-----------------------------------
--2024-08-12 05:07:13.359940 UTC---
| Itration            | 481       |
| PAGAR Loss          | -9.03e+03 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -4.97e+05 |
| Running Env Steps   | 2405000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 23.7      |
| Running Update Time | 481       |
-----------------------------------
--2024-08-12 05:10:02.767878 UTC---
| Itration            | 482       |
| PAGAR Loss          | 2.6e+04   |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -6.34e+05 |
| Running Env Steps   | 2410000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 2.61      |
| Running Update Time | 482       |
-----------------------------------
--2024-08-12 05:12:53.266606 UTC--
| Itration            | 483      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 4.49e+03 |
| Reward Loss         | -6.4e+05 |
| Running Env Steps   | 2415000  |
| Running Forward KL  | 6.49     |
| Running Reverse KL  | 48.8     |
| Running Update Time | 483      |
----------------------------------
--2024-08-12 05:15:45.999736 UTC---
| Itration            | 484       |
| PAGAR Loss          | 1.07e+05  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -5.99e+05 |
| Running Env Steps   | 2420000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 2.38      |
| Running Update Time | 484       |
-----------------------------------
--2024-08-12 05:18:36.750063 UTC---
| Itration            | 485       |
| PAGAR Loss          | -7.37e+03 |
| Real Det Return     | 5.02e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -3.66e+05 |
| Running Env Steps   | 2425000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 2.98      |
| Running Update Time | 485       |
-----------------------------------
--2024-08-12 05:21:29.658468 UTC---
| Itration            | 486       |
| PAGAR Loss          | 6.12e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.31e+05 |
| Running Env Steps   | 2430000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 2.33      |
| Running Update Time | 486       |
-----------------------------------
--2024-08-12 05:24:20.310028 UTC---
| Itration            | 487       |
| PAGAR Loss          | -4.25e+04 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -2.73e+05 |
| Running Env Steps   | 2435000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 1.97      |
| Running Update Time | 487       |
-----------------------------------
--2024-08-12 05:27:11.693877 UTC---
| Itration            | 488       |
| PAGAR Loss          | 1.1e+06   |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -6.35e+05 |
| Running Env Steps   | 2440000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 488       |
-----------------------------------
--2024-08-12 05:30:02.859046 UTC---
| Itration            | 489       |
| PAGAR Loss          | 6.68e+03  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -5.76e+05 |
| Running Env Steps   | 2445000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 489       |
-----------------------------------
--2024-08-12 05:32:56.172164 UTC---
| Itration            | 490       |
| PAGAR Loss          | -1.33e+05 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -3.12e+05 |
| Running Env Steps   | 2450000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 2.43      |
| Running Update Time | 490       |
-----------------------------------
--2024-08-12 05:35:50.039367 UTC---
| Itration            | 491       |
| PAGAR Loss          | -1.54e+05 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 2455000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 39.7      |
| Running Update Time | 491       |
-----------------------------------
--2024-08-12 05:38:45.731122 UTC---
| Itration            | 492       |
| PAGAR Loss          | 1.07e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -4.23e+05 |
| Running Env Steps   | 2460000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 2.48      |
| Running Update Time | 492       |
-----------------------------------
--2024-08-12 05:41:40.327017 UTC---
| Itration            | 493       |
| PAGAR Loss          | -2.42e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -3.17e+05 |
| Running Env Steps   | 2465000   |
| Running Forward KL  | 6.7       |
| Running Reverse KL  | 25.7      |
| Running Update Time | 493       |
-----------------------------------
--2024-08-12 05:44:22.780502 UTC---
| Itration            | 494       |
| PAGAR Loss          | -4.37e+04 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -3.99e+05 |
| Running Env Steps   | 2470000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 3.26      |
| Running Update Time | 494       |
-----------------------------------
--2024-08-12 05:46:49.408809 UTC---
| Itration            | 495       |
| PAGAR Loss          | 2.79e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -3.36e+05 |
| Running Env Steps   | 2475000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 2.42      |
| Running Update Time | 495       |
-----------------------------------
--2024-08-12 05:49:16.988380 UTC---
| Itration            | 496       |
| PAGAR Loss          | 5.66e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.75e+05 |
| Running Env Steps   | 2480000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 496       |
-----------------------------------
--2024-08-12 05:51:39.033739 UTC---
| Itration            | 497       |
| PAGAR Loss          | 3.62e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -3.53e+05 |
| Running Env Steps   | 2485000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 2.95      |
| Running Update Time | 497       |
-----------------------------------
--2024-08-12 05:54:05.433781 UTC---
| Itration            | 498       |
| PAGAR Loss          | 1.96e+03  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -5.95e+05 |
| Running Env Steps   | 2490000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 2.86      |
| Running Update Time | 498       |
-----------------------------------
--2024-08-12 05:56:30.696848 UTC---
| Itration            | 499       |
| PAGAR Loss          | -2.18e+08 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -2.44e+05 |
| Running Env Steps   | 2495000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 12.7      |
| Running Update Time | 499       |
-----------------------------------
--2024-08-12 05:58:55.712991 UTC---
| Itration            | 500       |
| PAGAR Loss          | -1.89e+06 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 2500000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 40        |
| Running Update Time | 500       |
-----------------------------------
--2024-08-12 06:01:24.046664 UTC---
| Itration            | 501       |
| PAGAR Loss          | -8.74e+04 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 2505000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 33.7      |
| Running Update Time | 501       |
-----------------------------------
--2024-08-12 06:03:48.837655 UTC--
| Itration            | 502      |
| PAGAR Loss          | 7.36e+03 |
| Real Det Return     | 5.25e+03 |
| Real Sto Return     | 4.93e+03 |
| Reward Loss         | -4.3e+05 |
| Running Env Steps   | 2510000  |
| Running Forward KL  | 5.92     |
| Running Reverse KL  | 2.8      |
| Running Update Time | 502      |
----------------------------------
--2024-08-12 06:06:17.056340 UTC---
| Itration            | 503       |
| PAGAR Loss          | -4.55e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -2e+05    |
| Running Env Steps   | 2515000   |
| Running Forward KL  | 5.42      |
| Running Reverse KL  | 2.07      |
| Running Update Time | 503       |
-----------------------------------
--2024-08-12 06:08:41.193546 UTC---
| Itration            | 504       |
| PAGAR Loss          | -1.56e+05 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.99e+05 |
| Running Env Steps   | 2520000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 6.93      |
| Running Update Time | 504       |
-----------------------------------
--2024-08-12 06:11:04.297481 UTC--
| Itration            | 505      |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.28e+03 |
| Real Sto Return     | 3.53e+03 |
| Reward Loss         | -2e+06   |
| Running Env Steps   | 2525000  |
| Running Forward KL  | 6.45     |
| Running Reverse KL  | 76.3     |
| Running Update Time | 505      |
----------------------------------
--2024-08-12 06:13:29.031693 UTC---
| Itration            | 506       |
| PAGAR Loss          | -3.08e+05 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 2530000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 41.1      |
| Running Update Time | 506       |
-----------------------------------
--2024-08-12 06:15:55.372811 UTC---
| Itration            | 507       |
| PAGAR Loss          | 7.88e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -2.38e+05 |
| Running Env Steps   | 2535000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 507       |
-----------------------------------
--2024-08-12 06:18:22.299541 UTC---
| Itration            | 508       |
| PAGAR Loss          | -7.79e+04 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -5.25e+05 |
| Running Env Steps   | 2540000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 2.46      |
| Running Update Time | 508       |
-----------------------------------
--2024-08-12 06:20:49.028882 UTC---
| Itration            | 509       |
| PAGAR Loss          | -3.99e+06 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -2.8e+05  |
| Running Env Steps   | 2545000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 9.58      |
| Running Update Time | 509       |
-----------------------------------
--2024-08-12 06:23:15.763758 UTC---
| Itration            | 510       |
| PAGAR Loss          | -4.6e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -7.08e+05 |
| Running Env Steps   | 2550000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 40.8      |
| Running Update Time | 510       |
-----------------------------------
--2024-08-12 06:25:40.074411 UTC---
| Itration            | 511       |
| PAGAR Loss          | -1.08e+08 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.45e+06 |
| Running Env Steps   | 2555000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 98.7      |
| Running Update Time | 511       |
-----------------------------------
--2024-08-12 06:28:08.320235 UTC---
| Itration            | 512       |
| PAGAR Loss          | -7.29e+08 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -6.15e+05 |
| Running Env Steps   | 2560000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 18.1      |
| Running Update Time | 512       |
-----------------------------------
--2024-08-12 06:30:33.271733 UTC---
| Itration            | 513       |
| PAGAR Loss          | -1.92e+10 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -7.23e+05 |
| Running Env Steps   | 2565000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 513       |
-----------------------------------
--2024-08-12 06:33:00.463242 UTC---
| Itration            | 514       |
| PAGAR Loss          | 1.86e+06  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -5.09e+05 |
| Running Env Steps   | 2570000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 2.66      |
| Running Update Time | 514       |
-----------------------------------
--2024-08-12 06:35:28.590498 UTC---
| Itration            | 515       |
| PAGAR Loss          | -9.03e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -5.28e+05 |
| Running Env Steps   | 2575000   |
| Running Forward KL  | 5.89      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 515       |
-----------------------------------
--2024-08-12 06:37:57.486361 UTC---
| Itration            | 516       |
| PAGAR Loss          | 3.13e+06  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -2.12e+05 |
| Running Env Steps   | 2580000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 3.2       |
| Running Update Time | 516       |
-----------------------------------
--2024-08-12 06:40:25.251208 UTC---
| Itration            | 517       |
| PAGAR Loss          | -1.75e+06 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -9.51e+05 |
| Running Env Steps   | 2585000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 31.4      |
| Running Update Time | 517       |
-----------------------------------
--2024-08-12 06:42:51.214401 UTC---
| Itration            | 518       |
| PAGAR Loss          | 3.89e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -2.68e+05 |
| Running Env Steps   | 2590000   |
| Running Forward KL  | 5.27      |
| Running Reverse KL  | 2.1       |
| Running Update Time | 518       |
-----------------------------------
--2024-08-12 06:45:17.015390 UTC---
| Itration            | 519       |
| PAGAR Loss          | -1.52e+07 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.2e+03   |
| Reward Loss         | 4.28e+05  |
| Running Env Steps   | 2595000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 36        |
| Running Update Time | 519       |
-----------------------------------
--2024-08-12 06:47:42.629028 UTC---
| Itration            | 520       |
| PAGAR Loss          | -4.69e+04 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -6.45e+04 |
| Running Env Steps   | 2600000   |
| Running Forward KL  | 5.61      |
| Running Reverse KL  | 1.98      |
| Running Update Time | 520       |
-----------------------------------
--2024-08-12 06:50:11.003021 UTC---
| Itration            | 521       |
| PAGAR Loss          | 1.7e+05   |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.31e+05 |
| Running Env Steps   | 2605000   |
| Running Forward KL  | 5.88      |
| Running Reverse KL  | 2.66      |
| Running Update Time | 521       |
-----------------------------------
--2024-08-12 06:52:35.126828 UTC---
| Itration            | 522       |
| PAGAR Loss          | -6.32e+06 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -1.6e+06  |
| Running Env Steps   | 2610000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 77.3      |
| Running Update Time | 522       |
-----------------------------------
--2024-08-12 06:55:01.421730 UTC---
| Itration            | 523       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -3.08e+05 |
| Running Env Steps   | 2615000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 3.41      |
| Running Update Time | 523       |
-----------------------------------
--2024-08-12 06:57:26.138702 UTC---
| Itration            | 524       |
| PAGAR Loss          | 6.43e+06  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.4e+03   |
| Reward Loss         | -4.22e+05 |
| Running Env Steps   | 2620000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 2.18      |
| Running Update Time | 524       |
-----------------------------------
--2024-08-12 06:59:52.152078 UTC---
| Itration            | 525       |
| PAGAR Loss          | 4.45e+07  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -4.81e+05 |
| Running Env Steps   | 2625000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 525       |
-----------------------------------
--2024-08-12 07:02:16.083108 UTC---
| Itration            | 526       |
| PAGAR Loss          | 2.07e+04  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -3.33e+05 |
| Running Env Steps   | 2630000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 2.3       |
| Running Update Time | 526       |
-----------------------------------
--2024-08-12 07:04:42.570227 UTC---
| Itration            | 527       |
| PAGAR Loss          | 1.62e+04  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.88e+03  |
| Reward Loss         | -6.24e+05 |
| Running Env Steps   | 2635000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 2.35      |
| Running Update Time | 527       |
-----------------------------------
--2024-08-12 07:07:02.992770 UTC---
| Itration            | 528       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.37e+03  |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -5.25e+06 |
| Running Env Steps   | 2640000   |
| Running Forward KL  | 6.94      |
| Running Reverse KL  | 117       |
| Running Update Time | 528       |
-----------------------------------
--2024-08-12 07:09:07.140100 UTC---
| Itration            | 529       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.48e+03  |
| Real Sto Return     | 1.93e+03  |
| Reward Loss         | -7.92e+06 |
| Running Env Steps   | 2645000   |
| Running Forward KL  | 9.33      |
| Running Reverse KL  | 272       |
| Running Update Time | 529       |
-----------------------------------
--2024-08-12 07:11:32.880514 UTC---
| Itration            | 530       |
| PAGAR Loss          | -1.49e+04 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -3.75e+05 |
| Running Env Steps   | 2650000   |
| Running Forward KL  | 5.23      |
| Running Reverse KL  | 2.27      |
| Running Update Time | 530       |
-----------------------------------
--2024-08-12 07:14:01.544875 UTC---
| Itration            | 531       |
| PAGAR Loss          | 1.24e+06  |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -6.93e+05 |
| Running Env Steps   | 2655000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 2.35      |
| Running Update Time | 531       |
-----------------------------------
--2024-08-12 07:16:26.302345 UTC---
| Itration            | 532       |
| PAGAR Loss          | -1.37e+06 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 2660000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 35.8      |
| Running Update Time | 532       |
-----------------------------------
--2024-08-12 07:18:56.355003 UTC---
| Itration            | 533       |
| PAGAR Loss          | 180       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -6.01e+05 |
| Running Env Steps   | 2665000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 1.86      |
| Running Update Time | 533       |
-----------------------------------
--2024-08-12 07:21:37.820118 UTC---
| Itration            | 534       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 2670000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 38.2      |
| Running Update Time | 534       |
-----------------------------------
--2024-08-12 07:24:33.472077 UTC---
| Itration            | 535       |
| PAGAR Loss          | -1.46e+07 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.2e+06  |
| Running Env Steps   | 2675000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 38.2      |
| Running Update Time | 535       |
-----------------------------------
--2024-08-12 07:27:30.252633 UTC---
| Itration            | 536       |
| PAGAR Loss          | 2.29e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -5.03e+05 |
| Running Env Steps   | 2680000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 2.47      |
| Running Update Time | 536       |
-----------------------------------
--2024-08-12 07:30:19.492742 UTC---
| Itration            | 537       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 2685000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 68.1      |
| Running Update Time | 537       |
-----------------------------------
--2024-08-12 07:33:21.291518 UTC---
| Itration            | 538       |
| PAGAR Loss          | -8.9e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -2.08e+05 |
| Running Env Steps   | 2690000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 2.32      |
| Running Update Time | 538       |
-----------------------------------
--2024-08-12 07:36:04.929466 UTC---
| Itration            | 539       |
| PAGAR Loss          | 3.41e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.41e+03  |
| Reward Loss         | -6.71e+05 |
| Running Env Steps   | 2695000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 2.73      |
| Running Update Time | 539       |
-----------------------------------
--2024-08-12 07:39:08.113123 UTC---
| Itration            | 540       |
| PAGAR Loss          | -1.35e+04 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -5.63e+05 |
| Running Env Steps   | 2700000   |
| Running Forward KL  | 5.86      |
| Running Reverse KL  | 2.82      |
| Running Update Time | 540       |
-----------------------------------
--2024-08-12 07:41:51.109448 UTC---
| Itration            | 541       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -4.16e+05 |
| Running Env Steps   | 2705000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 2.71      |
| Running Update Time | 541       |
-----------------------------------
--2024-08-12 07:44:22.629801 UTC---
| Itration            | 542       |
| PAGAR Loss          | -1.68e+06 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 2710000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 36.6      |
| Running Update Time | 542       |
-----------------------------------
--2024-08-12 07:46:49.189998 UTC---
| Itration            | 543       |
| PAGAR Loss          | -1.85e+04 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -4.62e+05 |
| Running Env Steps   | 2715000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 2.6       |
| Running Update Time | 543       |
-----------------------------------
--2024-08-12 07:49:14.960486 UTC---
| Itration            | 544       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -5.58e+05 |
| Running Env Steps   | 2720000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 36.4      |
| Running Update Time | 544       |
-----------------------------------
--2024-08-12 07:51:41.514283 UTC---
| Itration            | 545       |
| PAGAR Loss          | 6.04e+04  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -4.66e+05 |
| Running Env Steps   | 2725000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 545       |
-----------------------------------
--2024-08-12 07:54:06.782138 UTC---
| Itration            | 546       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -9.79e+05 |
| Running Env Steps   | 2730000   |
| Running Forward KL  | 5.16      |
| Running Reverse KL  | 2.5       |
| Running Update Time | 546       |
-----------------------------------
--2024-08-12 07:56:34.487612 UTC---
| Itration            | 547       |
| PAGAR Loss          | -2.12e+06 |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -3.21e+06 |
| Running Env Steps   | 2735000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 97.9      |
| Running Update Time | 547       |
-----------------------------------
--2024-08-12 07:59:29.203721 UTC---
| Itration            | 548       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.01e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 2740000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 2.9       |
| Running Update Time | 548       |
-----------------------------------
--2024-08-12 08:02:24.574114 UTC---
| Itration            | 549       |
| PAGAR Loss          | -3.28e+04 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -5.43e+05 |
| Running Env Steps   | 2745000   |
| Running Forward KL  | 7.06      |
| Running Reverse KL  | 3.58      |
| Running Update Time | 549       |
-----------------------------------
--2024-08-12 08:05:05.931590 UTC---
| Itration            | 550       |
| PAGAR Loss          | 5.01e+03  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -2.14e+05 |
| Running Env Steps   | 2750000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 2.1       |
| Running Update Time | 550       |
-----------------------------------
--2024-08-12 08:07:42.546079 UTC---
| Itration            | 551       |
| PAGAR Loss          | -1.9e+07  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.63e+06 |
| Running Env Steps   | 2755000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 35.2      |
| Running Update Time | 551       |
-----------------------------------
--2024-08-12 08:10:41.032599 UTC---
| Itration            | 552       |
| PAGAR Loss          | -1.53e+05 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -7.94e+05 |
| Running Env Steps   | 2760000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 33.2      |
| Running Update Time | 552       |
-----------------------------------
--2024-08-12 08:13:50.400213 UTC---
| Itration            | 553       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.86e+05 |
| Running Env Steps   | 2765000   |
| Running Forward KL  | 5.1       |
| Running Reverse KL  | 1.82      |
| Running Update Time | 553       |
-----------------------------------
--2024-08-12 08:16:56.983887 UTC---
| Itration            | 554       |
| PAGAR Loss          | 6.84e+03  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -5.19e+05 |
| Running Env Steps   | 2770000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 2.6       |
| Running Update Time | 554       |
-----------------------------------
--2024-08-12 08:20:02.746615 UTC---
| Itration            | 555       |
| PAGAR Loss          | 1.33e+04  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -3.92e+05 |
| Running Env Steps   | 2775000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 2.35      |
| Running Update Time | 555       |
-----------------------------------
--2024-08-12 08:23:13.145496 UTC---
| Itration            | 556       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -7.33e+05 |
| Running Env Steps   | 2780000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.33      |
| Running Update Time | 556       |
-----------------------------------
--2024-08-12 08:26:23.369376 UTC---
| Itration            | 557       |
| PAGAR Loss          | -7.87e+04 |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 2785000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 40.6      |
| Running Update Time | 557       |
-----------------------------------
--2024-08-12 08:29:36.450655 UTC---
| Itration            | 558       |
| PAGAR Loss          | -5.48e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -6.96e+05 |
| Running Env Steps   | 2790000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 2.67      |
| Running Update Time | 558       |
-----------------------------------
--2024-08-12 08:32:43.919791 UTC---
| Itration            | 559       |
| PAGAR Loss          | -6.25e+08 |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 3.93e+03  |
| Reward Loss         | -1.98e+06 |
| Running Env Steps   | 2795000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 35.6      |
| Running Update Time | 559       |
-----------------------------------
--2024-08-12 08:35:27.261759 UTC---
| Itration            | 560       |
| PAGAR Loss          | 2.56e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -2.53e+05 |
| Running Env Steps   | 2800000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 1.87      |
| Running Update Time | 560       |
-----------------------------------
--2024-08-12 08:38:06.967597 UTC---
| Itration            | 561       |
| PAGAR Loss          | -2.92e+05 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -2.13e+06 |
| Running Env Steps   | 2805000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 37.3      |
| Running Update Time | 561       |
-----------------------------------
--2024-08-12 08:40:22.852027 UTC---
| Itration            | 562       |
| PAGAR Loss          | nan       |
| Real Det Return     | 1.08e+03  |
| Real Sto Return     | 955       |
| Reward Loss         | -1.28e+07 |
| Running Env Steps   | 2810000   |
| Running Forward KL  | 14.6      |
| Running Reverse KL  | 280       |
| Running Update Time | 562       |
-----------------------------------
--2024-08-12 08:42:58.484643 UTC---
| Itration            | 563       |
| PAGAR Loss          | -1.71e+06 |
| Real Det Return     | 4.22e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -3.02e+06 |
| Running Env Steps   | 2815000   |
| Running Forward KL  | 7.24      |
| Running Reverse KL  | 81.7      |
| Running Update Time | 563       |
-----------------------------------
--2024-08-12 08:45:33.435523 UTC---
| Itration            | 564       |
| PAGAR Loss          | -2.17e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -6.21e+05 |
| Running Env Steps   | 2820000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 2.6       |
| Running Update Time | 564       |
-----------------------------------
--2024-08-12 08:48:02.103473 UTC---
| Itration            | 565       |
| PAGAR Loss          | 1.36e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -3.93e+05 |
| Running Env Steps   | 2825000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 2.01      |
| Running Update Time | 565       |
-----------------------------------
--2024-08-12 08:50:26.395023 UTC---
| Itration            | 566       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -3.84e+05 |
| Running Env Steps   | 2830000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 10.4      |
| Running Update Time | 566       |
-----------------------------------
--2024-08-12 08:52:54.722873 UTC---
| Itration            | 567       |
| PAGAR Loss          | -2.57e+05 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -4.12e+05 |
| Running Env Steps   | 2835000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 36.3      |
| Running Update Time | 567       |
-----------------------------------
--2024-08-12 08:55:22.827837 UTC---
| Itration            | 568       |
| PAGAR Loss          | -3.05e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -5.24e+05 |
| Running Env Steps   | 2840000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 3.97      |
| Running Update Time | 568       |
-----------------------------------
--2024-08-12 08:57:50.256995 UTC---
| Itration            | 569       |
| PAGAR Loss          | -3.23e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -3.76e+05 |
| Running Env Steps   | 2845000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 2.3       |
| Running Update Time | 569       |
-----------------------------------
--2024-08-12 09:00:18.434201 UTC---
| Itration            | 570       |
| PAGAR Loss          | -7.78e+03 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -5.34e+05 |
| Running Env Steps   | 2850000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 3.16      |
| Running Update Time | 570       |
-----------------------------------
--2024-08-12 09:02:42.379177 UTC---
| Itration            | 571       |
| PAGAR Loss          | -1.54e+06 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -6.35e+05 |
| Running Env Steps   | 2855000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 34.2      |
| Running Update Time | 571       |
-----------------------------------
--2024-08-12 09:05:09.337032 UTC---
| Itration            | 572       |
| PAGAR Loss          | 7.16e+03  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 2860000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 3.65      |
| Running Update Time | 572       |
-----------------------------------
--2024-08-12 09:07:35.278393 UTC---
| Itration            | 573       |
| PAGAR Loss          | -7.96e+04 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.15e+05 |
| Running Env Steps   | 2865000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 2.44      |
| Running Update Time | 573       |
-----------------------------------
--2024-08-12 09:10:01.510061 UTC---
| Itration            | 574       |
| PAGAR Loss          | 8.83e+05  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.49e+05 |
| Running Env Steps   | 2870000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 574       |
-----------------------------------
--2024-08-12 09:12:30.442734 UTC---
| Itration            | 575       |
| PAGAR Loss          | -1.4e+05  |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -5.06e+05 |
| Running Env Steps   | 2875000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 3.23      |
| Running Update Time | 575       |
-----------------------------------
--2024-08-12 09:14:57.601094 UTC---
| Itration            | 576       |
| PAGAR Loss          | 6.21e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -4.07e+05 |
| Running Env Steps   | 2880000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 576       |
-----------------------------------
--2024-08-12 09:17:26.963560 UTC---
| Itration            | 577       |
| PAGAR Loss          | -1.23e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -7.34e+05 |
| Running Env Steps   | 2885000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 577       |
-----------------------------------
--2024-08-12 09:19:52.444308 UTC---
| Itration            | 578       |
| PAGAR Loss          | 7.25e+04  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.23e+05 |
| Running Env Steps   | 2890000   |
| Running Forward KL  | 5.32      |
| Running Reverse KL  | 1.84      |
| Running Update Time | 578       |
-----------------------------------
--2024-08-12 09:22:19.092989 UTC---
| Itration            | 579       |
| PAGAR Loss          | -7.03e+03 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -6.56e+05 |
| Running Env Steps   | 2895000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 2.63      |
| Running Update Time | 579       |
-----------------------------------
--2024-08-12 09:24:46.160081 UTC--
| Itration            | 580      |
| PAGAR Loss          | 3.8e+03  |
| Real Det Return     | 5.3e+03  |
| Real Sto Return     | 5.25e+03 |
| Reward Loss         | -2.8e+05 |
| Running Env Steps   | 2900000  |
| Running Forward KL  | 5.79     |
| Running Reverse KL  | 3.06     |
| Running Update Time | 580      |
----------------------------------
--2024-08-12 09:27:11.635293 UTC---
| Itration            | 581       |
| PAGAR Loss          | -1.27e+05 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -2.58e+05 |
| Running Env Steps   | 2905000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 1.92      |
| Running Update Time | 581       |
-----------------------------------
--2024-08-12 09:29:40.622430 UTC---
| Itration            | 582       |
| PAGAR Loss          | -5.06e+04 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -4.42e+05 |
| Running Env Steps   | 2910000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 2.65      |
| Running Update Time | 582       |
-----------------------------------
--2024-08-12 09:32:07.402746 UTC---
| Itration            | 583       |
| PAGAR Loss          | 2.52e+03  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.91e+05 |
| Running Env Steps   | 2915000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 2.51      |
| Running Update Time | 583       |
-----------------------------------
--2024-08-12 09:34:33.772470 UTC---
| Itration            | 584       |
| PAGAR Loss          | -5.18e+04 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 2920000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 38.9      |
| Running Update Time | 584       |
-----------------------------------
--2024-08-12 09:36:58.189500 UTC---
| Itration            | 585       |
| PAGAR Loss          | -2.51e+04 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.15e+05 |
| Running Env Steps   | 2925000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 2.25      |
| Running Update Time | 585       |
-----------------------------------
--2024-08-12 09:39:27.371441 UTC---
| Itration            | 586       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -1.09e+05 |
| Running Env Steps   | 2930000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 2.11      |
| Running Update Time | 586       |
-----------------------------------
--2024-08-12 09:41:54.373550 UTC---
| Itration            | 587       |
| PAGAR Loss          | 4.27e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -8.75e+04 |
| Running Env Steps   | 2935000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 587       |
-----------------------------------
--2024-08-12 09:44:21.574370 UTC---
| Itration            | 588       |
| PAGAR Loss          | 6.24e+04  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -3.29e+05 |
| Running Env Steps   | 2940000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 3.61      |
| Running Update Time | 588       |
-----------------------------------
--2024-08-12 09:46:49.469101 UTC---
| Itration            | 589       |
| PAGAR Loss          | -1.68e+05 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.65e+05 |
| Running Env Steps   | 2945000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 589       |
-----------------------------------
--2024-08-12 09:49:14.436473 UTC---
| Itration            | 590       |
| PAGAR Loss          | 8.56e+04  |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -7.49e+05 |
| Running Env Steps   | 2950000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 590       |
-----------------------------------
--2024-08-12 09:51:42.586225 UTC---
| Itration            | 591       |
| PAGAR Loss          | -2.18e+04 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.4e+05  |
| Running Env Steps   | 2955000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 2.95      |
| Running Update Time | 591       |
-----------------------------------
--2024-08-12 09:54:07.323390 UTC---
| Itration            | 592       |
| PAGAR Loss          | -1.97e+04 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -3.54e+05 |
| Running Env Steps   | 2960000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 592       |
-----------------------------------
--2024-08-12 09:56:33.674025 UTC---
| Itration            | 593       |
| PAGAR Loss          | 1.31e+05  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.63e+05 |
| Running Env Steps   | 2965000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 2.94      |
| Running Update Time | 593       |
-----------------------------------
--2024-08-12 09:59:02.732291 UTC---
| Itration            | 594       |
| PAGAR Loss          | 1.15e+06  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.32e+05 |
| Running Env Steps   | 2970000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 2.41      |
| Running Update Time | 594       |
-----------------------------------
--2024-08-12 10:01:28.159152 UTC---
| Itration            | 595       |
| PAGAR Loss          | 2.56e+06  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -8.95e+05 |
| Running Env Steps   | 2975000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 595       |
-----------------------------------
--2024-08-12 10:03:57.450148 UTC---
| Itration            | 596       |
| PAGAR Loss          | 4.82e+03  |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -8.86e+05 |
| Running Env Steps   | 2980000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 3         |
| Running Update Time | 596       |
-----------------------------------
--2024-08-12 10:06:23.524072 UTC---
| Itration            | 597       |
| PAGAR Loss          | 5.32e+04  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -5.39e+05 |
| Running Env Steps   | 2985000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 597       |
-----------------------------------
--2024-08-12 10:08:51.597005 UTC---
| Itration            | 598       |
| PAGAR Loss          | 4.18e+05  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -2.84e+05 |
| Running Env Steps   | 2990000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 3.19      |
| Running Update Time | 598       |
-----------------------------------
--2024-08-12 10:11:19.047153 UTC---
| Itration            | 599       |
| PAGAR Loss          | 4.02e+04  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -6.13e+05 |
| Running Env Steps   | 2995000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 2.82      |
| Running Update Time | 599       |
-----------------------------------
--2024-08-12 10:13:46.045104 UTC---
| Itration            | 600       |
| PAGAR Loss          | -1.08e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -3.16e+05 |
| Running Env Steps   | 3000000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 600       |
-----------------------------------
--2024-08-12 10:16:14.475425 UTC---
| Itration            | 601       |
| PAGAR Loss          | -4.61e+04 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -2.42e+05 |
| Running Env Steps   | 3005000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 2.12      |
| Running Update Time | 601       |
-----------------------------------
--2024-08-12 10:18:39.930500 UTC---
| Itration            | 602       |
| PAGAR Loss          | -3.1e+06  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.05e+05 |
| Running Env Steps   | 3010000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 2.83      |
| Running Update Time | 602       |
-----------------------------------
--2024-08-12 10:21:07.493047 UTC--
| Itration            | 603      |
| PAGAR Loss          | 2.64e+05 |
| Real Det Return     | 5.31e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -4.8e+05 |
| Running Env Steps   | 3015000  |
| Running Forward KL  | 6.47     |
| Running Reverse KL  | 3.22     |
| Running Update Time | 603      |
----------------------------------
--2024-08-12 10:23:33.510934 UTC---
| Itration            | 604       |
| PAGAR Loss          | 1.09e+04  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -4.81e+05 |
| Running Env Steps   | 3020000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 604       |
-----------------------------------
--2024-08-12 10:25:57.863593 UTC---
| Itration            | 605       |
| PAGAR Loss          | 1.07e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -2.66e+05 |
| Running Env Steps   | 3025000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 2.43      |
| Running Update Time | 605       |
-----------------------------------
--2024-08-12 10:28:23.922644 UTC---
| Itration            | 606       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -3.46e+05 |
| Running Env Steps   | 3030000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 2.28      |
| Running Update Time | 606       |
-----------------------------------
--2024-08-12 10:30:52.106897 UTC---
| Itration            | 607       |
| PAGAR Loss          | -1.2e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.94e+05 |
| Running Env Steps   | 3035000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 1.99      |
| Running Update Time | 607       |
-----------------------------------
--2024-08-12 10:33:20.527638 UTC---
| Itration            | 608       |
| PAGAR Loss          | 2.94e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -8.54e+05 |
| Running Env Steps   | 3040000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 39.6      |
| Running Update Time | 608       |
-----------------------------------
--2024-08-12 10:35:46.235293 UTC---
| Itration            | 609       |
| PAGAR Loss          | 3.63e+05  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -2.22e+05 |
| Running Env Steps   | 3045000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 2.33      |
| Running Update Time | 609       |
-----------------------------------
--2024-08-12 10:38:14.894593 UTC---
| Itration            | 610       |
| PAGAR Loss          | -7.02e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -3.34e+05 |
| Running Env Steps   | 3050000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 610       |
-----------------------------------
--2024-08-12 10:40:42.908774 UTC---
| Itration            | 611       |
| PAGAR Loss          | 4.36e+03  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -2.46e+05 |
| Running Env Steps   | 3055000   |
| Running Forward KL  | 5.99      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 611       |
-----------------------------------
--2024-08-12 10:43:11.164001 UTC---
| Itration            | 612       |
| PAGAR Loss          | -7.45e+04 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -5.13e+05 |
| Running Env Steps   | 3060000   |
| Running Forward KL  | 6.87      |
| Running Reverse KL  | 2.94      |
| Running Update Time | 612       |
-----------------------------------
--2024-08-12 10:45:40.341187 UTC---
| Itration            | 613       |
| PAGAR Loss          | 4.19e+04  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -7.36e+05 |
| Running Env Steps   | 3065000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 613       |
-----------------------------------
--2024-08-12 10:48:06.315079 UTC---
| Itration            | 614       |
| PAGAR Loss          | -2.33e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.52e+03  |
| Reward Loss         | -6.51e+05 |
| Running Env Steps   | 3070000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 2.04      |
| Running Update Time | 614       |
-----------------------------------
--2024-08-12 10:50:36.660641 UTC---
| Itration            | 615       |
| PAGAR Loss          | -1.66e+04 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -2.06e+05 |
| Running Env Steps   | 3075000   |
| Running Forward KL  | 6.71      |
| Running Reverse KL  | 3.78      |
| Running Update Time | 615       |
-----------------------------------
--2024-08-12 10:52:58.110214 UTC---
| Itration            | 616       |
| PAGAR Loss          | -6.75e+04 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -9.24e+05 |
| Running Env Steps   | 3080000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 78.7      |
| Running Update Time | 616       |
-----------------------------------
--2024-08-12 10:55:25.182902 UTC---
| Itration            | 617       |
| PAGAR Loss          | -8.24e+04 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -4.65e+05 |
| Running Env Steps   | 3085000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 617       |
-----------------------------------
--2024-08-12 10:57:50.706154 UTC---
| Itration            | 618       |
| PAGAR Loss          | -1.12e+04 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -6.17e+05 |
| Running Env Steps   | 3090000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 9.01      |
| Running Update Time | 618       |
-----------------------------------
--2024-08-12 11:00:18.121093 UTC---
| Itration            | 619       |
| PAGAR Loss          | -1.06e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.68e+05 |
| Running Env Steps   | 3095000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 2.51      |
| Running Update Time | 619       |
-----------------------------------
--2024-08-12 11:02:45.337141 UTC---
| Itration            | 620       |
| PAGAR Loss          | -5.96e+05 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -4.67e+05 |
| Running Env Steps   | 3100000   |
| Running Forward KL  | 6.21      |
| Running Reverse KL  | 15        |
| Running Update Time | 620       |
-----------------------------------
--2024-08-12 11:05:07.007788 UTC---
| Itration            | 621       |
| PAGAR Loss          | 1.47e+07  |
| Real Det Return     | 4.44e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 3105000   |
| Running Forward KL  | 6.46      |
| Running Reverse KL  | 39.4      |
| Running Update Time | 621       |
-----------------------------------
--2024-08-12 11:07:36.715447 UTC---
| Itration            | 622       |
| PAGAR Loss          | 1.84e+05  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -4.38e+05 |
| Running Env Steps   | 3110000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 5.95      |
| Running Update Time | 622       |
-----------------------------------
--2024-08-12 11:10:08.073903 UTC---
| Itration            | 623       |
| PAGAR Loss          | -1.52e+05 |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.3e+06  |
| Running Env Steps   | 3115000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 3.27      |
| Running Update Time | 623       |
-----------------------------------
--2024-08-12 11:12:36.273193 UTC---
| Itration            | 624       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -5.08e+05 |
| Running Env Steps   | 3120000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 31.5      |
| Running Update Time | 624       |
-----------------------------------
--2024-08-12 11:15:05.624795 UTC---
| Itration            | 625       |
| PAGAR Loss          | -5.65e+04 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -7.44e+05 |
| Running Env Steps   | 3125000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 2.47      |
| Running Update Time | 625       |
-----------------------------------
--2024-08-12 11:17:32.979566 UTC---
| Itration            | 626       |
| PAGAR Loss          | -7.39e+04 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -4.7e+05  |
| Running Env Steps   | 3130000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 2.29      |
| Running Update Time | 626       |
-----------------------------------
--2024-08-12 11:20:01.033878 UTC---
| Itration            | 627       |
| PAGAR Loss          | 1.33e+04  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -4.47e+05 |
| Running Env Steps   | 3135000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.13      |
| Running Update Time | 627       |
-----------------------------------
--2024-08-12 11:22:28.856140 UTC---
| Itration            | 628       |
| PAGAR Loss          | -7.11e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -3.57e+05 |
| Running Env Steps   | 3140000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 2.13      |
| Running Update Time | 628       |
-----------------------------------
--2024-08-12 11:24:56.586979 UTC---
| Itration            | 629       |
| PAGAR Loss          | 3.01e+05  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.82e+05 |
| Running Env Steps   | 3145000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 629       |
-----------------------------------
--2024-08-12 11:27:26.317630 UTC---
| Itration            | 630       |
| PAGAR Loss          | -3.69e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -3.31e+05 |
| Running Env Steps   | 3150000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 2.18      |
| Running Update Time | 630       |
-----------------------------------
--2024-08-12 11:29:53.545547 UTC---
| Itration            | 631       |
| PAGAR Loss          | -2.95e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.52e+05 |
| Running Env Steps   | 3155000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 2.32      |
| Running Update Time | 631       |
-----------------------------------
--2024-08-12 11:32:21.999332 UTC---
| Itration            | 632       |
| PAGAR Loss          | 3.2e+08   |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -3.23e+05 |
| Running Env Steps   | 3160000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 1.85      |
| Running Update Time | 632       |
-----------------------------------
--2024-08-12 11:34:48.431410 UTC---
| Itration            | 633       |
| PAGAR Loss          | -2.19e+05 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.1e+06  |
| Running Env Steps   | 3165000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 34        |
| Running Update Time | 633       |
-----------------------------------
--2024-08-12 11:37:14.398804 UTC---
| Itration            | 634       |
| PAGAR Loss          | -2.08e+06 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.88e+06 |
| Running Env Steps   | 3170000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 83.6      |
| Running Update Time | 634       |
-----------------------------------
--2024-08-12 11:39:40.124027 UTC---
| Itration            | 635       |
| PAGAR Loss          | 3.26e+04  |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 3175000   |
| Running Forward KL  | 6.73      |
| Running Reverse KL  | 36.1      |
| Running Update Time | 635       |
-----------------------------------
--2024-08-12 11:42:06.354241 UTC---
| Itration            | 636       |
| PAGAR Loss          | 1.42e+04  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.45e+05 |
| Running Env Steps   | 3180000   |
| Running Forward KL  | 7.53      |
| Running Reverse KL  | 3.57      |
| Running Update Time | 636       |
-----------------------------------
--2024-08-12 11:44:36.506650 UTC---
| Itration            | 637       |
| PAGAR Loss          | -1.23e+04 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -6.92e+05 |
| Running Env Steps   | 3185000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 2.29      |
| Running Update Time | 637       |
-----------------------------------
--2024-08-12 11:47:04.205846 UTC--
| Itration            | 638      |
| PAGAR Loss          | 1.6e+04  |
| Real Det Return     | 5.14e+03 |
| Real Sto Return     | 5.15e+03 |
| Reward Loss         | -2.8e+06 |
| Running Env Steps   | 3190000  |
| Running Forward KL  | 6.67     |
| Running Reverse KL  | 66.8     |
| Running Update Time | 638      |
----------------------------------
--2024-08-12 11:49:32.628979 UTC---
| Itration            | 639       |
| PAGAR Loss          | -1.1e+05  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 3195000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 4.67      |
| Running Update Time | 639       |
-----------------------------------
--2024-08-12 11:51:57.856090 UTC---
| Itration            | 640       |
| PAGAR Loss          | -4.28e+04 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 3200000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 36.6      |
| Running Update Time | 640       |
-----------------------------------
--2024-08-12 11:54:25.268429 UTC---
| Itration            | 641       |
| PAGAR Loss          | -9.7e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -4.69e+05 |
| Running Env Steps   | 3205000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 17.9      |
| Running Update Time | 641       |
-----------------------------------
--2024-08-12 11:56:54.250983 UTC---
| Itration            | 642       |
| PAGAR Loss          | -2.3e+05  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -7.74e+05 |
| Running Env Steps   | 3210000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 32.2      |
| Running Update Time | 642       |
-----------------------------------
--2024-08-12 11:59:19.414976 UTC---
| Itration            | 643       |
| PAGAR Loss          | 2.43e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -2.87e+05 |
| Running Env Steps   | 3215000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 3.27      |
| Running Update Time | 643       |
-----------------------------------
--2024-08-12 12:01:48.871331 UTC---
| Itration            | 644       |
| PAGAR Loss          | -3.98e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -7.38e+05 |
| Running Env Steps   | 3220000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 2.85      |
| Running Update Time | 644       |
-----------------------------------
--2024-08-12 12:04:12.861319 UTC---
| Itration            | 645       |
| PAGAR Loss          | 1.75e+05  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -2.75e+05 |
| Running Env Steps   | 3225000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 2.8       |
| Running Update Time | 645       |
-----------------------------------
--2024-08-12 12:06:41.910982 UTC---
| Itration            | 646       |
| PAGAR Loss          | -2.41e+05 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.93e+06 |
| Running Env Steps   | 3230000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 36.5      |
| Running Update Time | 646       |
-----------------------------------
--2024-08-12 12:09:04.644330 UTC---
| Itration            | 647       |
| PAGAR Loss          | 9.64e+04  |
| Real Det Return     | 4.05e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 3235000   |
| Running Forward KL  | 7.4       |
| Running Reverse KL  | 98.1      |
| Running Update Time | 647       |
-----------------------------------
--2024-08-12 12:11:34.852056 UTC---
| Itration            | 648       |
| PAGAR Loss          | 3.33e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -3.47e+05 |
| Running Env Steps   | 3240000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 2.72      |
| Running Update Time | 648       |
-----------------------------------
--2024-08-12 12:14:06.134030 UTC---
| Itration            | 649       |
| PAGAR Loss          | -1.02e+05 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -3.27e+05 |
| Running Env Steps   | 3245000   |
| Running Forward KL  | 6.31      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 649       |
-----------------------------------
--2024-08-12 12:16:29.190847 UTC---
| Itration            | 650       |
| PAGAR Loss          | -1.41e+05 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.69e+05 |
| Running Env Steps   | 3250000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 2.63      |
| Running Update Time | 650       |
-----------------------------------
--2024-08-12 12:19:01.042049 UTC---
| Itration            | 651       |
| PAGAR Loss          | 8.29e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -4.31e+05 |
| Running Env Steps   | 3255000   |
| Running Forward KL  | 6.92      |
| Running Reverse KL  | 3.29      |
| Running Update Time | 651       |
-----------------------------------
--2024-08-12 12:21:28.126641 UTC---
| Itration            | 652       |
| PAGAR Loss          | 5.28e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -7.95e+05 |
| Running Env Steps   | 3260000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 2.77      |
| Running Update Time | 652       |
-----------------------------------
--2024-08-12 12:23:57.210237 UTC---
| Itration            | 653       |
| PAGAR Loss          | -3.33e+04 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -2.51e+05 |
| Running Env Steps   | 3265000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 2.71      |
| Running Update Time | 653       |
-----------------------------------
--2024-08-12 12:26:24.952715 UTC---
| Itration            | 654       |
| PAGAR Loss          | 2.19e+04  |
| Real Det Return     | 5.12e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -3.17e+05 |
| Running Env Steps   | 3270000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 654       |
-----------------------------------
--2024-08-12 12:28:50.923517 UTC---
| Itration            | 655       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 3275000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 77.5      |
| Running Update Time | 655       |
-----------------------------------
--2024-08-12 12:31:17.794887 UTC---
| Itration            | 656       |
| PAGAR Loss          | 5.92e+06  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -5.42e+05 |
| Running Env Steps   | 3280000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 2.75      |
| Running Update Time | 656       |
-----------------------------------
--2024-08-12 12:33:45.819391 UTC---
| Itration            | 657       |
| PAGAR Loss          | 1.33e+06  |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 3285000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 3.13      |
| Running Update Time | 657       |
-----------------------------------
--2024-08-12 12:36:13.015210 UTC---
| Itration            | 658       |
| PAGAR Loss          | 2.26e+06  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -4.46e+05 |
| Running Env Steps   | 3290000   |
| Running Forward KL  | 6.86      |
| Running Reverse KL  | 3.58      |
| Running Update Time | 658       |
-----------------------------------
--2024-08-12 12:38:37.005539 UTC---
| Itration            | 659       |
| PAGAR Loss          | -1.28e+04 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -4.47e+05 |
| Running Env Steps   | 3295000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 2.58      |
| Running Update Time | 659       |
-----------------------------------
--2024-08-12 12:41:05.602238 UTC---
| Itration            | 660       |
| PAGAR Loss          | -2.6e+05  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -3.17e+05 |
| Running Env Steps   | 3300000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 2.69      |
| Running Update Time | 660       |
-----------------------------------
--2024-08-12 12:43:31.307231 UTC--
| Itration            | 661      |
| PAGAR Loss          | 2.02e+05 |
| Real Det Return     | 5.29e+03 |
| Real Sto Return     | 5.19e+03 |
| Reward Loss         | -8.3e+05 |
| Running Env Steps   | 3305000  |
| Running Forward KL  | 6.19     |
| Running Reverse KL  | 2.34     |
| Running Update Time | 661      |
----------------------------------
--2024-08-12 12:46:00.783240 UTC---
| Itration            | 662       |
| PAGAR Loss          | -2.15e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -3.99e+05 |
| Running Env Steps   | 3310000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 2.3       |
| Running Update Time | 662       |
-----------------------------------
--2024-08-12 12:48:26.130634 UTC---
| Itration            | 663       |
| PAGAR Loss          | 7.03e+03  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 3315000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 77.2      |
| Running Update Time | 663       |
-----------------------------------
--2024-08-12 12:50:53.770163 UTC---
| Itration            | 664       |
| PAGAR Loss          | 3.21e+06  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -7.52e+05 |
| Running Env Steps   | 3320000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 2.73      |
| Running Update Time | 664       |
-----------------------------------
--2024-08-12 12:53:25.504771 UTC---
| Itration            | 665       |
| PAGAR Loss          | 6.6e+04   |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -6.65e+05 |
| Running Env Steps   | 3325000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 3.18      |
| Running Update Time | 665       |
-----------------------------------
--2024-08-12 12:55:54.250326 UTC---
| Itration            | 666       |
| PAGAR Loss          | -3.79e+05 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -9.39e+05 |
| Running Env Steps   | 3330000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 8.52      |
| Running Update Time | 666       |
-----------------------------------
--2024-08-12 12:58:24.896792 UTC---
| Itration            | 667       |
| PAGAR Loss          | 4.88e+06  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -6.82e+05 |
| Running Env Steps   | 3335000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 667       |
-----------------------------------
--2024-08-12 13:00:57.373915 UTC---
| Itration            | 668       |
| PAGAR Loss          | 1.32e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -8.57e+05 |
| Running Env Steps   | 3340000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 2.96      |
| Running Update Time | 668       |
-----------------------------------
--2024-08-12 13:03:24.971019 UTC---
| Itration            | 669       |
| PAGAR Loss          | -6.6e+04  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -9.85e+05 |
| Running Env Steps   | 3345000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 2.42      |
| Running Update Time | 669       |
-----------------------------------
--2024-08-12 13:05:56.043618 UTC---
| Itration            | 670       |
| PAGAR Loss          | -4.08e+05 |
| Real Det Return     | 4.97e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 3350000   |
| Running Forward KL  | 6.94      |
| Running Reverse KL  | 24.9      |
| Running Update Time | 670       |
-----------------------------------
--2024-08-12 13:08:23.980941 UTC---
| Itration            | 671       |
| PAGAR Loss          | -8.32e+05 |
| Real Det Return     | 5.04e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -3.37e+06 |
| Running Env Steps   | 3355000   |
| Running Forward KL  | 6.67      |
| Running Reverse KL  | 70.6      |
| Running Update Time | 671       |
-----------------------------------
--2024-08-12 13:10:52.953871 UTC---
| Itration            | 672       |
| PAGAR Loss          | -1.73e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -3.67e+05 |
| Running Env Steps   | 3360000   |
| Running Forward KL  | 6.73      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 672       |
-----------------------------------
--2024-08-12 13:13:21.699540 UTC---
| Itration            | 673       |
| PAGAR Loss          | -1.66e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -7.31e+05 |
| Running Env Steps   | 3365000   |
| Running Forward KL  | 6.98      |
| Running Reverse KL  | 17.4      |
| Running Update Time | 673       |
-----------------------------------
--2024-08-12 13:15:49.686251 UTC---
| Itration            | 674       |
| PAGAR Loss          | -3.28e+04 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 3370000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 36.2      |
| Running Update Time | 674       |
-----------------------------------
--2024-08-12 13:18:19.507406 UTC---
| Itration            | 675       |
| PAGAR Loss          | -4.35e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.36e+05 |
| Running Env Steps   | 3375000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 26.5      |
| Running Update Time | 675       |
-----------------------------------
--2024-08-12 13:20:47.905370 UTC---
| Itration            | 676       |
| PAGAR Loss          | -9.54e+04 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -5.67e+05 |
| Running Env Steps   | 3380000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 676       |
-----------------------------------
--2024-08-12 13:23:17.788307 UTC---
| Itration            | 677       |
| PAGAR Loss          | -4.2e+03  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -5.76e+05 |
| Running Env Steps   | 3385000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 2.7       |
| Running Update Time | 677       |
-----------------------------------
--2024-08-12 13:25:47.064000 UTC---
| Itration            | 678       |
| PAGAR Loss          | 8.53e+03  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.95e+05 |
| Running Env Steps   | 3390000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 678       |
-----------------------------------
--2024-08-12 13:28:18.383409 UTC---
| Itration            | 679       |
| PAGAR Loss          | 2.84e+04  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 3395000   |
| Running Forward KL  | 7.03      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 679       |
-----------------------------------
--2024-08-12 13:30:47.005791 UTC--
| Itration            | 680      |
| PAGAR Loss          | 9.26e+03 |
| Real Det Return     | 5.3e+03  |
| Real Sto Return     | 5.18e+03 |
| Reward Loss         | -1e+06   |
| Running Env Steps   | 3400000  |
| Running Forward KL  | 6.99     |
| Running Reverse KL  | 3.8      |
| Running Update Time | 680      |
----------------------------------
--2024-08-12 13:33:16.010905 UTC---
| Itration            | 681       |
| PAGAR Loss          | -6.01e+04 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -5.21e+05 |
| Running Env Steps   | 3405000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 3.15      |
| Running Update Time | 681       |
-----------------------------------
--2024-08-12 13:35:44.172752 UTC---
| Itration            | 682       |
| PAGAR Loss          | 1.97e+07  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -7.09e+05 |
| Running Env Steps   | 3410000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 14.5      |
| Running Update Time | 682       |
-----------------------------------
--2024-08-12 13:38:13.457697 UTC---
| Itration            | 683       |
| PAGAR Loss          | 2.31e+04  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.11e+05 |
| Running Env Steps   | 3415000   |
| Running Forward KL  | 7.03      |
| Running Reverse KL  | 3.19      |
| Running Update Time | 683       |
-----------------------------------
--2024-08-12 13:40:42.701378 UTC---
| Itration            | 684       |
| PAGAR Loss          | 5.95e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -4.16e+05 |
| Running Env Steps   | 3420000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 3.07      |
| Running Update Time | 684       |
-----------------------------------
--2024-08-12 13:43:05.571192 UTC---
| Itration            | 685       |
| PAGAR Loss          | -1.04e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -2.12e+06 |
| Running Env Steps   | 3425000   |
| Running Forward KL  | 7.42      |
| Running Reverse KL  | 78.7      |
| Running Update Time | 685       |
-----------------------------------
--2024-08-12 13:45:35.086739 UTC---
| Itration            | 686       |
| PAGAR Loss          | -1.37e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -5.71e+05 |
| Running Env Steps   | 3430000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 17        |
| Running Update Time | 686       |
-----------------------------------
--2024-08-12 13:48:02.336507 UTC---
| Itration            | 687       |
| PAGAR Loss          | -3.64e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 3435000   |
| Running Forward KL  | 6.93      |
| Running Reverse KL  | 40.6      |
| Running Update Time | 687       |
-----------------------------------
--2024-08-12 13:50:29.491316 UTC---
| Itration            | 688       |
| PAGAR Loss          | -8.69e+03 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -3.79e+05 |
| Running Env Steps   | 3440000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 3.53      |
| Running Update Time | 688       |
-----------------------------------
--2024-08-12 13:52:58.990906 UTC---
| Itration            | 689       |
| PAGAR Loss          | -2.83e+05 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -3.21e+05 |
| Running Env Steps   | 3445000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 2.56      |
| Running Update Time | 689       |
-----------------------------------
--2024-08-12 13:55:25.820046 UTC---
| Itration            | 690       |
| PAGAR Loss          | -7.26e+03 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -6.35e+05 |
| Running Env Steps   | 3450000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 6.86      |
| Running Update Time | 690       |
-----------------------------------
--2024-08-12 13:57:52.546629 UTC---
| Itration            | 691       |
| PAGAR Loss          | 9.77e+06  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -3.42e+05 |
| Running Env Steps   | 3455000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 2.55      |
| Running Update Time | 691       |
-----------------------------------
--2024-08-12 14:00:20.732013 UTC---
| Itration            | 692       |
| PAGAR Loss          | -8.24e+04 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -7.42e+05 |
| Running Env Steps   | 3460000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 3.42      |
| Running Update Time | 692       |
-----------------------------------
--2024-08-12 14:02:45.454980 UTC---
| Itration            | 693       |
| PAGAR Loss          | -5.76e+06 |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -2.08e+06 |
| Running Env Steps   | 3465000   |
| Running Forward KL  | 7.95      |
| Running Reverse KL  | 119       |
| Running Update Time | 693       |
-----------------------------------
--2024-08-12 14:05:14.899167 UTC---
| Itration            | 694       |
| PAGAR Loss          | -2.18e+04 |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -1.25e+05 |
| Running Env Steps   | 3470000   |
| Running Forward KL  | 6.86      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 694       |
-----------------------------------
--2024-08-12 14:07:42.509868 UTC---
| Itration            | 695       |
| PAGAR Loss          | 2.51e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -3.52e+05 |
| Running Env Steps   | 3475000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 3.46      |
| Running Update Time | 695       |
-----------------------------------
--2024-08-12 14:10:10.868376 UTC---
| Itration            | 696       |
| PAGAR Loss          | 3.57e+05  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -4.69e+05 |
| Running Env Steps   | 3480000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 3.47      |
| Running Update Time | 696       |
-----------------------------------
--2024-08-12 14:12:38.856114 UTC---
| Itration            | 697       |
| PAGAR Loss          | -3.8e+04  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -4.99e+05 |
| Running Env Steps   | 3485000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 2.91      |
| Running Update Time | 697       |
-----------------------------------
--2024-08-12 14:15:07.344885 UTC---
| Itration            | 698       |
| PAGAR Loss          | -4.71e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 3490000   |
| Running Forward KL  | 6.91      |
| Running Reverse KL  | 41.5      |
| Running Update Time | 698       |
-----------------------------------
--2024-08-12 14:17:35.906917 UTC---
| Itration            | 699       |
| PAGAR Loss          | -2.15e+05 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 3495000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 3.84      |
| Running Update Time | 699       |
-----------------------------------
--2024-08-12 14:20:02.733004 UTC---
| Itration            | 700       |
| PAGAR Loss          | -1.44e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -1.53e+06 |
| Running Env Steps   | 3500000   |
| Running Forward KL  | 7.04      |
| Running Reverse KL  | 79.8      |
| Running Update Time | 700       |
-----------------------------------
--2024-08-12 14:22:30.399482 UTC---
| Itration            | 701       |
| PAGAR Loss          | 1.59e+06  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -5.77e+05 |
| Running Env Steps   | 3505000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 3.12      |
| Running Update Time | 701       |
-----------------------------------
--2024-08-12 14:25:00.938537 UTC---
| Itration            | 702       |
| PAGAR Loss          | 9.91e+04  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -8.12e+05 |
| Running Env Steps   | 3510000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 3.48      |
| Running Update Time | 702       |
-----------------------------------
--2024-08-12 14:27:30.107951 UTC---
| Itration            | 703       |
| PAGAR Loss          | 5.43e+06  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -6.35e+05 |
| Running Env Steps   | 3515000   |
| Running Forward KL  | 7.14      |
| Running Reverse KL  | 6.64      |
| Running Update Time | 703       |
-----------------------------------
--2024-08-12 14:29:59.138886 UTC---
| Itration            | 704       |
| PAGAR Loss          | 1.82e+05  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -7.95e+05 |
| Running Env Steps   | 3520000   |
| Running Forward KL  | 6.89      |
| Running Reverse KL  | 3.18      |
| Running Update Time | 704       |
-----------------------------------
--2024-08-12 14:32:26.533013 UTC---
| Itration            | 705       |
| PAGAR Loss          | 4.84e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 3525000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 41.7      |
| Running Update Time | 705       |
-----------------------------------
--2024-08-12 14:34:52.871543 UTC---
| Itration            | 706       |
| PAGAR Loss          | -1.82e+06 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 3530000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 2.84      |
| Running Update Time | 706       |
-----------------------------------
--2024-08-12 14:37:21.815181 UTC---
| Itration            | 707       |
| PAGAR Loss          | 6.75e+04  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 3535000   |
| Running Forward KL  | 6.62      |
| Running Reverse KL  | 3.28      |
| Running Update Time | 707       |
-----------------------------------
--2024-08-12 14:39:48.868955 UTC---
| Itration            | 708       |
| PAGAR Loss          | 1.99e+05  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -4.22e+05 |
| Running Env Steps   | 3540000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 2.79      |
| Running Update Time | 708       |
-----------------------------------
--2024-08-12 14:42:15.854336 UTC---
| Itration            | 709       |
| PAGAR Loss          | -2.26e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -1.55e+06 |
| Running Env Steps   | 3545000   |
| Running Forward KL  | 7.22      |
| Running Reverse KL  | 115       |
| Running Update Time | 709       |
-----------------------------------
--2024-08-12 14:44:40.961737 UTC--
| Itration            | 710      |
| PAGAR Loss          | 4.68e+04 |
| Real Det Return     | 5.4e+03  |
| Real Sto Return     | 5.14e+03 |
| Reward Loss         | -6.8e+05 |
| Running Env Steps   | 3550000  |
| Running Forward KL  | 7.35     |
| Running Reverse KL  | 3.62     |
| Running Update Time | 710      |
----------------------------------
--2024-08-12 14:47:06.854507 UTC---
| Itration            | 711       |
| PAGAR Loss          | 1.35e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -8.33e+05 |
| Running Env Steps   | 3555000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 2.88      |
| Running Update Time | 711       |
-----------------------------------
--2024-08-12 14:49:36.418836 UTC---
| Itration            | 712       |
| PAGAR Loss          | 7.76e+04  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -8.36e+05 |
| Running Env Steps   | 3560000   |
| Running Forward KL  | 7.51      |
| Running Reverse KL  | 4.45      |
| Running Update Time | 712       |
-----------------------------------
--2024-08-12 14:51:59.934051 UTC---
| Itration            | 713       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 3.8e+03   |
| Reward Loss         | -3.95e+06 |
| Running Env Steps   | 3565000   |
| Running Forward KL  | 9.57      |
| Running Reverse KL  | 183       |
| Running Update Time | 713       |
-----------------------------------
--2024-08-12 14:54:26.935303 UTC---
| Itration            | 714       |
| PAGAR Loss          | 2.9e+06   |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.67e+03  |
| Reward Loss         | -4.45e+05 |
| Running Env Steps   | 3570000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 13.9      |
| Running Update Time | 714       |
-----------------------------------
--2024-08-12 14:56:53.901417 UTC---
| Itration            | 715       |
| PAGAR Loss          | -3.32e+03 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 3575000   |
| Running Forward KL  | 7.33      |
| Running Reverse KL  | 4         |
| Running Update Time | 715       |
-----------------------------------
--2024-08-12 14:59:19.505074 UTC---
| Itration            | 716       |
| PAGAR Loss          | 2.13e+05  |
| Real Det Return     | 3.46e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -2.55e+06 |
| Running Env Steps   | 3580000   |
| Running Forward KL  | 7.94      |
| Running Reverse KL  | 74.8      |
| Running Update Time | 716       |
-----------------------------------
--2024-08-12 15:01:41.262706 UTC---
| Itration            | 717       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 3.35e+03  |
| Reward Loss         | -1.54e+06 |
| Running Env Steps   | 3585000   |
| Running Forward KL  | 8.17      |
| Running Reverse KL  | 142       |
| Running Update Time | 717       |
-----------------------------------
--2024-08-12 15:04:09.609526 UTC---
| Itration            | 718       |
| PAGAR Loss          | -1.22e+05 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -9.87e+05 |
| Running Env Steps   | 3590000   |
| Running Forward KL  | 7.03      |
| Running Reverse KL  | 29.9      |
| Running Update Time | 718       |
-----------------------------------
--2024-08-12 15:06:37.174014 UTC---
| Itration            | 719       |
| PAGAR Loss          | 4.69e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 3595000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 37.9      |
| Running Update Time | 719       |
-----------------------------------
--2024-08-12 15:09:05.544102 UTC--
| Itration            | 720      |
| PAGAR Loss          | 3.16e+05 |
| Real Det Return     | 5.19e+03 |
| Real Sto Return     | 4.87e+03 |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 3600000  |
| Running Forward KL  | 5.96     |
| Running Reverse KL  | 2.6      |
| Running Update Time | 720      |
----------------------------------
--2024-08-12 15:11:34.698350 UTC---
| Itration            | 721       |
| PAGAR Loss          | 4.41e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.73e+05 |
| Running Env Steps   | 3605000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 721       |
-----------------------------------
--2024-08-12 15:14:04.075892 UTC---
| Itration            | 722       |
| PAGAR Loss          | 2.03e+05  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 3610000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 722       |
-----------------------------------
--2024-08-12 15:16:33.055359 UTC---
| Itration            | 723       |
| PAGAR Loss          | 6.38e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -6.41e+05 |
| Running Env Steps   | 3615000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 3.41      |
| Running Update Time | 723       |
-----------------------------------
--2024-08-12 15:19:02.230509 UTC---
| Itration            | 724       |
| PAGAR Loss          | -2.17e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -5.17e+05 |
| Running Env Steps   | 3620000   |
| Running Forward KL  | 6.45      |
| Running Reverse KL  | 28.4      |
| Running Update Time | 724       |
-----------------------------------
--2024-08-12 15:21:32.382608 UTC---
| Itration            | 725       |
| PAGAR Loss          | -1.26e+04 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.25e+05 |
| Running Env Steps   | 3625000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 23.8      |
| Running Update Time | 725       |
-----------------------------------
--2024-08-12 15:23:58.735771 UTC---
| Itration            | 726       |
| PAGAR Loss          | 6.92e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 3630000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 2.62      |
| Running Update Time | 726       |
-----------------------------------
--2024-08-12 15:26:26.829590 UTC--
| Itration            | 727      |
| PAGAR Loss          | 3.55e+06 |
| Real Det Return     | 5.45e+03 |
| Real Sto Return     | 5.16e+03 |
| Reward Loss         | -4.2e+05 |
| Running Env Steps   | 3635000  |
| Running Forward KL  | 6.16     |
| Running Reverse KL  | 2.54     |
| Running Update Time | 727      |
----------------------------------
--2024-08-12 15:28:53.309819 UTC---
| Itration            | 728       |
| PAGAR Loss          | -5.54e+04 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -8.08e+05 |
| Running Env Steps   | 3640000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 2.18      |
| Running Update Time | 728       |
-----------------------------------
--2024-08-12 15:31:22.115342 UTC---
| Itration            | 729       |
| PAGAR Loss          | 7.52e+04  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -5.01e+05 |
| Running Env Steps   | 3645000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 729       |
-----------------------------------
--2024-08-12 15:33:47.645677 UTC--
| Itration            | 730      |
| PAGAR Loss          | 7.64e+04 |
| Real Det Return     | 5.26e+03 |
| Real Sto Return     | 4.54e+03 |
| Reward Loss         | -8.9e+05 |
| Running Env Steps   | 3650000  |
| Running Forward KL  | 6.61     |
| Running Reverse KL  | 2.83     |
| Running Update Time | 730      |
----------------------------------
--2024-08-12 15:36:15.076066 UTC---
| Itration            | 731       |
| PAGAR Loss          | 5.74e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -4.32e+05 |
| Running Env Steps   | 3655000   |
| Running Forward KL  | 5.88      |
| Running Reverse KL  | 2.37      |
| Running Update Time | 731       |
-----------------------------------
--2024-08-12 15:38:43.278025 UTC---
| Itration            | 732       |
| PAGAR Loss          | 3.53e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -8.68e+05 |
| Running Env Steps   | 3660000   |
| Running Forward KL  | 6.71      |
| Running Reverse KL  | 3.78      |
| Running Update Time | 732       |
-----------------------------------
--2024-08-12 15:41:10.188331 UTC---
| Itration            | 733       |
| PAGAR Loss          | 1.63e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 3665000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 39.6      |
| Running Update Time | 733       |
-----------------------------------
--2024-08-12 15:43:40.805381 UTC---
| Itration            | 734       |
| PAGAR Loss          | 1.59e+05  |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.27e+06 |
| Running Env Steps   | 3670000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 734       |
-----------------------------------
--2024-08-12 15:46:06.515185 UTC---
| Itration            | 735       |
| PAGAR Loss          | -7.44e+03 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -8.93e+05 |
| Running Env Steps   | 3675000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 735       |
-----------------------------------
--2024-08-12 15:48:37.560834 UTC---
| Itration            | 736       |
| PAGAR Loss          | -3.3e+04  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 3680000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 2.29      |
| Running Update Time | 736       |
-----------------------------------
--2024-08-12 15:51:05.045500 UTC---
| Itration            | 737       |
| PAGAR Loss          | -5.47e+03 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 3685000   |
| Running Forward KL  | 6.73      |
| Running Reverse KL  | 41.3      |
| Running Update Time | 737       |
-----------------------------------
--2024-08-12 15:53:40.589761 UTC---
| Itration            | 738       |
| PAGAR Loss          | 9.3e+03   |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 3690000   |
| Running Forward KL  | 7.4       |
| Running Reverse KL  | 3.63      |
| Running Update Time | 738       |
-----------------------------------
--2024-08-12 15:56:09.852940 UTC---
| Itration            | 739       |
| PAGAR Loss          | 3.71e+04  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -6.97e+05 |
| Running Env Steps   | 3695000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 739       |
-----------------------------------
--2024-08-12 15:58:38.257933 UTC---
| Itration            | 740       |
| PAGAR Loss          | -1.19e+06 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -4.3e+05  |
| Running Env Steps   | 3700000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 7.22      |
| Running Update Time | 740       |
-----------------------------------
--2024-08-12 16:01:07.328905 UTC---
| Itration            | 741       |
| PAGAR Loss          | 3.95e+03  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 3705000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 39.5      |
| Running Update Time | 741       |
-----------------------------------
--2024-08-12 16:03:34.596299 UTC--
| Itration            | 742      |
| PAGAR Loss          | 1.83e+05 |
| Real Det Return     | 5.27e+03 |
| Real Sto Return     | 4.88e+03 |
| Reward Loss         | -9.1e+05 |
| Running Env Steps   | 3710000  |
| Running Forward KL  | 6.59     |
| Running Reverse KL  | 20.1     |
| Running Update Time | 742      |
----------------------------------
--2024-08-12 16:06:05.449923 UTC---
| Itration            | 743       |
| PAGAR Loss          | -1.83e+06 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -8.02e+05 |
| Running Env Steps   | 3715000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 5.43      |
| Running Update Time | 743       |
-----------------------------------
--2024-08-12 16:08:32.346199 UTC---
| Itration            | 744       |
| PAGAR Loss          | -7.72e+04 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -5.76e+05 |
| Running Env Steps   | 3720000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 2.28      |
| Running Update Time | 744       |
-----------------------------------
--2024-08-12 16:11:02.559438 UTC---
| Itration            | 745       |
| PAGAR Loss          | -2.79e+05 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -7.73e+05 |
| Running Env Steps   | 3725000   |
| Running Forward KL  | 6.16      |
| Running Reverse KL  | 2.84      |
| Running Update Time | 745       |
-----------------------------------
--2024-08-12 16:13:31.719990 UTC---
| Itration            | 746       |
| PAGAR Loss          | -2.15e+04 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -7.67e+05 |
| Running Env Steps   | 3730000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 746       |
-----------------------------------
--2024-08-12 16:16:00.394635 UTC---
| Itration            | 747       |
| PAGAR Loss          | 1.03e+06  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.84e+05 |
| Running Env Steps   | 3735000   |
| Running Forward KL  | 6.22      |
| Running Reverse KL  | 19.3      |
| Running Update Time | 747       |
-----------------------------------
--2024-08-12 16:18:31.584635 UTC---
| Itration            | 748       |
| PAGAR Loss          | -2.2e+05  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -1.51e+06 |
| Running Env Steps   | 3740000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 30.7      |
| Running Update Time | 748       |
-----------------------------------
--2024-08-12 16:20:54.726810 UTC---
| Itration            | 749       |
| PAGAR Loss          | -2.43e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.9e+05  |
| Running Env Steps   | 3745000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 2.26      |
| Running Update Time | 749       |
-----------------------------------
--2024-08-12 16:23:23.598403 UTC---
| Itration            | 750       |
| PAGAR Loss          | 7.45e+06  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.74e+03  |
| Reward Loss         | -4.73e+05 |
| Running Env Steps   | 3750000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 17.5      |
| Running Update Time | 750       |
-----------------------------------
--2024-08-12 16:25:47.934949 UTC---
| Itration            | 751       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -3.42e+05 |
| Running Env Steps   | 3755000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 46.5      |
| Running Update Time | 751       |
-----------------------------------
--2024-08-12 16:28:16.613625 UTC---
| Itration            | 752       |
| PAGAR Loss          | 2.64e+07  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.53e+06 |
| Running Env Steps   | 3760000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 83        |
| Running Update Time | 752       |
-----------------------------------
--2024-08-12 16:30:45.610262 UTC---
| Itration            | 753       |
| PAGAR Loss          | 6.45e+05  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -1.98e+05 |
| Running Env Steps   | 3765000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 2.71      |
| Running Update Time | 753       |
-----------------------------------
--2024-08-12 16:33:14.856687 UTC--
| Itration            | 754      |
| PAGAR Loss          | 4.56e+05 |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 5.29e+03 |
| Reward Loss         | -4.9e+05 |
| Running Env Steps   | 3770000  |
| Running Forward KL  | 5.83     |
| Running Reverse KL  | 2.56     |
| Running Update Time | 754      |
----------------------------------
--2024-08-12 16:35:45.840184 UTC---
| Itration            | 755       |
| PAGAR Loss          | 4.74e+05  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -6.28e+05 |
| Running Env Steps   | 3775000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 34.1      |
| Running Update Time | 755       |
-----------------------------------
--2024-08-12 16:38:16.729089 UTC---
| Itration            | 756       |
| PAGAR Loss          | 6.41e+09  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -6.82e+05 |
| Running Env Steps   | 3780000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 31.8      |
| Running Update Time | 756       |
-----------------------------------
--2024-08-12 16:40:47.948560 UTC---
| Itration            | 757       |
| PAGAR Loss          | -2.06e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -6.89e+05 |
| Running Env Steps   | 3785000   |
| Running Forward KL  | 6.4       |
| Running Reverse KL  | 9.73      |
| Running Update Time | 757       |
-----------------------------------
--2024-08-12 16:43:16.971983 UTC---
| Itration            | 758       |
| PAGAR Loss          | 4.21e+04  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -7.65e+05 |
| Running Env Steps   | 3790000   |
| Running Forward KL  | 6.93      |
| Running Reverse KL  | 3.14      |
| Running Update Time | 758       |
-----------------------------------
--2024-08-12 16:45:45.720893 UTC---
| Itration            | 759       |
| PAGAR Loss          | -1.23e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.16e+05 |
| Running Env Steps   | 3795000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.78      |
| Running Update Time | 759       |
-----------------------------------
--2024-08-12 16:48:14.535266 UTC---
| Itration            | 760       |
| PAGAR Loss          | -9.17e+05 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -3.3e+06  |
| Running Env Steps   | 3800000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 40.7      |
| Running Update Time | 760       |
-----------------------------------
--2024-08-12 16:51:08.366950 UTC---
| Itration            | 761       |
| PAGAR Loss          | -4.21e+05 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 4.38e+03  |
| Reward Loss         | -3.65e+05 |
| Running Env Steps   | 3805000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 18.5      |
| Running Update Time | 761       |
-----------------------------------
--2024-08-12 16:54:08.425015 UTC---
| Itration            | 762       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.82e+05 |
| Running Env Steps   | 3810000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 2.71      |
| Running Update Time | 762       |
-----------------------------------
--2024-08-12 16:57:01.705679 UTC---
| Itration            | 763       |
| PAGAR Loss          | -2.82e+06 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -2.97e+05 |
| Running Env Steps   | 3815000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 14.7      |
| Running Update Time | 763       |
-----------------------------------
--2024-08-12 16:59:48.293011 UTC---
| Itration            | 764       |
| PAGAR Loss          | -1.22e+08 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 3820000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 12.6      |
| Running Update Time | 764       |
-----------------------------------
--2024-08-12 17:02:28.497173 UTC---
| Itration            | 765       |
| PAGAR Loss          | -3.67e+04 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -5.3e+05  |
| Running Env Steps   | 3825000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 2.91      |
| Running Update Time | 765       |
-----------------------------------
--2024-08-12 17:05:16.917655 UTC---
| Itration            | 766       |
| PAGAR Loss          | 2.11e+05  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -6.01e+05 |
| Running Env Steps   | 3830000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 2.96      |
| Running Update Time | 766       |
-----------------------------------
--2024-08-12 17:07:50.986888 UTC---
| Itration            | 767       |
| PAGAR Loss          | 8.48e+04  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -5.68e+05 |
| Running Env Steps   | 3835000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 2.57      |
| Running Update Time | 767       |
-----------------------------------
--2024-08-12 17:10:25.765183 UTC---
| Itration            | 768       |
| PAGAR Loss          | -2.22e+08 |
| Real Det Return     | 4.64e+03  |
| Real Sto Return     | 4.01e+03  |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 3840000   |
| Running Forward KL  | 7.12      |
| Running Reverse KL  | 73.7      |
| Running Update Time | 768       |
-----------------------------------
--2024-08-12 17:13:19.191293 UTC---
| Itration            | 769       |
| PAGAR Loss          | 9.34e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 3845000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 3.56      |
| Running Update Time | 769       |
-----------------------------------
--2024-08-12 17:16:02.085031 UTC---
| Itration            | 770       |
| PAGAR Loss          | 1.97e+07  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.04e+06 |
| Running Env Steps   | 3850000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 3.36      |
| Running Update Time | 770       |
-----------------------------------
--2024-08-12 17:18:56.458186 UTC--
| Itration            | 771      |
| PAGAR Loss          | 1.95e+05 |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 5.17e+03 |
| Reward Loss         | -6.3e+05 |
| Running Env Steps   | 3855000  |
| Running Forward KL  | 6.71     |
| Running Reverse KL  | 3.53     |
| Running Update Time | 771      |
----------------------------------
--2024-08-12 17:21:44.859084 UTC---
| Itration            | 772       |
| PAGAR Loss          | 7.27e+04  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -7.71e+05 |
| Running Env Steps   | 3860000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 3.65      |
| Running Update Time | 772       |
-----------------------------------
--2024-08-12 17:24:46.262225 UTC---
| Itration            | 773       |
| PAGAR Loss          | -3.35e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.51e+06 |
| Running Env Steps   | 3865000   |
| Running Forward KL  | 7.28      |
| Running Reverse KL  | 58.3      |
| Running Update Time | 773       |
-----------------------------------
--2024-08-12 17:27:38.145538 UTC---
| Itration            | 774       |
| PAGAR Loss          | 1.56e+05  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -6.09e+05 |
| Running Env Steps   | 3870000   |
| Running Forward KL  | 6.85      |
| Running Reverse KL  | 3.13      |
| Running Update Time | 774       |
-----------------------------------
--2024-08-12 17:30:19.423060 UTC---
| Itration            | 775       |
| PAGAR Loss          | -3.49e+06 |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 4.61e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 3875000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 52.6      |
| Running Update Time | 775       |
-----------------------------------
--2024-08-12 17:32:46.875082 UTC---
| Itration            | 776       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -3.63e+05 |
| Running Env Steps   | 3880000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 15.8      |
| Running Update Time | 776       |
-----------------------------------
--2024-08-12 17:35:13.511078 UTC---
| Itration            | 777       |
| PAGAR Loss          | 4.82e+06  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -2.24e+06 |
| Running Env Steps   | 3885000   |
| Running Forward KL  | 7.07      |
| Running Reverse KL  | 80.1      |
| Running Update Time | 777       |
-----------------------------------
--2024-08-12 17:37:44.083544 UTC---
| Itration            | 778       |
| PAGAR Loss          | -3.04e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -7.82e+05 |
| Running Env Steps   | 3890000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 778       |
-----------------------------------
--2024-08-12 17:40:09.235912 UTC---
| Itration            | 779       |
| PAGAR Loss          | 1.03e+05  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -9.47e+05 |
| Running Env Steps   | 3895000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 3.84      |
| Running Update Time | 779       |
-----------------------------------
--2024-08-12 17:42:38.449130 UTC---
| Itration            | 780       |
| PAGAR Loss          | 1.66e+06  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 3900000   |
| Running Forward KL  | 6.83      |
| Running Reverse KL  | 3.6       |
| Running Update Time | 780       |
-----------------------------------
--2024-08-12 17:45:08.327696 UTC---
| Itration            | 781       |
| PAGAR Loss          | 4.65e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -9.49e+05 |
| Running Env Steps   | 3905000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 2.92      |
| Running Update Time | 781       |
-----------------------------------
--2024-08-12 17:47:38.923314 UTC---
| Itration            | 782       |
| PAGAR Loss          | 1.48e+05  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -7.89e+05 |
| Running Env Steps   | 3910000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 2.79      |
| Running Update Time | 782       |
-----------------------------------
--2024-08-12 17:50:10.437756 UTC---
| Itration            | 783       |
| PAGAR Loss          | -7.17e+07 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -2.23e+05 |
| Running Env Steps   | 3915000   |
| Running Forward KL  | 6.72      |
| Running Reverse KL  | 35.5      |
| Running Update Time | 783       |
-----------------------------------
--2024-08-12 17:52:36.914464 UTC---
| Itration            | 784       |
| PAGAR Loss          | -1.48e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -5.06e+05 |
| Running Env Steps   | 3920000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 2.92      |
| Running Update Time | 784       |
-----------------------------------
--2024-08-12 17:55:07.325556 UTC---
| Itration            | 785       |
| PAGAR Loss          | -6.37e+05 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -3.29e+05 |
| Running Env Steps   | 3925000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 16.3      |
| Running Update Time | 785       |
-----------------------------------
--2024-08-12 17:57:36.198172 UTC---
| Itration            | 786       |
| PAGAR Loss          | 1.08e+06  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -6.85e+05 |
| Running Env Steps   | 3930000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 2.82      |
| Running Update Time | 786       |
-----------------------------------
--2024-08-12 18:00:04.563652 UTC---
| Itration            | 787       |
| PAGAR Loss          | 1.71e+05  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -8.21e+05 |
| Running Env Steps   | 3935000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 3.83      |
| Running Update Time | 787       |
-----------------------------------
--2024-08-12 18:02:32.568287 UTC---
| Itration            | 788       |
| PAGAR Loss          | -1.3e+06  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -5.03e+05 |
| Running Env Steps   | 3940000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 6.79      |
| Running Update Time | 788       |
-----------------------------------
--2024-08-12 18:04:57.886830 UTC---
| Itration            | 789       |
| PAGAR Loss          | -6.6e+05  |
| Real Det Return     | 4.5e+03   |
| Real Sto Return     | 4.36e+03  |
| Reward Loss         | -2.26e+06 |
| Running Env Steps   | 3945000   |
| Running Forward KL  | 7.45      |
| Running Reverse KL  | 4.48      |
| Running Update Time | 789       |
-----------------------------------
--2024-08-12 18:07:26.687378 UTC--
| Itration            | 790      |
| PAGAR Loss          | 4.42e+06 |
| Real Det Return     | 5.17e+03 |
| Real Sto Return     | 5.11e+03 |
| Reward Loss         | -1.2e+06 |
| Running Env Steps   | 3950000  |
| Running Forward KL  | 6.58     |
| Running Reverse KL  | 3.89     |
| Running Update Time | 790      |
----------------------------------
--2024-08-12 18:09:55.228704 UTC---
| Itration            | 791       |
| PAGAR Loss          | -1.06e+08 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 3955000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 19.5      |
| Running Update Time | 791       |
-----------------------------------
--2024-08-12 18:12:22.009632 UTC---
| Itration            | 792       |
| PAGAR Loss          | 5.58e+09  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -7.33e+05 |
| Running Env Steps   | 3960000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 6.44      |
| Running Update Time | 792       |
-----------------------------------
--2024-08-12 18:14:46.652566 UTC---
| Itration            | 793       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.07e+03  |
| Reward Loss         | -3.79e+06 |
| Running Env Steps   | 3965000   |
| Running Forward KL  | 7.99      |
| Running Reverse KL  | 124       |
| Running Update Time | 793       |
-----------------------------------
--2024-08-12 18:17:15.311898 UTC---
| Itration            | 794       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -2.49e+06 |
| Running Env Steps   | 3970000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 56.7      |
| Running Update Time | 794       |
-----------------------------------
--2024-08-12 18:19:44.032954 UTC---
| Itration            | 795       |
| PAGAR Loss          | 1.8e+06   |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -9.86e+05 |
| Running Env Steps   | 3975000   |
| Running Forward KL  | 6.33      |
| Running Reverse KL  | 3.07      |
| Running Update Time | 795       |
-----------------------------------
--2024-08-12 18:22:12.258094 UTC---
| Itration            | 796       |
| PAGAR Loss          | -3.91e+06 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -4.97e+05 |
| Running Env Steps   | 3980000   |
| Running Forward KL  | 6.49      |
| Running Reverse KL  | 3.12      |
| Running Update Time | 796       |
-----------------------------------
--2024-08-12 18:24:40.334319 UTC---
| Itration            | 797       |
| PAGAR Loss          | 1.42e+05  |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -7.34e+05 |
| Running Env Steps   | 3985000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 3.46      |
| Running Update Time | 797       |
-----------------------------------
--2024-08-12 18:27:08.995698 UTC---
| Itration            | 798       |
| PAGAR Loss          | 6.33e+05  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -8.37e+05 |
| Running Env Steps   | 3990000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 3.1       |
| Running Update Time | 798       |
-----------------------------------
--2024-08-12 18:29:40.262265 UTC---
| Itration            | 799       |
| PAGAR Loss          | 1.89e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -7.71e+05 |
| Running Env Steps   | 3995000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 14.9      |
| Running Update Time | 799       |
-----------------------------------
--2024-08-12 18:32:08.679141 UTC---
| Itration            | 800       |
| PAGAR Loss          | -1.21e+05 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -5.26e+05 |
| Running Env Steps   | 4000000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 2.83      |
| Running Update Time | 800       |
-----------------------------------
--2024-08-12 18:34:39.331224 UTC---
| Itration            | 801       |
| PAGAR Loss          | 5.07e+05  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 4005000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.27      |
| Running Update Time | 801       |
-----------------------------------
--2024-08-12 18:37:08.707226 UTC---
| Itration            | 802       |
| PAGAR Loss          | -2.3e+05  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.74e+05 |
| Running Env Steps   | 4010000   |
| Running Forward KL  | 6.73      |
| Running Reverse KL  | 34.4      |
| Running Update Time | 802       |
-----------------------------------
--2024-08-12 18:39:38.640933 UTC---
| Itration            | 803       |
| PAGAR Loss          | -2.14e+06 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -2.71e+06 |
| Running Env Steps   | 4015000   |
| Running Forward KL  | 6.57      |
| Running Reverse KL  | 40        |
| Running Update Time | 803       |
-----------------------------------
--2024-08-12 18:42:09.862981 UTC---
| Itration            | 804       |
| PAGAR Loss          | 5.72e+05  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 4020000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 2.77      |
| Running Update Time | 804       |
-----------------------------------
--2024-08-12 18:44:39.576736 UTC--
| Itration            | 805      |
| PAGAR Loss          | 2.28e+05 |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.11e+03 |
| Reward Loss         | -6e+05   |
| Running Env Steps   | 4025000  |
| Running Forward KL  | 6.7      |
| Running Reverse KL  | 3.41     |
| Running Update Time | 805      |
----------------------------------
--2024-08-12 18:47:10.275946 UTC---
| Itration            | 806       |
| PAGAR Loss          | 6.51e+04  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -7.97e+05 |
| Running Env Steps   | 4030000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 806       |
-----------------------------------
--2024-08-12 18:49:36.926505 UTC---
| Itration            | 807       |
| PAGAR Loss          | -1.33e+05 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 4035000   |
| Running Forward KL  | 6.74      |
| Running Reverse KL  | 32.3      |
| Running Update Time | 807       |
-----------------------------------
--2024-08-12 18:52:06.854422 UTC---
| Itration            | 808       |
| PAGAR Loss          | 1.76e+05  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -9.41e+05 |
| Running Env Steps   | 4040000   |
| Running Forward KL  | 6.79      |
| Running Reverse KL  | 7.24      |
| Running Update Time | 808       |
-----------------------------------
--2024-08-12 18:54:37.670935 UTC---
| Itration            | 809       |
| PAGAR Loss          | 2.8e+04   |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -7.05e+05 |
| Running Env Steps   | 4045000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 3.01      |
| Running Update Time | 809       |
-----------------------------------
--2024-08-12 18:57:06.688204 UTC---
| Itration            | 810       |
| PAGAR Loss          | 7.93e+04  |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.55e+06 |
| Running Env Steps   | 4050000   |
| Running Forward KL  | 6.86      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 810       |
-----------------------------------
--2024-08-12 18:59:35.090600 UTC---
| Itration            | 811       |
| PAGAR Loss          | -3.75e+05 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.25e+05 |
| Running Env Steps   | 4055000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 17.2      |
| Running Update Time | 811       |
-----------------------------------
--2024-08-12 19:02:03.554550 UTC---
| Itration            | 812       |
| PAGAR Loss          | -1.27e+06 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -5.52e+05 |
| Running Env Steps   | 4060000   |
| Running Forward KL  | 7.33      |
| Running Reverse KL  | 29.6      |
| Running Update Time | 812       |
-----------------------------------
--2024-08-12 19:04:33.054556 UTC---
| Itration            | 813       |
| PAGAR Loss          | 1.08e+05  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 4065000   |
| Running Forward KL  | 7.04      |
| Running Reverse KL  | 3.7       |
| Running Update Time | 813       |
-----------------------------------
--2024-08-12 19:07:03.080649 UTC---
| Itration            | 814       |
| PAGAR Loss          | -1.45e+07 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -2.14e+05 |
| Running Env Steps   | 4070000   |
| Running Forward KL  | 7.24      |
| Running Reverse KL  | 22.4      |
| Running Update Time | 814       |
-----------------------------------
--2024-08-12 19:09:32.943323 UTC---
| Itration            | 815       |
| PAGAR Loss          | -6.33e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -2.56e+06 |
| Running Env Steps   | 4075000   |
| Running Forward KL  | 6.97      |
| Running Reverse KL  | 35.9      |
| Running Update Time | 815       |
-----------------------------------
--2024-08-12 19:12:04.863353 UTC---
| Itration            | 816       |
| PAGAR Loss          | 2.09e+04  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.63e+05 |
| Running Env Steps   | 4080000   |
| Running Forward KL  | 6.39      |
| Running Reverse KL  | 3.28      |
| Running Update Time | 816       |
-----------------------------------
--2024-08-12 19:14:29.382524 UTC---
| Itration            | 817       |
| PAGAR Loss          | -4.92e+04 |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -6.45e+04 |
| Running Env Steps   | 4085000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 15.8      |
| Running Update Time | 817       |
-----------------------------------
--2024-08-12 19:16:59.182452 UTC---
| Itration            | 818       |
| PAGAR Loss          | 2.53e+06  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 4090000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 3.86      |
| Running Update Time | 818       |
-----------------------------------
--2024-08-12 19:19:26.429050 UTC---
| Itration            | 819       |
| PAGAR Loss          | -3.03e+05 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -6.52e+05 |
| Running Env Steps   | 4095000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 819       |
-----------------------------------
--2024-08-12 19:21:55.354931 UTC---
| Itration            | 820       |
| PAGAR Loss          | 1.01e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -7.46e+05 |
| Running Env Steps   | 4100000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 3.25      |
| Running Update Time | 820       |
-----------------------------------
--2024-08-12 19:24:24.308102 UTC---
| Itration            | 821       |
| PAGAR Loss          | -5.29e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -7.69e+05 |
| Running Env Steps   | 4105000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 15.7      |
| Running Update Time | 821       |
-----------------------------------
--2024-08-12 19:26:50.868540 UTC---
| Itration            | 822       |
| PAGAR Loss          | 2.49e+06  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -6.16e+05 |
| Running Env Steps   | 4110000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 822       |
-----------------------------------
--2024-08-12 19:29:20.367679 UTC---
| Itration            | 823       |
| PAGAR Loss          | -2.08e+04 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -7.63e+05 |
| Running Env Steps   | 4115000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 3.53      |
| Running Update Time | 823       |
-----------------------------------
--2024-08-12 19:31:51.456630 UTC---
| Itration            | 824       |
| PAGAR Loss          | -5.9e+07  |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.73e+06 |
| Running Env Steps   | 4120000   |
| Running Forward KL  | 7.59      |
| Running Reverse KL  | 33.4      |
| Running Update Time | 824       |
-----------------------------------
--2024-08-12 19:34:19.614173 UTC---
| Itration            | 825       |
| PAGAR Loss          | -2.04e+06 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | -1.58e+06 |
| Running Env Steps   | 4125000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 31.2      |
| Running Update Time | 825       |
-----------------------------------
--2024-08-12 19:36:48.354452 UTC---
| Itration            | 826       |
| PAGAR Loss          | -9.12e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -7.14e+05 |
| Running Env Steps   | 4130000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 4.11      |
| Running Update Time | 826       |
-----------------------------------
--2024-08-12 19:39:10.990972 UTC---
| Itration            | 827       |
| PAGAR Loss          | -1.83e+06 |
| Real Det Return     | 4.26e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 4135000   |
| Running Forward KL  | 7.62      |
| Running Reverse KL  | 69.7      |
| Running Update Time | 827       |
-----------------------------------
--2024-08-12 19:41:41.726368 UTC---
| Itration            | 828       |
| PAGAR Loss          | -7.53e+05 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -5.09e+05 |
| Running Env Steps   | 4140000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 20.7      |
| Running Update Time | 828       |
-----------------------------------
--2024-08-12 19:44:11.942858 UTC---
| Itration            | 829       |
| PAGAR Loss          | -7.19e+05 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -4.54e+05 |
| Running Env Steps   | 4145000   |
| Running Forward KL  | 7.33      |
| Running Reverse KL  | 27.5      |
| Running Update Time | 829       |
-----------------------------------
--2024-08-12 19:46:37.327294 UTC---
| Itration            | 830       |
| PAGAR Loss          | -2.7e+06  |
| Real Det Return     | 5.07e+03  |
| Real Sto Return     | 4.33e+03  |
| Reward Loss         | -8.42e+05 |
| Running Env Steps   | 4150000   |
| Running Forward KL  | 7.76      |
| Running Reverse KL  | 40.1      |
| Running Update Time | 830       |
-----------------------------------
--2024-08-12 19:49:06.419232 UTC---
| Itration            | 831       |
| PAGAR Loss          | -9.75e+04 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 4155000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 30.5      |
| Running Update Time | 831       |
-----------------------------------
--2024-08-12 19:51:34.275081 UTC---
| Itration            | 832       |
| PAGAR Loss          | -5.37e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -2.11e+06 |
| Running Env Steps   | 4160000   |
| Running Forward KL  | 7.88      |
| Running Reverse KL  | 69.4      |
| Running Update Time | 832       |
-----------------------------------
--2024-08-12 19:54:02.079264 UTC---
| Itration            | 833       |
| PAGAR Loss          | -2.17e+05 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -6e+05    |
| Running Env Steps   | 4165000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 13.4      |
| Running Update Time | 833       |
-----------------------------------
--2024-08-12 19:56:29.793369 UTC---
| Itration            | 834       |
| PAGAR Loss          | 1.59e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 4170000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 3.81      |
| Running Update Time | 834       |
-----------------------------------
--2024-08-12 19:59:00.550936 UTC---
| Itration            | 835       |
| PAGAR Loss          | -4.39e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4175000   |
| Running Forward KL  | 7.82      |
| Running Reverse KL  | 69.6      |
| Running Update Time | 835       |
-----------------------------------
--2024-08-12 20:01:31.398538 UTC---
| Itration            | 836       |
| PAGAR Loss          | 2.2e+05   |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 4180000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 3.55      |
| Running Update Time | 836       |
-----------------------------------
--2024-08-12 20:04:00.475373 UTC---
| Itration            | 837       |
| PAGAR Loss          | -1.63e+06 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -6.51e+05 |
| Running Env Steps   | 4185000   |
| Running Forward KL  | 7.77      |
| Running Reverse KL  | 24.7      |
| Running Update Time | 837       |
-----------------------------------
--2024-08-12 20:06:30.467691 UTC---
| Itration            | 838       |
| PAGAR Loss          | -1.02e+06 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -7.27e+05 |
| Running Env Steps   | 4190000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 32.1      |
| Running Update Time | 838       |
-----------------------------------
--2024-08-12 20:08:56.240492 UTC---
| Itration            | 839       |
| PAGAR Loss          | -1.11e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 4195000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 3.33      |
| Running Update Time | 839       |
-----------------------------------
--2024-08-12 20:11:27.022303 UTC---
| Itration            | 840       |
| PAGAR Loss          | -1.55e+05 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.95e+06 |
| Running Env Steps   | 4200000   |
| Running Forward KL  | 7.61      |
| Running Reverse KL  | 49.8      |
| Running Update Time | 840       |
-----------------------------------
--2024-08-12 20:13:55.127069 UTC---
| Itration            | 841       |
| PAGAR Loss          | 1.51e+07  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -9.22e+05 |
| Running Env Steps   | 4205000   |
| Running Forward KL  | 7.06      |
| Running Reverse KL  | 20        |
| Running Update Time | 841       |
-----------------------------------
--2024-08-12 20:16:24.296554 UTC---
| Itration            | 842       |
| PAGAR Loss          | 2.53e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -9.98e+05 |
| Running Env Steps   | 4210000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 3.6       |
| Running Update Time | 842       |
-----------------------------------
--2024-08-12 20:18:54.827560 UTC---
| Itration            | 843       |
| PAGAR Loss          | -98.8     |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -9.95e+05 |
| Running Env Steps   | 4215000   |
| Running Forward KL  | 7.22      |
| Running Reverse KL  | 3.71      |
| Running Update Time | 843       |
-----------------------------------
--2024-08-12 20:21:21.504549 UTC---
| Itration            | 844       |
| PAGAR Loss          | -1.36e+06 |
| Real Det Return     | 4.42e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -5.83e+05 |
| Running Env Steps   | 4220000   |
| Running Forward KL  | 7.83      |
| Running Reverse KL  | 61.6      |
| Running Update Time | 844       |
-----------------------------------
--2024-08-12 20:23:49.290802 UTC---
| Itration            | 845       |
| PAGAR Loss          | -6.66e+05 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -7.34e+05 |
| Running Env Steps   | 4225000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 39.3      |
| Running Update Time | 845       |
-----------------------------------
--2024-08-12 20:26:12.049227 UTC---
| Itration            | 846       |
| PAGAR Loss          | -4.63e+06 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.43e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 4230000   |
| Running Forward KL  | 7.61      |
| Running Reverse KL  | 4.89      |
| Running Update Time | 846       |
-----------------------------------
--2024-08-12 20:28:41.800227 UTC---
| Itration            | 847       |
| PAGAR Loss          | -1.99e+05 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -2.41e+06 |
| Running Env Steps   | 4235000   |
| Running Forward KL  | 8.02      |
| Running Reverse KL  | 42.5      |
| Running Update Time | 847       |
-----------------------------------
--2024-08-12 20:31:10.623889 UTC---
| Itration            | 848       |
| PAGAR Loss          | 2.55e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -8.06e+05 |
| Running Env Steps   | 4240000   |
| Running Forward KL  | 7.68      |
| Running Reverse KL  | 3.99      |
| Running Update Time | 848       |
-----------------------------------
--2024-08-12 20:33:39.940968 UTC---
| Itration            | 849       |
| PAGAR Loss          | -1.66e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 4245000   |
| Running Forward KL  | 7.64      |
| Running Reverse KL  | 23.3      |
| Running Update Time | 849       |
-----------------------------------
--2024-08-12 20:36:09.109092 UTC---
| Itration            | 850       |
| PAGAR Loss          | 1.73e+05  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 4250000   |
| Running Forward KL  | 7.82      |
| Running Reverse KL  | 4.3       |
| Running Update Time | 850       |
-----------------------------------
--2024-08-12 20:38:37.064328 UTC---
| Itration            | 851       |
| PAGAR Loss          | 5.89e+04  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -9.38e+05 |
| Running Env Steps   | 4255000   |
| Running Forward KL  | 7.37      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 851       |
-----------------------------------
--2024-08-12 20:41:05.722673 UTC---
| Itration            | 852       |
| PAGAR Loss          | -1.12e+06 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.5e+03   |
| Reward Loss         | -5.25e+06 |
| Running Env Steps   | 4260000   |
| Running Forward KL  | 9.16      |
| Running Reverse KL  | 120       |
| Running Update Time | 852       |
-----------------------------------
--2024-08-12 20:43:31.985003 UTC---
| Itration            | 853       |
| PAGAR Loss          | 1.49e+06  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 4265000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 38.9      |
| Running Update Time | 853       |
-----------------------------------
--2024-08-12 20:46:02.460617 UTC---
| Itration            | 854       |
| PAGAR Loss          | 3.33e+04  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 4270000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 4.35      |
| Running Update Time | 854       |
-----------------------------------
--2024-08-12 20:48:30.947981 UTC---
| Itration            | 855       |
| PAGAR Loss          | 1.11e+06  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -8.05e+05 |
| Running Env Steps   | 4275000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 855       |
-----------------------------------
--2024-08-12 20:50:58.765218 UTC---
| Itration            | 856       |
| PAGAR Loss          | -1.26e+06 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 4280000   |
| Running Forward KL  | 7.97      |
| Running Reverse KL  | 37.1      |
| Running Update Time | 856       |
-----------------------------------
--2024-08-12 20:53:28.758984 UTC---
| Itration            | 857       |
| PAGAR Loss          | 1.12e+07  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 4285000   |
| Running Forward KL  | 7.23      |
| Running Reverse KL  | 3.55      |
| Running Update Time | 857       |
-----------------------------------
--2024-08-12 20:55:57.976962 UTC---
| Itration            | 858       |
| PAGAR Loss          | -3.35e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 4290000   |
| Running Forward KL  | 7.33      |
| Running Reverse KL  | 3.82      |
| Running Update Time | 858       |
-----------------------------------
--2024-08-12 20:58:28.511680 UTC---
| Itration            | 859       |
| PAGAR Loss          | -1.83e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 4295000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 14.1      |
| Running Update Time | 859       |
-----------------------------------
--2024-08-12 21:00:55.867893 UTC---
| Itration            | 860       |
| PAGAR Loss          | 2.54e+05  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -9.09e+05 |
| Running Env Steps   | 4300000   |
| Running Forward KL  | 7.43      |
| Running Reverse KL  | 10.8      |
| Running Update Time | 860       |
-----------------------------------
--2024-08-12 21:03:23.645219 UTC---
| Itration            | 861       |
| PAGAR Loss          | -1.83e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -9.64e+05 |
| Running Env Steps   | 4305000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 6.29      |
| Running Update Time | 861       |
-----------------------------------
--2024-08-12 21:05:46.132577 UTC---
| Itration            | 862       |
| PAGAR Loss          | -5.23e+05 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 3.77e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 4310000   |
| Running Forward KL  | 7.75      |
| Running Reverse KL  | 9.41      |
| Running Update Time | 862       |
-----------------------------------
--2024-08-12 21:08:15.515433 UTC---
| Itration            | 863       |
| PAGAR Loss          | -8.31e+05 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 4315000   |
| Running Forward KL  | 7.33      |
| Running Reverse KL  | 4.35      |
| Running Update Time | 863       |
-----------------------------------
--2024-08-12 21:10:43.881864 UTC---
| Itration            | 864       |
| PAGAR Loss          | 1.29e+05  |
| Real Det Return     | 4.88e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4320000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 864       |
-----------------------------------
--2024-08-12 21:13:12.778769 UTC---
| Itration            | 865       |
| PAGAR Loss          | -2.15e+04 |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 4325000   |
| Running Forward KL  | 7.52      |
| Running Reverse KL  | 3.9       |
| Running Update Time | 865       |
-----------------------------------
--2024-08-12 21:15:41.149649 UTC---
| Itration            | 866       |
| PAGAR Loss          | 4.77e+05  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -3.46e+06 |
| Running Env Steps   | 4330000   |
| Running Forward KL  | 7.67      |
| Running Reverse KL  | 101       |
| Running Update Time | 866       |
-----------------------------------
--2024-08-12 21:18:06.154779 UTC---
| Itration            | 867       |
| PAGAR Loss          | -2.46e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 4335000   |
| Running Forward KL  | 7.42      |
| Running Reverse KL  | 7.9       |
| Running Update Time | 867       |
-----------------------------------
--2024-08-12 21:20:37.417842 UTC---
| Itration            | 868       |
| PAGAR Loss          | -4.07e+04 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 4340000   |
| Running Forward KL  | 6.7       |
| Running Reverse KL  | 4.09      |
| Running Update Time | 868       |
-----------------------------------
--2024-08-12 21:23:07.318172 UTC---
| Itration            | 869       |
| PAGAR Loss          | -4.31e+04 |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.16e+06 |
| Running Env Steps   | 4345000   |
| Running Forward KL  | 8.32      |
| Running Reverse KL  | 60        |
| Running Update Time | 869       |
-----------------------------------
--2024-08-12 21:25:37.246951 UTC---
| Itration            | 870       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.07e+06 |
| Running Env Steps   | 4350000   |
| Running Forward KL  | 7.32      |
| Running Reverse KL  | 14        |
| Running Update Time | 870       |
-----------------------------------
--2024-08-12 21:28:06.793677 UTC---
| Itration            | 871       |
| PAGAR Loss          | 1.11e+05  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4355000   |
| Running Forward KL  | 7.39      |
| Running Reverse KL  | 4.62      |
| Running Update Time | 871       |
-----------------------------------
--2024-08-12 21:30:33.694305 UTC---
| Itration            | 872       |
| PAGAR Loss          | -1.03e+05 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 4360000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 872       |
-----------------------------------
--2024-08-12 21:33:05.887435 UTC---
| Itration            | 873       |
| PAGAR Loss          | -4.18e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 4365000   |
| Running Forward KL  | 7.02      |
| Running Reverse KL  | 3.98      |
| Running Update Time | 873       |
-----------------------------------
--2024-08-12 21:35:35.639918 UTC---
| Itration            | 874       |
| PAGAR Loss          | 7.58e+05  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.84e+06 |
| Running Env Steps   | 4370000   |
| Running Forward KL  | 7.1       |
| Running Reverse KL  | 4.06      |
| Running Update Time | 874       |
-----------------------------------
--2024-08-12 21:38:05.208362 UTC---
| Itration            | 875       |
| PAGAR Loss          | -8.99e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -7e+05    |
| Running Env Steps   | 4375000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 7.67      |
| Running Update Time | 875       |
-----------------------------------
--2024-08-12 21:40:32.832712 UTC---
| Itration            | 876       |
| PAGAR Loss          | -6.06e+04 |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 4380000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 876       |
-----------------------------------
--2024-08-12 21:43:01.434600 UTC---
| Itration            | 877       |
| PAGAR Loss          | -4.98e+04 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 4385000   |
| Running Forward KL  | 7.21      |
| Running Reverse KL  | 4.41      |
| Running Update Time | 877       |
-----------------------------------
--2024-08-12 21:45:29.428167 UTC---
| Itration            | 878       |
| PAGAR Loss          | 3.38e+04  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 4390000   |
| Running Forward KL  | 6.81      |
| Running Reverse KL  | 4.12      |
| Running Update Time | 878       |
-----------------------------------
--2024-08-12 21:47:46.302627 UTC---
| Itration            | 879       |
| PAGAR Loss          | -5.35e+05 |
| Real Det Return     | 1.48e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -6.91e+05 |
| Running Env Steps   | 4395000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 44.4      |
| Running Update Time | 879       |
-----------------------------------
--2024-08-12 21:50:14.117031 UTC---
| Itration            | 880       |
| PAGAR Loss          | -1.18e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 4400000   |
| Running Forward KL  | 7.31      |
| Running Reverse KL  | 5.24      |
| Running Update Time | 880       |
-----------------------------------
--2024-08-12 21:52:43.380750 UTC---
| Itration            | 881       |
| PAGAR Loss          | -2.62e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -2.07e+06 |
| Running Env Steps   | 4405000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 34.4      |
| Running Update Time | 881       |
-----------------------------------
--2024-08-12 21:55:10.218608 UTC---
| Itration            | 882       |
| PAGAR Loss          | 1.24e+03  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 4410000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 882       |
-----------------------------------
--2024-08-12 21:57:36.379595 UTC---
| Itration            | 883       |
| PAGAR Loss          | 1.71e+06  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -9.77e+05 |
| Running Env Steps   | 4415000   |
| Running Forward KL  | 7.4       |
| Running Reverse KL  | 4.5       |
| Running Update Time | 883       |
-----------------------------------
--2024-08-12 22:00:06.318844 UTC---
| Itration            | 884       |
| PAGAR Loss          | 2.08e+04  |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.75e+06 |
| Running Env Steps   | 4420000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 4.36      |
| Running Update Time | 884       |
-----------------------------------
--2024-08-12 22:02:36.255518 UTC---
| Itration            | 885       |
| PAGAR Loss          | 5.8e+05   |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -2.61e+06 |
| Running Env Steps   | 4425000   |
| Running Forward KL  | 7.53      |
| Running Reverse KL  | 42.4      |
| Running Update Time | 885       |
-----------------------------------
--2024-08-12 22:05:06.953041 UTC---
| Itration            | 886       |
| PAGAR Loss          | 8.36e+04  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.51e+06 |
| Running Env Steps   | 4430000   |
| Running Forward KL  | 7.57      |
| Running Reverse KL  | 4.14      |
| Running Update Time | 886       |
-----------------------------------
--2024-08-12 22:07:20.814122 UTC---
| Itration            | 887       |
| PAGAR Loss          | nan       |
| Real Det Return     | 934       |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -4.15e+06 |
| Running Env Steps   | 4435000   |
| Running Forward KL  | 8.23      |
| Running Reverse KL  | 135       |
| Running Update Time | 887       |
-----------------------------------
--2024-08-12 22:09:49.093540 UTC---
| Itration            | 888       |
| PAGAR Loss          | -2.57e+05 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -9.38e+05 |
| Running Env Steps   | 4440000   |
| Running Forward KL  | 7.21      |
| Running Reverse KL  | 17.2      |
| Running Update Time | 888       |
-----------------------------------
--2024-08-12 22:12:19.002185 UTC---
| Itration            | 889       |
| PAGAR Loss          | 3.3e+04   |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 4445000   |
| Running Forward KL  | 6.59      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 889       |
-----------------------------------
--2024-08-12 22:14:42.686964 UTC--
| Itration            | 890      |
| PAGAR Loss          | 2.43e+04 |
| Real Det Return     | 4.8e+03  |
| Real Sto Return     | 4.68e+03 |
| Reward Loss         | -1.8e+06 |
| Running Env Steps   | 4450000  |
| Running Forward KL  | 6.83     |
| Running Reverse KL  | 4.22     |
| Running Update Time | 890      |
----------------------------------
--2024-08-12 22:17:13.641808 UTC---
| Itration            | 891       |
| PAGAR Loss          | 2.18e+04  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.82e+06 |
| Running Env Steps   | 4455000   |
| Running Forward KL  | 7.04      |
| Running Reverse KL  | 4.3       |
| Running Update Time | 891       |
-----------------------------------
--2024-08-12 22:19:35.127706 UTC---
| Itration            | 892       |
| PAGAR Loss          | -3.65e+05 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 4460000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 6.92      |
| Running Update Time | 892       |
-----------------------------------
--2024-08-12 22:22:05.513465 UTC---
| Itration            | 893       |
| PAGAR Loss          | -8.39e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -1.19e+06 |
| Running Env Steps   | 4465000   |
| Running Forward KL  | 7.14      |
| Running Reverse KL  | 3.8       |
| Running Update Time | 893       |
-----------------------------------
--2024-08-12 22:24:33.262305 UTC---
| Itration            | 894       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 4470000   |
| Running Forward KL  | 8.31      |
| Running Reverse KL  | 84.4      |
| Running Update Time | 894       |
-----------------------------------
--2024-08-12 22:27:02.694295 UTC---
| Itration            | 895       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4475000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 895       |
-----------------------------------
--2024-08-12 22:29:34.215223 UTC---
| Itration            | 896       |
| PAGAR Loss          | 8.42e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 4480000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 4.05      |
| Running Update Time | 896       |
-----------------------------------
--2024-08-12 22:32:01.111491 UTC---
| Itration            | 897       |
| PAGAR Loss          | -1.53e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.57e+06 |
| Running Env Steps   | 4485000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 37.3      |
| Running Update Time | 897       |
-----------------------------------
--2024-08-12 22:34:31.140327 UTC---
| Itration            | 898       |
| PAGAR Loss          | -2.75e+05 |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -2.14e+06 |
| Running Env Steps   | 4490000   |
| Running Forward KL  | 7.32      |
| Running Reverse KL  | 33.4      |
| Running Update Time | 898       |
-----------------------------------
--2024-08-12 22:36:59.119152 UTC---
| Itration            | 899       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 4495000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 4.11      |
| Running Update Time | 899       |
-----------------------------------
--2024-08-12 22:39:26.364149 UTC---
| Itration            | 900       |
| PAGAR Loss          | -7.96e+05 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -2.01e+06 |
| Running Env Steps   | 4500000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 66.3      |
| Running Update Time | 900       |
-----------------------------------
--2024-08-12 22:41:53.634066 UTC---
| Itration            | 901       |
| PAGAR Loss          | -1.06e+07 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 4505000   |
| Running Forward KL  | 7.15      |
| Running Reverse KL  | 60.1      |
| Running Update Time | 901       |
-----------------------------------
--2024-08-12 22:44:23.031945 UTC---
| Itration            | 902       |
| PAGAR Loss          | -1.12e+06 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -6.36e+05 |
| Running Env Steps   | 4510000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 902       |
-----------------------------------
--2024-08-12 22:46:54.670555 UTC---
| Itration            | 903       |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.86e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -3.32e+06 |
| Running Env Steps   | 4515000   |
| Running Forward KL  | 7.94      |
| Running Reverse KL  | 4.97      |
| Running Update Time | 903       |
-----------------------------------
--2024-08-12 22:49:23.151055 UTC---
| Itration            | 904       |
| PAGAR Loss          | -1.54e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -9.53e+05 |
| Running Env Steps   | 4520000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 32.2      |
| Running Update Time | 904       |
-----------------------------------
--2024-08-12 22:51:52.761862 UTC---
| Itration            | 905       |
| PAGAR Loss          | -4.64e+05 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.13e+06 |
| Running Env Steps   | 4525000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 19.5      |
| Running Update Time | 905       |
-----------------------------------
--2024-08-12 22:54:19.009625 UTC---
| Itration            | 906       |
| PAGAR Loss          | -3.03e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -3.32e+06 |
| Running Env Steps   | 4530000   |
| Running Forward KL  | 7.43      |
| Running Reverse KL  | 78.3      |
| Running Update Time | 906       |
-----------------------------------
--2024-08-12 22:56:48.576584 UTC---
| Itration            | 907       |
| PAGAR Loss          | -1.28e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -9.43e+05 |
| Running Env Steps   | 4535000   |
| Running Forward KL  | 6.94      |
| Running Reverse KL  | 22.9      |
| Running Update Time | 907       |
-----------------------------------
--2024-08-12 22:59:18.374562 UTC---
| Itration            | 908       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 4540000   |
| Running Forward KL  | 7.25      |
| Running Reverse KL  | 14.6      |
| Running Update Time | 908       |
-----------------------------------
--2024-08-12 23:01:48.158303 UTC---
| Itration            | 909       |
| PAGAR Loss          | -3.72e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.83e+06 |
| Running Env Steps   | 4545000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 39.3      |
| Running Update Time | 909       |
-----------------------------------
--2024-08-12 23:04:14.489783 UTC---
| Itration            | 910       |
| PAGAR Loss          | -8.15e+04 |
| Real Det Return     | 4.71e+03  |
| Real Sto Return     | 4.11e+03  |
| Reward Loss         | -2.33e+06 |
| Running Env Steps   | 4550000   |
| Running Forward KL  | 7.9       |
| Running Reverse KL  | 5.73      |
| Running Update Time | 910       |
-----------------------------------
--2024-08-12 23:06:43.718076 UTC---
| Itration            | 911       |
| PAGAR Loss          | 1.61e+05  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.71e+06 |
| Running Env Steps   | 4555000   |
| Running Forward KL  | 7.27      |
| Running Reverse KL  | 4.48      |
| Running Update Time | 911       |
-----------------------------------
--2024-08-12 23:09:12.256830 UTC---
| Itration            | 912       |
| PAGAR Loss          | -1.42e+05 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -2.97e+06 |
| Running Env Steps   | 4560000   |
| Running Forward KL  | 7.66      |
| Running Reverse KL  | 72        |
| Running Update Time | 912       |
-----------------------------------
--2024-08-12 23:11:37.810748 UTC---
| Itration            | 913       |
| PAGAR Loss          | -5.03e+05 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -8.16e+05 |
| Running Env Steps   | 4565000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 30        |
| Running Update Time | 913       |
-----------------------------------
--2024-08-12 23:13:57.736354 UTC---
| Itration            | 914       |
| PAGAR Loss          | -7.74e+04 |
| Real Det Return     | 2.99e+03  |
| Real Sto Return     | 4.05e+03  |
| Reward Loss         | -1.56e+06 |
| Running Env Steps   | 4570000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 4.81      |
| Running Update Time | 914       |
-----------------------------------
--2024-08-12 23:16:13.841454 UTC---
| Itration            | 915       |
| PAGAR Loss          | 1.25e+05  |
| Real Det Return     | 2.62e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 4575000   |
| Running Forward KL  | 7.24      |
| Running Reverse KL  | 4.87      |
| Running Update Time | 915       |
-----------------------------------
--2024-08-12 23:18:44.377505 UTC---
| Itration            | 916       |
| PAGAR Loss          | 1.11e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 4580000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 4.85      |
| Running Update Time | 916       |
-----------------------------------
--2024-08-12 23:21:09.712838 UTC---
| Itration            | 917       |
| PAGAR Loss          | -2.11e+08 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -2.07e+06 |
| Running Env Steps   | 4585000   |
| Running Forward KL  | 7.92      |
| Running Reverse KL  | 75.2      |
| Running Update Time | 917       |
-----------------------------------
--2024-08-12 23:23:37.042702 UTC---
| Itration            | 918       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 4590000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 29.3      |
| Running Update Time | 918       |
-----------------------------------
--2024-08-12 23:26:05.903304 UTC---
| Itration            | 919       |
| PAGAR Loss          | -2.22e+03 |
| Real Det Return     | 5.09e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -2.17e+06 |
| Running Env Steps   | 4595000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 5.07      |
| Running Update Time | 919       |
-----------------------------------
--2024-08-12 23:28:32.571135 UTC---
| Itration            | 920       |
| PAGAR Loss          | -1.49e+05 |
| Real Det Return     | 4.51e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -8.8e+05  |
| Running Env Steps   | 4600000   |
| Running Forward KL  | 7.38      |
| Running Reverse KL  | 39.2      |
| Running Update Time | 920       |
-----------------------------------
--2024-08-12 23:31:03.118098 UTC---
| Itration            | 921       |
| PAGAR Loss          | 26.4      |
| Real Det Return     | 5.08e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -2.25e+06 |
| Running Env Steps   | 4605000   |
| Running Forward KL  | 8.1       |
| Running Reverse KL  | 5.87      |
| Running Update Time | 921       |
-----------------------------------
--2024-08-12 23:33:30.588154 UTC---
| Itration            | 922       |
| PAGAR Loss          | 5.5e+05   |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 4610000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 4.77      |
| Running Update Time | 922       |
-----------------------------------
--2024-08-12 23:36:00.207019 UTC---
| Itration            | 923       |
| PAGAR Loss          | -4.47e+04 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 4615000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 4.78      |
| Running Update Time | 923       |
-----------------------------------
--2024-08-12 23:38:28.286342 UTC---
| Itration            | 924       |
| PAGAR Loss          | -1.51e+05 |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.5e+06  |
| Running Env Steps   | 4620000   |
| Running Forward KL  | 6.92      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 924       |
-----------------------------------
--2024-08-12 23:40:57.665307 UTC---
| Itration            | 925       |
| PAGAR Loss          | -2.71e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -9.14e+05 |
| Running Env Steps   | 4625000   |
| Running Forward KL  | 7.29      |
| Running Reverse KL  | 6.27      |
| Running Update Time | 925       |
-----------------------------------
--2024-08-12 23:43:26.288637 UTC---
| Itration            | 926       |
| PAGAR Loss          | 7.32e+03  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 4630000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 4.69      |
| Running Update Time | 926       |
-----------------------------------
--2024-08-12 23:45:54.588228 UTC---
| Itration            | 927       |
| PAGAR Loss          | -1.1e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 4635000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 4.77      |
| Running Update Time | 927       |
-----------------------------------
--2024-08-12 23:48:18.675940 UTC---
| Itration            | 928       |
| PAGAR Loss          | -8.14e+06 |
| Real Det Return     | 4.9e+03   |
| Real Sto Return     | 3.98e+03  |
| Reward Loss         | -4.35e+05 |
| Running Env Steps   | 4640000   |
| Running Forward KL  | 8.47      |
| Running Reverse KL  | 119       |
| Running Update Time | 928       |
-----------------------------------
--2024-08-12 23:50:46.162534 UTC---
| Itration            | 929       |
| PAGAR Loss          | 4.17e+04  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 4645000   |
| Running Forward KL  | 7.27      |
| Running Reverse KL  | 5.01      |
| Running Update Time | 929       |
-----------------------------------
--2024-08-12 23:53:15.395144 UTC---
| Itration            | 930       |
| PAGAR Loss          | -4.68e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 4650000   |
| Running Forward KL  | 7.34      |
| Running Reverse KL  | 4.65      |
| Running Update Time | 930       |
-----------------------------------
--2024-08-12 23:55:42.849274 UTC---
| Itration            | 931       |
| PAGAR Loss          | 3.54e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -1.02e+06 |
| Running Env Steps   | 4655000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 931       |
-----------------------------------
--2024-08-12 23:58:11.915066 UTC---
| Itration            | 932       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 4660000   |
| Running Forward KL  | 7.13      |
| Running Reverse KL  | 8.81      |
| Running Update Time | 932       |
-----------------------------------
--2024-08-13 00:00:38.940342 UTC--
| Itration            | 933      |
| PAGAR Loss          | 2.87e+04 |
| Real Det Return     | 5.33e+03 |
| Real Sto Return     | 5.21e+03 |
| Reward Loss         | -1.5e+06 |
| Running Env Steps   | 4665000  |
| Running Forward KL  | 6.69     |
| Running Reverse KL  | 4.36     |
| Running Update Time | 933      |
----------------------------------
--2024-08-13 00:03:06.342166 UTC--
| Itration            | 934      |
| PAGAR Loss          | 4.43e+05 |
| Real Det Return     | 5.43e+03 |
| Real Sto Return     | 5.28e+03 |
| Reward Loss         | -1.2e+06 |
| Running Env Steps   | 4670000  |
| Running Forward KL  | 6.82     |
| Running Reverse KL  | 4.65     |
| Running Update Time | 934      |
----------------------------------
--2024-08-13 00:05:33.663051 UTC---
| Itration            | 935       |
| PAGAR Loss          | -4.95e+05 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 4675000   |
| Running Forward KL  | 7.12      |
| Running Reverse KL  | 7.5       |
| Running Update Time | 935       |
-----------------------------------
--2024-08-13 00:07:56.932861 UTC---
| Itration            | 936       |
| PAGAR Loss          | -4.2e+04  |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 4680000   |
| Running Forward KL  | 7         |
| Running Reverse KL  | 5         |
| Running Update Time | 936       |
-----------------------------------
--2024-08-13 00:10:25.827255 UTC---
| Itration            | 937       |
| PAGAR Loss          | 1.32e+07  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 4685000   |
| Running Forward KL  | 6.9       |
| Running Reverse KL  | 4.4       |
| Running Update Time | 937       |
-----------------------------------
--2024-08-13 00:12:48.523520 UTC---
| Itration            | 938       |
| PAGAR Loss          | -7.45e+04 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.64e+03  |
| Reward Loss         | -8.53e+05 |
| Running Env Steps   | 4690000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 4.42      |
| Running Update Time | 938       |
-----------------------------------
--2024-08-13 00:15:18.340454 UTC---
| Itration            | 939       |
| PAGAR Loss          | 1.43e+04  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 4695000   |
| Running Forward KL  | 6.97      |
| Running Reverse KL  | 4.5       |
| Running Update Time | 939       |
-----------------------------------
--2024-08-13 00:17:45.534411 UTC---
| Itration            | 940       |
| PAGAR Loss          | 5.13e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.6e+03   |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 4700000   |
| Running Forward KL  | 7.11      |
| Running Reverse KL  | 5.01      |
| Running Update Time | 940       |
-----------------------------------
--2024-08-13 00:20:13.396908 UTC---
| Itration            | 941       |
| PAGAR Loss          | -7.66e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 4705000   |
| Running Forward KL  | 7.55      |
| Running Reverse KL  | 26.6      |
| Running Update Time | 941       |
-----------------------------------
--2024-08-13 00:22:41.181760 UTC---
| Itration            | 942       |
| PAGAR Loss          | -1.29e+06 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.66e+06 |
| Running Env Steps   | 4710000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 38.4      |
| Running Update Time | 942       |
-----------------------------------
--2024-08-13 00:25:08.168947 UTC---
| Itration            | 943       |
| PAGAR Loss          | -6.25e+05 |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -7.43e+05 |
| Running Env Steps   | 4715000   |
| Running Forward KL  | 7.45      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 943       |
-----------------------------------
--2024-08-13 00:27:37.073890 UTC---
| Itration            | 944       |
| PAGAR Loss          | 4.82e+05  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 4720000   |
| Running Forward KL  | 7.23      |
| Running Reverse KL  | 4.45      |
| Running Update Time | 944       |
-----------------------------------
--2024-08-13 00:30:02.231282 UTC---
| Itration            | 945       |
| PAGAR Loss          | -6.83e+04 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 4725000   |
| Running Forward KL  | 7.3       |
| Running Reverse KL  | 5.02      |
| Running Update Time | 945       |
-----------------------------------
--2024-08-13 00:32:12.471453 UTC---
| Itration            | 946       |
| PAGAR Loss          | 3.75e+05  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 4730000   |
| Running Forward KL  | 7.5       |
| Running Reverse KL  | 5.38      |
| Running Update Time | 946       |
-----------------------------------
--2024-08-13 00:34:32.556702 UTC--
| Itration            | 947      |
| PAGAR Loss          | 7.87e+04 |
| Real Det Return     | 5.13e+03 |
| Real Sto Return     | 5.03e+03 |
| Reward Loss         | -2.2e+06 |
| Running Env Steps   | 4735000  |
| Running Forward KL  | 7.21     |
| Running Reverse KL  | 4.83     |
| Running Update Time | 947      |
----------------------------------
--2024-08-13 00:36:52.762038 UTC---
| Itration            | 948       |
| PAGAR Loss          | -1.9e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 4740000   |
| Running Forward KL  | 7.14      |
| Running Reverse KL  | 4.9       |
| Running Update Time | 948       |
-----------------------------------
--2024-08-13 00:39:12.492804 UTC---
| Itration            | 949       |
| PAGAR Loss          | 7.2e+05   |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 4745000   |
| Running Forward KL  | 7.27      |
| Running Reverse KL  | 5.12      |
| Running Update Time | 949       |
-----------------------------------
--2024-08-13 00:41:30.204223 UTC---
| Itration            | 950       |
| PAGAR Loss          | 1.84e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 4750000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 4.99      |
| Running Update Time | 950       |
-----------------------------------
--2024-08-13 00:43:48.511720 UTC---
| Itration            | 951       |
| PAGAR Loss          | 6.08e+04  |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 4.31e+03  |
| Reward Loss         | -3.24e+06 |
| Running Env Steps   | 4755000   |
| Running Forward KL  | 7.85      |
| Running Reverse KL  | 40.4      |
| Running Update Time | 951       |
-----------------------------------
--2024-08-13 00:46:08.214104 UTC---
| Itration            | 952       |
| PAGAR Loss          | 7.25e+05  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.45e+06 |
| Running Env Steps   | 4760000   |
| Running Forward KL  | 6.83      |
| Running Reverse KL  | 4.96      |
| Running Update Time | 952       |
-----------------------------------
--2024-08-13 00:48:27.791220 UTC---
| Itration            | 953       |
| PAGAR Loss          | 1.07e+05  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 4765000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 5.29      |
| Running Update Time | 953       |
-----------------------------------
--2024-08-13 00:50:46.923476 UTC---
| Itration            | 954       |
| PAGAR Loss          | -4.62e+07 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -9.64e+05 |
| Running Env Steps   | 4770000   |
| Running Forward KL  | 7.68      |
| Running Reverse KL  | 57.2      |
| Running Update Time | 954       |
-----------------------------------
--2024-08-13 00:53:04.164209 UTC---
| Itration            | 955       |
| PAGAR Loss          | -2.82e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 4775000   |
| Running Forward KL  | 7.19      |
| Running Reverse KL  | 5.29      |
| Running Update Time | 955       |
-----------------------------------
--2024-08-13 00:55:19.183043 UTC---
| Itration            | 956       |
| PAGAR Loss          | -2e+05    |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.64e+06 |
| Running Env Steps   | 4780000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 4.34      |
| Running Update Time | 956       |
-----------------------------------
--2024-08-13 00:57:35.720722 UTC---
| Itration            | 957       |
| PAGAR Loss          | -3.83e+04 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 4785000   |
| Running Forward KL  | 6.93      |
| Running Reverse KL  | 4.49      |
| Running Update Time | 957       |
-----------------------------------
--2024-08-13 00:59:50.148216 UTC--
| Itration            | 958      |
| PAGAR Loss          | 1.19e+05 |
| Real Det Return     | 5.43e+03 |
| Real Sto Return     | 4.77e+03 |
| Reward Loss         | -1.7e+06 |
| Running Env Steps   | 4790000  |
| Running Forward KL  | 7.55     |
| Running Reverse KL  | 47       |
| Running Update Time | 958      |
----------------------------------
--2024-08-13 01:02:05.058189 UTC---
| Itration            | 959       |
| PAGAR Loss          | 9.09e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 4795000   |
| Running Forward KL  | 6.97      |
| Running Reverse KL  | 5.17      |
| Running Update Time | 959       |
-----------------------------------
--2024-08-13 01:04:19.383607 UTC---
| Itration            | 960       |
| PAGAR Loss          | -6.58e+04 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.28e+06 |
| Running Env Steps   | 4800000   |
| Running Forward KL  | 6.84      |
| Running Reverse KL  | 4.66      |
| Running Update Time | 960       |
-----------------------------------
--2024-08-13 01:06:34.108169 UTC---
| Itration            | 961       |
| PAGAR Loss          | -1.36e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -9.24e+05 |
| Running Env Steps   | 4805000   |
| Running Forward KL  | 7.44      |
| Running Reverse KL  | 16.8      |
| Running Update Time | 961       |
-----------------------------------
--2024-08-13 01:08:44.302805 UTC---
| Itration            | 962       |
| PAGAR Loss          | -2.29e+05 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -1.26e+06 |
| Running Env Steps   | 4810000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 9.06      |
| Running Update Time | 962       |
-----------------------------------
--2024-08-13 01:10:32.326883 UTC---
| Itration            | 963       |
| PAGAR Loss          | 1.79e+05  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.33e+06 |
| Running Env Steps   | 4815000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 4.44      |
| Running Update Time | 963       |
-----------------------------------
--2024-08-13 01:12:20.330338 UTC---
| Itration            | 964       |
| PAGAR Loss          | -2.29e+05 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.5e+06  |
| Running Env Steps   | 4820000   |
| Running Forward KL  | 6.52      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 964       |
-----------------------------------
--2024-08-13 01:14:08.626384 UTC--
| Itration            | 965      |
| PAGAR Loss          | 7.36e+04 |
| Real Det Return     | 5.21e+03 |
| Real Sto Return     | 5.17e+03 |
| Reward Loss         | -2e+06   |
| Running Env Steps   | 4825000  |
| Running Forward KL  | 7.14     |
| Running Reverse KL  | 4.84     |
| Running Update Time | 965      |
----------------------------------
--2024-08-13 01:15:56.690204 UTC---
| Itration            | 966       |
| PAGAR Loss          | -6.2e+04  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 4830000   |
| Running Forward KL  | 7.26      |
| Running Reverse KL  | 13.2      |
| Running Update Time | 966       |
-----------------------------------
--2024-08-13 01:17:44.496492 UTC---
| Itration            | 967       |
| PAGAR Loss          | 5.52e+04  |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -6.11e+05 |
| Running Env Steps   | 4835000   |
| Running Forward KL  | 7.24      |
| Running Reverse KL  | 4.72      |
| Running Update Time | 967       |
-----------------------------------
--2024-08-13 01:19:32.710629 UTC---
| Itration            | 968       |
| PAGAR Loss          | -8.94e+04 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 4840000   |
| Running Forward KL  | 6.51      |
| Running Reverse KL  | 7.15      |
| Running Update Time | 968       |
-----------------------------------
--2024-08-13 01:21:20.204956 UTC---
| Itration            | 969       |
| PAGAR Loss          | 1.07e+06  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -9.29e+05 |
| Running Env Steps   | 4845000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 5.11      |
| Running Update Time | 969       |
-----------------------------------
--2024-08-13 01:23:08.545457 UTC---
| Itration            | 970       |
| PAGAR Loss          | 7.79e+03  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -6.68e+05 |
| Running Env Steps   | 4850000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 4.51      |
| Running Update Time | 970       |
-----------------------------------
--2024-08-13 01:24:55.794050 UTC---
| Itration            | 971       |
| PAGAR Loss          | 1.26e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 4855000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 29.6      |
| Running Update Time | 971       |
-----------------------------------
--2024-08-13 01:26:43.168744 UTC---
| Itration            | 972       |
| PAGAR Loss          | -1.93e+05 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 4860000   |
| Running Forward KL  | 6.89      |
| Running Reverse KL  | 32.1      |
| Running Update Time | 972       |
-----------------------------------
--2024-08-13 01:28:30.941389 UTC---
| Itration            | 973       |
| PAGAR Loss          | 3.33e+06  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 4865000   |
| Running Forward KL  | 7.18      |
| Running Reverse KL  | 24.9      |
| Running Update Time | 973       |
-----------------------------------
--2024-08-13 01:30:18.684298 UTC---
| Itration            | 974       |
| PAGAR Loss          | -7.97e+05 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -6.87e+05 |
| Running Env Steps   | 4870000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 7.98      |
| Running Update Time | 974       |
-----------------------------------
--2024-08-13 01:32:06.667472 UTC---
| Itration            | 975       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -5.01e+05 |
| Running Env Steps   | 4875000   |
| Running Forward KL  | 7.62      |
| Running Reverse KL  | 83        |
| Running Update Time | 975       |
-----------------------------------
--2024-08-13 01:33:54.255664 UTC---
| Itration            | 976       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 4880000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 4.2       |
| Running Update Time | 976       |
-----------------------------------
--2024-08-13 01:35:41.384693 UTC---
| Itration            | 977       |
| PAGAR Loss          | -5.12e+04 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -9.84e+05 |
| Running Env Steps   | 4885000   |
| Running Forward KL  | 6.7       |
| Running Reverse KL  | 3.22      |
| Running Update Time | 977       |
-----------------------------------
--2024-08-13 01:37:29.353516 UTC---
| Itration            | 978       |
| PAGAR Loss          | 2.26e+05  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -9.93e+05 |
| Running Env Steps   | 4890000   |
| Running Forward KL  | 7.12      |
| Running Reverse KL  | 33.3      |
| Running Update Time | 978       |
-----------------------------------
--2024-08-13 01:39:17.607397 UTC--
| Itration            | 979      |
| PAGAR Loss          | 1.09e+05 |
| Real Det Return     | 5.44e+03 |
| Real Sto Return     | 5.36e+03 |
| Reward Loss         | 1.61e+05 |
| Running Env Steps   | 4895000  |
| Running Forward KL  | 6.56     |
| Running Reverse KL  | 28.1     |
| Running Update Time | 979      |
----------------------------------
--2024-08-13 01:41:05.024002 UTC---
| Itration            | 980       |
| PAGAR Loss          | 1.93e+05  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 4900000   |
| Running Forward KL  | 7.02      |
| Running Reverse KL  | 3.89      |
| Running Update Time | 980       |
-----------------------------------
--2024-08-13 01:42:52.315772 UTC---
| Itration            | 981       |
| PAGAR Loss          | 8.32e+05  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -4.59e+05 |
| Running Env Steps   | 4905000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 981       |
-----------------------------------
--2024-08-13 01:44:40.014853 UTC---
| Itration            | 982       |
| PAGAR Loss          | -1.12e+07 |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -3.82e+05 |
| Running Env Steps   | 4910000   |
| Running Forward KL  | 7.17      |
| Running Reverse KL  | 3.63      |
| Running Update Time | 982       |
-----------------------------------
--2024-08-13 01:46:26.993766 UTC---
| Itration            | 983       |
| PAGAR Loss          | -3.26e+07 |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 4.84e+03  |
| Reward Loss         | 3.29e+05  |
| Running Env Steps   | 4915000   |
| Running Forward KL  | 6.95      |
| Running Reverse KL  | 17.7      |
| Running Update Time | 983       |
-----------------------------------
--2024-08-13 01:48:14.968513 UTC---
| Itration            | 984       |
| PAGAR Loss          | -1.88e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -3.32e+05 |
| Running Env Steps   | 4920000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 18.2      |
| Running Update Time | 984       |
-----------------------------------
--2024-08-13 01:50:02.847849 UTC---
| Itration            | 985       |
| PAGAR Loss          | -6.56e+05 |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | 3.08e+05  |
| Running Env Steps   | 4925000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 6.33      |
| Running Update Time | 985       |
-----------------------------------
--2024-08-13 01:51:50.924215 UTC---
| Itration            | 986       |
| PAGAR Loss          | -4.03e+05 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 4930000   |
| Running Forward KL  | 7.46      |
| Running Reverse KL  | 29.9      |
| Running Update Time | 986       |
-----------------------------------
--2024-08-13 01:53:39.075148 UTC---
| Itration            | 987       |
| PAGAR Loss          | -4.82e+12 |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 4935000   |
| Running Forward KL  | 6.82      |
| Running Reverse KL  | 32.8      |
| Running Update Time | 987       |
-----------------------------------
--2024-08-13 01:55:24.940172 UTC---
| Itration            | 988       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.16e+03  |
| Reward Loss         | -5.78e+05 |
| Running Env Steps   | 4940000   |
| Running Forward KL  | 6.88      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 988       |
-----------------------------------
--2024-08-13 01:57:11.428016 UTC---
| Itration            | 989       |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -7.38e+06 |
| Running Env Steps   | 4945000   |
| Running Forward KL  | 7.84      |
| Running Reverse KL  | 125       |
| Running Update Time | 989       |
-----------------------------------
--2024-08-13 01:58:57.807142 UTC---
| Itration            | 990       |
| PAGAR Loss          | -3.31e+05 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -2.25e+05 |
| Running Env Steps   | 4950000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 21.3      |
| Running Update Time | 990       |
-----------------------------------
--2024-08-13 02:00:40.533911 UTC---
| Itration            | 991       |
| PAGAR Loss          | -1.21e+07 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -1.4e+05  |
| Running Env Steps   | 4955000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 10.1      |
| Running Update Time | 991       |
-----------------------------------
--2024-08-13 02:02:28.864343 UTC---
| Itration            | 992       |
| PAGAR Loss          | 8.96e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 4960000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 3.41      |
| Running Update Time | 992       |
-----------------------------------
--2024-08-13 02:04:16.694021 UTC---
| Itration            | 993       |
| PAGAR Loss          | -2.23e+05 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.94e+05 |
| Running Env Steps   | 4965000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 5.08      |
| Running Update Time | 993       |
-----------------------------------
--2024-08-13 02:06:04.885912 UTC---
| Itration            | 994       |
| PAGAR Loss          | -9.81e+04 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -7.04e+05 |
| Running Env Steps   | 4970000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 2.95      |
| Running Update Time | 994       |
-----------------------------------
--2024-08-13 02:07:53.070773 UTC---
| Itration            | 995       |
| PAGAR Loss          | -7.49e+04 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -5.77e+05 |
| Running Env Steps   | 4975000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 2.88      |
| Running Update Time | 995       |
-----------------------------------
--2024-08-13 02:09:40.917271 UTC---
| Itration            | 996       |
| PAGAR Loss          | -1.9e+07  |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -5.53e+05 |
| Running Env Steps   | 4980000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 19.8      |
| Running Update Time | 996       |
-----------------------------------
--2024-08-13 02:11:29.096942 UTC---
| Itration            | 997       |
| PAGAR Loss          | 5.61e+09  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -7.34e+05 |
| Running Env Steps   | 4985000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 997       |
-----------------------------------
--2024-08-13 02:13:14.067459 UTC---
| Itration            | 998       |
| PAGAR Loss          | 3.25e+06  |
| Real Det Return     | 4.25e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -4.83e+05 |
| Running Env Steps   | 4990000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 49.5      |
| Running Update Time | 998       |
-----------------------------------
--2024-08-13 02:14:57.071806 UTC---
| Itration            | 999       |
| PAGAR Loss          | 3.01e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -7.63e+05 |
| Running Env Steps   | 4995000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.03      |
| Running Update Time | 999       |
-----------------------------------
--2024-08-13 02:16:43.749168 UTC---
| Itration            | 1000      |
| PAGAR Loss          | 7e+04     |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -9.03e+05 |
| Running Env Steps   | 5000000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 2.51      |
| Running Update Time | 1000      |
-----------------------------------
--2024-08-13 02:18:28.258988 UTC---
| Itration            | 1001      |
| PAGAR Loss          | 5.65e+04  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -8.15e+05 |
| Running Env Steps   | 5005000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 10.3      |
| Running Update Time | 1001      |
-----------------------------------
--2024-08-13 02:20:11.416619 UTC---
| Itration            | 1002      |
| PAGAR Loss          | 8.92e+05  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -9.31e+05 |
| Running Env Steps   | 5010000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 1002      |
-----------------------------------
--2024-08-13 02:21:54.758151 UTC---
| Itration            | 1003      |
| PAGAR Loss          | 2.85e+08  |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.89e+05 |
| Running Env Steps   | 5015000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 3.04      |
| Running Update Time | 1003      |
-----------------------------------
--2024-08-13 02:23:40.357527 UTC---
| Itration            | 1004      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -7.25e+05 |
| Running Env Steps   | 5020000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 23.8      |
| Running Update Time | 1004      |
-----------------------------------
--2024-08-13 02:25:28.636967 UTC---
| Itration            | 1005      |
| PAGAR Loss          | -6.39e+04 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -7.35e+05 |
| Running Env Steps   | 5025000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 2.49      |
| Running Update Time | 1005      |
-----------------------------------
--2024-08-13 02:27:16.174600 UTC---
| Itration            | 1006      |
| PAGAR Loss          | -3.98e+05 |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -2.05e+06 |
| Running Env Steps   | 5030000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 61.7      |
| Running Update Time | 1006      |
-----------------------------------
--2024-08-13 02:29:04.152043 UTC---
| Itration            | 1007      |
| PAGAR Loss          | -1.51e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 5035000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 22.9      |
| Running Update Time | 1007      |
-----------------------------------
--2024-08-13 02:30:50.482372 UTC---
| Itration            | 1008      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -9.58e+05 |
| Running Env Steps   | 5040000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 1008      |
-----------------------------------
--2024-08-13 02:32:34.419095 UTC---
| Itration            | 1009      |
| PAGAR Loss          | 2.02e+05  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.52e+05 |
| Running Env Steps   | 5045000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 3.64      |
| Running Update Time | 1009      |
-----------------------------------
--2024-08-13 02:34:21.759388 UTC---
| Itration            | 1010      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 5050000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 59.7      |
| Running Update Time | 1010      |
-----------------------------------
--2024-08-13 02:36:09.398165 UTC---
| Itration            | 1011      |
| PAGAR Loss          | 1.8e+08   |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -8.75e+05 |
| Running Env Steps   | 5055000   |
| Running Forward KL  | 5.89      |
| Running Reverse KL  | 3.43      |
| Running Update Time | 1011      |
-----------------------------------
--2024-08-13 02:37:56.469900 UTC---
| Itration            | 1012      |
| PAGAR Loss          | 3.06e+06  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -3.62e+05 |
| Running Env Steps   | 5060000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 2.85      |
| Running Update Time | 1012      |
-----------------------------------
--2024-08-13 02:39:44.510422 UTC---
| Itration            | 1013      |
| PAGAR Loss          | 1.04e+05  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -9.99e+05 |
| Running Env Steps   | 5065000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 3.49      |
| Running Update Time | 1013      |
-----------------------------------
--2024-08-13 02:41:32.896604 UTC---
| Itration            | 1014      |
| PAGAR Loss          | -1.6e+05  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.92e+06 |
| Running Env Steps   | 5070000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 17.3      |
| Running Update Time | 1014      |
-----------------------------------
--2024-08-13 02:43:20.453778 UTC---
| Itration            | 1015      |
| PAGAR Loss          | -1.77e+06 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.75e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 5075000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 3.16      |
| Running Update Time | 1015      |
-----------------------------------
--2024-08-13 02:45:07.876157 UTC---
| Itration            | 1016      |
| PAGAR Loss          | 2.33e+07  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -8.82e+05 |
| Running Env Steps   | 5080000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 3.11      |
| Running Update Time | 1016      |
-----------------------------------
--2024-08-13 02:46:55.442368 UTC---
| Itration            | 1017      |
| PAGAR Loss          | -2.72e+04 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -9.77e+05 |
| Running Env Steps   | 5085000   |
| Running Forward KL  | 5.61      |
| Running Reverse KL  | 2.98      |
| Running Update Time | 1017      |
-----------------------------------
--2024-08-13 02:48:43.715529 UTC---
| Itration            | 1018      |
| PAGAR Loss          | -1.07e+06 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 5090000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 37.8      |
| Running Update Time | 1018      |
-----------------------------------
--2024-08-13 02:50:31.231282 UTC---
| Itration            | 1019      |
| PAGAR Loss          | 1.87e+04  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 5095000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1019      |
-----------------------------------
--2024-08-13 02:52:15.943844 UTC---
| Itration            | 1020      |
| PAGAR Loss          | -5.6e+06  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 5100000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 30.5      |
| Running Update Time | 1020      |
-----------------------------------
--2024-08-13 02:54:04.271853 UTC---
| Itration            | 1021      |
| PAGAR Loss          | -3.78e+04 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -6.77e+05 |
| Running Env Steps   | 5105000   |
| Running Forward KL  | 6.14      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 1021      |
-----------------------------------
--2024-08-13 02:55:52.606040 UTC---
| Itration            | 1022      |
| PAGAR Loss          | 9.63e+04  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -8.17e+05 |
| Running Env Steps   | 5110000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1022      |
-----------------------------------
--2024-08-13 02:57:40.380765 UTC---
| Itration            | 1023      |
| PAGAR Loss          | 2.56e+05  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 5115000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 3.82      |
| Running Update Time | 1023      |
-----------------------------------
--2024-08-13 02:59:28.577613 UTC---
| Itration            | 1024      |
| PAGAR Loss          | -1.95e+05 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 5120000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 1024      |
-----------------------------------
--2024-08-13 03:01:14.204985 UTC---
| Itration            | 1025      |
| PAGAR Loss          | -2.15e+05 |
| Real Det Return     | 4.33e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.32e+06 |
| Running Env Steps   | 5125000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 3.18      |
| Running Update Time | 1025      |
-----------------------------------
--2024-08-13 03:02:58.164132 UTC---
| Itration            | 1026      |
| PAGAR Loss          | -1.93e+04 |
| Real Det Return     | 3.43e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -1.72e+06 |
| Running Env Steps   | 5130000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 3.52      |
| Running Update Time | 1026      |
-----------------------------------
--2024-08-13 03:04:46.221320 UTC---
| Itration            | 1027      |
| PAGAR Loss          | 2.75e+04  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 5135000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 3.69      |
| Running Update Time | 1027      |
-----------------------------------
--2024-08-13 03:06:33.572805 UTC---
| Itration            | 1028      |
| PAGAR Loss          | 5.94e+04  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 5140000   |
| Running Forward KL  | 5.88      |
| Running Reverse KL  | 3.09      |
| Running Update Time | 1028      |
-----------------------------------
--2024-08-13 03:08:20.954798 UTC---
| Itration            | 1029      |
| PAGAR Loss          | 9.86e+04  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -7.05e+05 |
| Running Env Steps   | 5145000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 1029      |
-----------------------------------
--2024-08-13 03:10:08.710529 UTC---
| Itration            | 1030      |
| PAGAR Loss          | 2.11e+07  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -7.61e+05 |
| Running Env Steps   | 5150000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 2.75      |
| Running Update Time | 1030      |
-----------------------------------
--2024-08-13 03:11:57.079873 UTC---
| Itration            | 1031      |
| PAGAR Loss          | -8.16e+04 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -5.26e+05 |
| Running Env Steps   | 5155000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.32      |
| Running Update Time | 1031      |
-----------------------------------
--2024-08-13 03:13:44.925070 UTC---
| Itration            | 1032      |
| PAGAR Loss          | 2.65e+05  |
| Real Det Return     | 5.05e+03  |
| Real Sto Return     | 4.72e+03  |
| Reward Loss         | -1.71e+06 |
| Running Env Steps   | 5160000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 2.98      |
| Running Update Time | 1032      |
-----------------------------------
--2024-08-13 03:15:32.714974 UTC---
| Itration            | 1033      |
| PAGAR Loss          | -1.85e+05 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -1.89e+06 |
| Running Env Steps   | 5165000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 3.32      |
| Running Update Time | 1033      |
-----------------------------------
--2024-08-13 03:17:19.308572 UTC---
| Itration            | 1034      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -2.22e+06 |
| Running Env Steps   | 5170000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 38.2      |
| Running Update Time | 1034      |
-----------------------------------
--2024-08-13 03:19:06.337315 UTC---
| Itration            | 1035      |
| PAGAR Loss          | -9.15e+08 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -2.06e+06 |
| Running Env Steps   | 5175000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 67        |
| Running Update Time | 1035      |
-----------------------------------
--2024-08-13 03:20:54.580071 UTC---
| Itration            | 1036      |
| PAGAR Loss          | -1.13e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.62e+06 |
| Running Env Steps   | 5180000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 1036      |
-----------------------------------
--2024-08-13 03:22:41.349359 UTC---
| Itration            | 1037      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -4.04e+06 |
| Running Env Steps   | 5185000   |
| Running Forward KL  | 7.35      |
| Running Reverse KL  | 141       |
| Running Update Time | 1037      |
-----------------------------------
--2024-08-13 03:24:29.265479 UTC---
| Itration            | 1038      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -2.48e+06 |
| Running Env Steps   | 5190000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 36.4      |
| Running Update Time | 1038      |
-----------------------------------
--2024-08-13 03:26:17.256666 UTC---
| Itration            | 1039      |
| PAGAR Loss          | -5.53e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -9.74e+05 |
| Running Env Steps   | 5195000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 3.24      |
| Running Update Time | 1039      |
-----------------------------------
--2024-08-13 03:28:05.678058 UTC---
| Itration            | 1040      |
| PAGAR Loss          | -1.75e+05 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.61e+06 |
| Running Env Steps   | 5200000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 3.18      |
| Running Update Time | 1040      |
-----------------------------------
--2024-08-13 03:29:53.724742 UTC---
| Itration            | 1041      |
| PAGAR Loss          | 3.94e+03  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -8.11e+05 |
| Running Env Steps   | 5205000   |
| Running Forward KL  | 5.75      |
| Running Reverse KL  | 3.44      |
| Running Update Time | 1041      |
-----------------------------------
--2024-08-13 03:31:42.011752 UTC---
| Itration            | 1042      |
| PAGAR Loss          | -2.29e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 5210000   |
| Running Forward KL  | 6.07      |
| Running Reverse KL  | 3.26      |
| Running Update Time | 1042      |
-----------------------------------
--2024-08-13 03:33:30.329609 UTC---
| Itration            | 1043      |
| PAGAR Loss          | 4.9e+05   |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 5215000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 1043      |
-----------------------------------
--2024-08-13 03:35:18.586111 UTC--
| Itration            | 1044     |
| PAGAR Loss          | 1.08e+04 |
| Real Det Return     | 5.42e+03 |
| Real Sto Return     | 5.32e+03 |
| Reward Loss         | -6.7e+05 |
| Running Env Steps   | 5220000  |
| Running Forward KL  | 5.62     |
| Running Reverse KL  | 3.12     |
| Running Update Time | 1044     |
----------------------------------
--2024-08-13 03:37:06.694389 UTC---
| Itration            | 1045      |
| PAGAR Loss          | -1.28e+05 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -2.51e+06 |
| Running Env Steps   | 5225000   |
| Running Forward KL  | 6.91      |
| Running Reverse KL  | 42.7      |
| Running Update Time | 1045      |
-----------------------------------
--2024-08-13 03:38:55.053811 UTC---
| Itration            | 1046      |
| PAGAR Loss          | -1.13e+04 |
| Real Det Return     | 5.14e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 5230000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 3.84      |
| Running Update Time | 1046      |
-----------------------------------
--2024-08-13 03:40:43.348934 UTC---
| Itration            | 1047      |
| PAGAR Loss          | -7e+06    |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -9.92e+05 |
| Running Env Steps   | 5235000   |
| Running Forward KL  | 5.28      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 1047      |
-----------------------------------
--2024-08-13 03:42:31.336986 UTC---
| Itration            | 1048      |
| PAGAR Loss          | 2.15e+04  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 5240000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 3         |
| Running Update Time | 1048      |
-----------------------------------
--2024-08-13 03:44:19.257075 UTC---
| Itration            | 1049      |
| PAGAR Loss          | -4.66e+03 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.36e+06 |
| Running Env Steps   | 5245000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 1049      |
-----------------------------------
--2024-08-13 03:46:07.782514 UTC---
| Itration            | 1050      |
| PAGAR Loss          | 3.81e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 5250000   |
| Running Forward KL  | 5.97      |
| Running Reverse KL  | 3.38      |
| Running Update Time | 1050      |
-----------------------------------
--2024-08-13 03:47:56.277228 UTC---
| Itration            | 1051      |
| PAGAR Loss          | -4.41e+04 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 5255000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 2.91      |
| Running Update Time | 1051      |
-----------------------------------
--2024-08-13 03:49:44.650863 UTC---
| Itration            | 1052      |
| PAGAR Loss          | -3.66e+04 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -9.34e+05 |
| Running Env Steps   | 5260000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 3.82      |
| Running Update Time | 1052      |
-----------------------------------
--2024-08-13 03:51:33.128381 UTC---
| Itration            | 1053      |
| PAGAR Loss          | -4.43e+04 |
| Real Det Return     | 5.18e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.99e+06 |
| Running Env Steps   | 5265000   |
| Running Forward KL  | 6.36      |
| Running Reverse KL  | 3.54      |
| Running Update Time | 1053      |
-----------------------------------
--2024-08-13 03:53:21.550064 UTC---
| Itration            | 1054      |
| PAGAR Loss          | 8.28e+03  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 5270000   |
| Running Forward KL  | 6.19      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 1054      |
-----------------------------------
--2024-08-13 03:55:10.049160 UTC---
| Itration            | 1055      |
| PAGAR Loss          | 8.41e+04  |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -9.74e+05 |
| Running Env Steps   | 5275000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.26      |
| Running Update Time | 1055      |
-----------------------------------
--2024-08-13 03:56:58.096445 UTC---
| Itration            | 1056      |
| PAGAR Loss          | -3.91e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -8.24e+05 |
| Running Env Steps   | 5280000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.59      |
| Running Update Time | 1056      |
-----------------------------------
--2024-08-13 03:58:45.632343 UTC---
| Itration            | 1057      |
| PAGAR Loss          | -1.03e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -8.89e+05 |
| Running Env Steps   | 5285000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 3.1       |
| Running Update Time | 1057      |
-----------------------------------
--2024-08-13 04:00:32.801562 UTC---
| Itration            | 1058      |
| PAGAR Loss          | 3.14e+04  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.62e+03  |
| Reward Loss         | -8.66e+05 |
| Running Env Steps   | 5290000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 3.26      |
| Running Update Time | 1058      |
-----------------------------------
--2024-08-13 04:02:19.320980 UTC---
| Itration            | 1059      |
| PAGAR Loss          | 9.98e+04  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.26e+03  |
| Reward Loss         | -1.67e+06 |
| Running Env Steps   | 5295000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 28.2      |
| Running Update Time | 1059      |
-----------------------------------
--2024-08-13 04:04:07.688653 UTC--
| Itration            | 1060     |
| PAGAR Loss          | 6.6e+05  |
| Real Det Return     | 5.42e+03 |
| Real Sto Return     | 5.26e+03 |
| Reward Loss         | -8.6e+05 |
| Running Env Steps   | 5300000  |
| Running Forward KL  | 5.26     |
| Running Reverse KL  | 2.46     |
| Running Update Time | 1060     |
----------------------------------
--2024-08-13 04:05:55.639306 UTC---
| Itration            | 1061      |
| PAGAR Loss          | 2.4e+04   |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 5305000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 2.87      |
| Running Update Time | 1061      |
-----------------------------------
--2024-08-13 04:07:43.664181 UTC---
| Itration            | 1062      |
| PAGAR Loss          | -4.97e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.16e+06 |
| Running Env Steps   | 5310000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 3.15      |
| Running Update Time | 1062      |
-----------------------------------
--2024-08-13 04:09:31.394869 UTC---
| Itration            | 1063      |
| PAGAR Loss          | -4.45e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -2.24e+06 |
| Running Env Steps   | 5315000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 36.3      |
| Running Update Time | 1063      |
-----------------------------------
--2024-08-13 04:11:19.879713 UTC---
| Itration            | 1064      |
| PAGAR Loss          | 6.21e+04  |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -7.39e+05 |
| Running Env Steps   | 5320000   |
| Running Forward KL  | 6         |
| Running Reverse KL  | 3.78      |
| Running Update Time | 1064      |
-----------------------------------
--2024-08-13 04:13:08.045909 UTC---
| Itration            | 1065      |
| PAGAR Loss          | -9.2e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -8.18e+05 |
| Running Env Steps   | 5325000   |
| Running Forward KL  | 6.02      |
| Running Reverse KL  | 3.26      |
| Running Update Time | 1065      |
-----------------------------------
--2024-08-13 04:14:56.065459 UTC---
| Itration            | 1066      |
| PAGAR Loss          | 5.34e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -1.63e+06 |
| Running Env Steps   | 5330000   |
| Running Forward KL  | 5.91      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 1066      |
-----------------------------------
--2024-08-13 04:16:44.341430 UTC---
| Itration            | 1067      |
| PAGAR Loss          | -4.81e+04 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -1.2e+06  |
| Running Env Steps   | 5335000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 32.5      |
| Running Update Time | 1067      |
-----------------------------------
--2024-08-13 04:18:32.181665 UTC---
| Itration            | 1068      |
| PAGAR Loss          | 1.28e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.07e+06 |
| Running Env Steps   | 5340000   |
| Running Forward KL  | 5.4       |
| Running Reverse KL  | 3.07      |
| Running Update Time | 1068      |
-----------------------------------
--2024-08-13 04:20:19.621341 UTC---
| Itration            | 1069      |
| PAGAR Loss          | 1.45e+06  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 5345000   |
| Running Forward KL  | 5.95      |
| Running Reverse KL  | 29.8      |
| Running Update Time | 1069      |
-----------------------------------
--2024-08-13 04:22:08.141818 UTC---
| Itration            | 1070      |
| PAGAR Loss          | -1.3e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 5350000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 3.21      |
| Running Update Time | 1070      |
-----------------------------------
--2024-08-13 04:23:56.527065 UTC---
| Itration            | 1071      |
| PAGAR Loss          | -5.82e+04 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 5355000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 3.68      |
| Running Update Time | 1071      |
-----------------------------------
--2024-08-13 04:25:44.922529 UTC---
| Itration            | 1072      |
| PAGAR Loss          | 2.83e+04  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.93e+05 |
| Running Env Steps   | 5360000   |
| Running Forward KL  | 5.62      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 1072      |
-----------------------------------
--2024-08-13 04:27:32.790710 UTC---
| Itration            | 1073      |
| PAGAR Loss          | 2.88e+04  |
| Real Det Return     | 5.13e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.84e+06 |
| Running Env Steps   | 5365000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 3.6       |
| Running Update Time | 1073      |
-----------------------------------
--2024-08-13 04:29:21.010719 UTC---
| Itration            | 1074      |
| PAGAR Loss          | 2.41e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 5370000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 3.11      |
| Running Update Time | 1074      |
-----------------------------------
--2024-08-13 04:31:09.030018 UTC---
| Itration            | 1075      |
| PAGAR Loss          | 5.27e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -1.12e+06 |
| Running Env Steps   | 5375000   |
| Running Forward KL  | 5.87      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 1075      |
-----------------------------------
--2024-08-13 04:32:56.793553 UTC---
| Itration            | 1076      |
| PAGAR Loss          | -3.4e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -6.44e+05 |
| Running Env Steps   | 5380000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 2.67      |
| Running Update Time | 1076      |
-----------------------------------
--2024-08-13 04:34:45.087927 UTC---
| Itration            | 1077      |
| PAGAR Loss          | 7.74e+04  |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.83e+06 |
| Running Env Steps   | 5385000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 3.56      |
| Running Update Time | 1077      |
-----------------------------------
--2024-08-13 04:36:33.428385 UTC---
| Itration            | 1078      |
| PAGAR Loss          | -2.32e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 5390000   |
| Running Forward KL  | 5.53      |
| Running Reverse KL  | 36.8      |
| Running Update Time | 1078      |
-----------------------------------
--2024-08-13 04:38:19.575369 UTC--
| Itration            | 1079     |
| PAGAR Loss          | -1.1e+05 |
| Real Det Return     | 5.38e+03 |
| Real Sto Return     | 5.26e+03 |
| Reward Loss         | -7.6e+05 |
| Running Env Steps   | 5395000  |
| Running Forward KL  | 5.39     |
| Running Reverse KL  | 3.1      |
| Running Update Time | 1079     |
----------------------------------
--2024-08-13 04:40:04.252050 UTC---
| Itration            | 1080      |
| PAGAR Loss          | -2.67e+05 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -1.42e+06 |
| Running Env Steps   | 5400000   |
| Running Forward KL  | 5.81      |
| Running Reverse KL  | 3.14      |
| Running Update Time | 1080      |
-----------------------------------
--2024-08-13 04:41:51.776377 UTC---
| Itration            | 1081      |
| PAGAR Loss          | 1.11e+05  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.47e+06 |
| Running Env Steps   | 5405000   |
| Running Forward KL  | 5.58      |
| Running Reverse KL  | 2.91      |
| Running Update Time | 1081      |
-----------------------------------
--2024-08-13 04:43:40.129040 UTC---
| Itration            | 1082      |
| PAGAR Loss          | 1.85e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -6.77e+05 |
| Running Env Steps   | 5410000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 3.08      |
| Running Update Time | 1082      |
-----------------------------------
--2024-08-13 04:45:28.459308 UTC---
| Itration            | 1083      |
| PAGAR Loss          | 5.54e+05  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 5415000   |
| Running Forward KL  | 5.58      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 1083      |
-----------------------------------
--2024-08-13 04:47:17.279315 UTC---
| Itration            | 1084      |
| PAGAR Loss          | 2.4e+05   |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 5420000   |
| Running Forward KL  | 5.86      |
| Running Reverse KL  | 3.17      |
| Running Update Time | 1084      |
-----------------------------------
--2024-08-13 04:49:06.076211 UTC---
| Itration            | 1085      |
| PAGAR Loss          | -1.16e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.24e+05 |
| Running Env Steps   | 5425000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 3.05      |
| Running Update Time | 1085      |
-----------------------------------
--2024-08-13 04:50:54.293756 UTC---
| Itration            | 1086      |
| PAGAR Loss          | 7.85e+05  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 5430000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1086      |
-----------------------------------
--2024-08-13 04:52:42.667906 UTC---
| Itration            | 1087      |
| PAGAR Loss          | 9.57e+04  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -5.95e+05 |
| Running Env Steps   | 5435000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 1087      |
-----------------------------------
--2024-08-13 04:54:30.941051 UTC---
| Itration            | 1088      |
| PAGAR Loss          | 1.26e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -9.02e+05 |
| Running Env Steps   | 5440000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 2.81      |
| Running Update Time | 1088      |
-----------------------------------
--2024-08-13 04:56:18.688800 UTC---
| Itration            | 1089      |
| PAGAR Loss          | 4.54e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 5445000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 3.23      |
| Running Update Time | 1089      |
-----------------------------------
--2024-08-13 04:58:05.418958 UTC---
| Itration            | 1090      |
| PAGAR Loss          | 6.2e+04   |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -6.34e+05 |
| Running Env Steps   | 5450000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 1090      |
-----------------------------------
--2024-08-13 04:59:53.734642 UTC---
| Itration            | 1091      |
| PAGAR Loss          | 9.62e+04  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -3.55e+05 |
| Running Env Steps   | 5455000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 3.34      |
| Running Update Time | 1091      |
-----------------------------------
--2024-08-13 05:01:41.996827 UTC---
| Itration            | 1092      |
| PAGAR Loss          | 1.26e+05  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 5460000   |
| Running Forward KL  | 5.86      |
| Running Reverse KL  | 3.31      |
| Running Update Time | 1092      |
-----------------------------------
--2024-08-13 05:03:28.202355 UTC---
| Itration            | 1093      |
| PAGAR Loss          | -3.81e+06 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.27e+03  |
| Reward Loss         | -1.4e+06  |
| Running Env Steps   | 5465000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 40.6      |
| Running Update Time | 1093      |
-----------------------------------
--2024-08-13 05:05:04.963268 UTC---
| Itration            | 1094      |
| PAGAR Loss          | nan       |
| Real Det Return     | 486       |
| Real Sto Return     | 4.23e+03  |
| Reward Loss         | -4.32e+06 |
| Running Env Steps   | 5470000   |
| Running Forward KL  | 7.28      |
| Running Reverse KL  | 121       |
| Running Update Time | 1094      |
-----------------------------------
--2024-08-13 05:06:52.178771 UTC---
| Itration            | 1095      |
| PAGAR Loss          | -1.39e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -8.45e+05 |
| Running Env Steps   | 5475000   |
| Running Forward KL  | 5.78      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 1095      |
-----------------------------------
--2024-08-13 05:08:40.894681 UTC---
| Itration            | 1096      |
| PAGAR Loss          | 1.3e+05   |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.34e+06 |
| Running Env Steps   | 5480000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 4.17      |
| Running Update Time | 1096      |
-----------------------------------
--2024-08-13 05:10:29.243172 UTC---
| Itration            | 1097      |
| PAGAR Loss          | 6.04e+04  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.17e+06 |
| Running Env Steps   | 5485000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 1097      |
-----------------------------------
--2024-08-13 05:12:15.431466 UTC--
| Itration            | 1098     |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.42e+03 |
| Real Sto Return     | 4.34e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 5490000  |
| Running Forward KL  | 6.28     |
| Running Reverse KL  | 70.6     |
| Running Update Time | 1098     |
----------------------------------
--2024-08-13 05:14:03.122784 UTC---
| Itration            | 1099      |
| PAGAR Loss          | 2.06e+05  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.68e+05 |
| Running Env Steps   | 5495000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 3.04      |
| Running Update Time | 1099      |
-----------------------------------
--2024-08-13 05:15:50.800948 UTC---
| Itration            | 1100      |
| PAGAR Loss          | -3.1e+04  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -5.98e+05 |
| Running Env Steps   | 5500000   |
| Running Forward KL  | 4.86      |
| Running Reverse KL  | 2.2       |
| Running Update Time | 1100      |
-----------------------------------
--2024-08-13 05:17:39.330236 UTC---
| Itration            | 1101      |
| PAGAR Loss          | 8.02e+04  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -4.32e+05 |
| Running Env Steps   | 5505000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 2.48      |
| Running Update Time | 1101      |
-----------------------------------
--2024-08-13 05:19:26.659421 UTC---
| Itration            | 1102      |
| PAGAR Loss          | -4.69e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | 3.59e+06  |
| Running Env Steps   | 5510000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 38.9      |
| Running Update Time | 1102      |
-----------------------------------
--2024-08-13 05:21:14.983192 UTC---
| Itration            | 1103      |
| PAGAR Loss          | 6.41e+04  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -9.81e+05 |
| Running Env Steps   | 5515000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 3.44      |
| Running Update Time | 1103      |
-----------------------------------
--2024-08-13 05:23:02.856626 UTC---
| Itration            | 1104      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.81e+06 |
| Running Env Steps   | 5520000   |
| Running Forward KL  | 6.61      |
| Running Reverse KL  | 103       |
| Running Update Time | 1104      |
-----------------------------------
--2024-08-13 05:24:50.920017 UTC---
| Itration            | 1105      |
| PAGAR Loss          | 7.26e+05  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -8.49e+05 |
| Running Env Steps   | 5525000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 1105      |
-----------------------------------
--2024-08-13 05:26:25.063008 UTC---
| Itration            | 1106      |
| PAGAR Loss          | -2.34e+06 |
| Real Det Return     | 716       |
| Real Sto Return     | 2.74e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 5530000   |
| Running Forward KL  | 7.06      |
| Running Reverse KL  | 132       |
| Running Update Time | 1106      |
-----------------------------------
--2024-08-13 05:28:12.061901 UTC---
| Itration            | 1107      |
| PAGAR Loss          | 1.03e+06  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -1.29e+06 |
| Running Env Steps   | 5535000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 1107      |
-----------------------------------
--2024-08-13 05:29:57.847055 UTC---
| Itration            | 1108      |
| PAGAR Loss          | -1.49e+06 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.23e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 5540000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 65.2      |
| Running Update Time | 1108      |
-----------------------------------
--2024-08-13 05:31:43.015706 UTC---
| Itration            | 1109      |
| PAGAR Loss          | -2.77e+07 |
| Real Det Return     | 4e+03     |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 5545000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 106       |
| Running Update Time | 1109      |
-----------------------------------
--2024-08-13 05:33:30.754902 UTC---
| Itration            | 1110      |
| PAGAR Loss          | 2.59e+04  |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.91e+06 |
| Running Env Steps   | 5550000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 3.56      |
| Running Update Time | 1110      |
-----------------------------------
--2024-08-13 05:35:18.596109 UTC---
| Itration            | 1111      |
| PAGAR Loss          | -5.27e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -8.12e+05 |
| Running Env Steps   | 5555000   |
| Running Forward KL  | 5.48      |
| Running Reverse KL  | 2.68      |
| Running Update Time | 1111      |
-----------------------------------
--2024-08-13 05:37:06.590093 UTC---
| Itration            | 1112      |
| PAGAR Loss          | -4.11e+07 |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -1.31e+06 |
| Running Env Steps   | 5560000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 1112      |
-----------------------------------
--2024-08-13 05:38:50.562049 UTC---
| Itration            | 1113      |
| PAGAR Loss          | -2.42e+03 |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -1.98e+05 |
| Running Env Steps   | 5565000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 2.95      |
| Running Update Time | 1113      |
-----------------------------------
--2024-08-13 05:40:38.851614 UTC---
| Itration            | 1114      |
| PAGAR Loss          | 2.14e+04  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.28e+06 |
| Running Env Steps   | 5570000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 2.84      |
| Running Update Time | 1114      |
-----------------------------------
--2024-08-13 05:42:25.910905 UTC---
| Itration            | 1115      |
| PAGAR Loss          | 6.84e+06  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -6.71e+05 |
| Running Env Steps   | 5575000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 38.1      |
| Running Update Time | 1115      |
-----------------------------------
--2024-08-13 05:44:13.834725 UTC---
| Itration            | 1116      |
| PAGAR Loss          | -3.51e+05 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 5580000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 21.9      |
| Running Update Time | 1116      |
-----------------------------------
--2024-08-13 05:46:02.084009 UTC---
| Itration            | 1117      |
| PAGAR Loss          | 1.68e+05  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -9.75e+05 |
| Running Env Steps   | 5585000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 1117      |
-----------------------------------
--2024-08-13 05:47:50.411333 UTC---
| Itration            | 1118      |
| PAGAR Loss          | 9.84e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -9.91e+05 |
| Running Env Steps   | 5590000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 2.52      |
| Running Update Time | 1118      |
-----------------------------------
--2024-08-13 05:49:34.234092 UTC---
| Itration            | 1119      |
| PAGAR Loss          | -2.89e+04 |
| Real Det Return     | 3.15e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -8.93e+05 |
| Running Env Steps   | 5595000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 36.7      |
| Running Update Time | 1119      |
-----------------------------------
--2024-08-13 05:51:21.997967 UTC---
| Itration            | 1120      |
| PAGAR Loss          | -2.9e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -5.89e+05 |
| Running Env Steps   | 5600000   |
| Running Forward KL  | 5.18      |
| Running Reverse KL  | 37.4      |
| Running Update Time | 1120      |
-----------------------------------
--2024-08-13 05:53:09.058502 UTC---
| Itration            | 1121      |
| PAGAR Loss          | -4.62e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -2.18e+06 |
| Running Env Steps   | 5605000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 65.7      |
| Running Update Time | 1121      |
-----------------------------------
--2024-08-13 05:54:57.371720 UTC---
| Itration            | 1122      |
| PAGAR Loss          | 1.07e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -7.13e+05 |
| Running Env Steps   | 5610000   |
| Running Forward KL  | 5         |
| Running Reverse KL  | 2.07      |
| Running Update Time | 1122      |
-----------------------------------
--2024-08-13 05:56:45.466017 UTC---
| Itration            | 1123      |
| PAGAR Loss          | 2.07e+07  |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -8.06e+05 |
| Running Env Steps   | 5615000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 2.47      |
| Running Update Time | 1123      |
-----------------------------------
--2024-08-13 05:58:33.757711 UTC---
| Itration            | 1124      |
| PAGAR Loss          | 9.13e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -1.53e+06 |
| Running Env Steps   | 5620000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 40.9      |
| Running Update Time | 1124      |
-----------------------------------
--2024-08-13 06:00:22.164760 UTC---
| Itration            | 1125      |
| PAGAR Loss          | 4.6e+05   |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -8.91e+05 |
| Running Env Steps   | 5625000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 1125      |
-----------------------------------
--2024-08-13 06:02:10.609307 UTC---
| Itration            | 1126      |
| PAGAR Loss          | -9.97e+03 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -1.11e+06 |
| Running Env Steps   | 5630000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 2.5       |
| Running Update Time | 1126      |
-----------------------------------
--2024-08-13 06:03:58.134557 UTC---
| Itration            | 1127      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -6.38e+05 |
| Running Env Steps   | 5635000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 3.14      |
| Running Update Time | 1127      |
-----------------------------------
--2024-08-13 06:05:46.482947 UTC---
| Itration            | 1128      |
| PAGAR Loss          | -7.57e+05 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 5640000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 1128      |
-----------------------------------
--2024-08-13 06:07:34.309863 UTC---
| Itration            | 1129      |
| PAGAR Loss          | -5.29e+04 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -6.5e+05  |
| Running Env Steps   | 5645000   |
| Running Forward KL  | 5.32      |
| Running Reverse KL  | 2.86      |
| Running Update Time | 1129      |
-----------------------------------
--2024-08-13 06:09:20.105840 UTC---
| Itration            | 1130      |
| PAGAR Loss          | 4.61e+05  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.04e+03  |
| Reward Loss         | -7.61e+05 |
| Running Env Steps   | 5650000   |
| Running Forward KL  | 4.96      |
| Running Reverse KL  | 2.18      |
| Running Update Time | 1130      |
-----------------------------------
--2024-08-13 06:11:08.311861 UTC--
| Itration            | 1131     |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.26e+03 |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 5655000  |
| Running Forward KL  | 5.74     |
| Running Reverse KL  | 34.8     |
| Running Update Time | 1131     |
----------------------------------
--2024-08-13 06:12:56.699422 UTC---
| Itration            | 1132      |
| PAGAR Loss          | -2.26e+05 |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.69e+06 |
| Running Env Steps   | 5660000   |
| Running Forward KL  | 6.31      |
| Running Reverse KL  | 3.38      |
| Running Update Time | 1132      |
-----------------------------------
--2024-08-13 06:14:44.412523 UTC---
| Itration            | 1133      |
| PAGAR Loss          | 9.95e+06  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -4.28e+05 |
| Running Env Steps   | 5665000   |
| Running Forward KL  | 5.01      |
| Running Reverse KL  | 2.52      |
| Running Update Time | 1133      |
-----------------------------------
--2024-08-13 06:16:32.258338 UTC---
| Itration            | 1134      |
| PAGAR Loss          | 1.03e+06  |
| Real Det Return     | 4.91e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -5.73e+05 |
| Running Env Steps   | 5670000   |
| Running Forward KL  | 4.74      |
| Running Reverse KL  | 2.07      |
| Running Update Time | 1134      |
-----------------------------------
--2024-08-13 06:18:20.431299 UTC---
| Itration            | 1135      |
| PAGAR Loss          | -1.84e+05 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -4.76e+05 |
| Running Env Steps   | 5675000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 32.3      |
| Running Update Time | 1135      |
-----------------------------------
--2024-08-13 06:20:08.258556 UTC---
| Itration            | 1136      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -8.77e+05 |
| Running Env Steps   | 5680000   |
| Running Forward KL  | 5.25      |
| Running Reverse KL  | 2.55      |
| Running Update Time | 1136      |
-----------------------------------
--2024-08-13 06:21:56.366371 UTC---
| Itration            | 1137      |
| PAGAR Loss          | -1.33e+05 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 5685000   |
| Running Forward KL  | 5.36      |
| Running Reverse KL  | 25        |
| Running Update Time | 1137      |
-----------------------------------
--2024-08-13 06:23:44.373796 UTC---
| Itration            | 1138      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -4.94e+05 |
| Running Env Steps   | 5690000   |
| Running Forward KL  | 4.97      |
| Running Reverse KL  | 33.3      |
| Running Update Time | 1138      |
-----------------------------------
--2024-08-13 06:25:32.666441 UTC---
| Itration            | 1139      |
| PAGAR Loss          | 4.24e+05  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.08e+05 |
| Running Env Steps   | 5695000   |
| Running Forward KL  | 5         |
| Running Reverse KL  | 2.36      |
| Running Update Time | 1139      |
-----------------------------------
--2024-08-13 06:27:19.042907 UTC--
| Itration            | 1140     |
| PAGAR Loss          | 2.86e+04 |
| Real Det Return     | 5.32e+03 |
| Real Sto Return     | 5.01e+03 |
| Reward Loss         | -8.7e+05 |
| Running Env Steps   | 5700000  |
| Running Forward KL  | 5.49     |
| Running Reverse KL  | 2.77     |
| Running Update Time | 1140     |
----------------------------------
--2024-08-13 06:29:02.970764 UTC---
| Itration            | 1141      |
| PAGAR Loss          | 8.2e+05   |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 3.9e+03   |
| Reward Loss         | -7.23e+05 |
| Running Env Steps   | 5705000   |
| Running Forward KL  | 5.27      |
| Running Reverse KL  | 34        |
| Running Update Time | 1141      |
-----------------------------------
--2024-08-13 06:30:50.633766 UTC---
| Itration            | 1142      |
| PAGAR Loss          | -6.25e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 5710000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 38.5      |
| Running Update Time | 1142      |
-----------------------------------
--2024-08-13 06:32:38.949342 UTC---
| Itration            | 1143      |
| PAGAR Loss          | 5.04e+04  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -7.71e+05 |
| Running Env Steps   | 5715000   |
| Running Forward KL  | 4.92      |
| Running Reverse KL  | 2.21      |
| Running Update Time | 1143      |
-----------------------------------
--2024-08-13 06:34:24.941129 UTC---
| Itration            | 1144      |
| PAGAR Loss          | -2e+06    |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -1.49e+06 |
| Running Env Steps   | 5720000   |
| Running Forward KL  | 6.35      |
| Running Reverse KL  | 93.2      |
| Running Update Time | 1144      |
-----------------------------------
--2024-08-13 06:36:12.815641 UTC---
| Itration            | 1145      |
| PAGAR Loss          | -1.27e+04 |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -7.86e+05 |
| Running Env Steps   | 5725000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 33.7      |
| Running Update Time | 1145      |
-----------------------------------
--2024-08-13 06:37:58.775213 UTC---
| Itration            | 1146      |
| PAGAR Loss          | -6.59e+05 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.25e+03  |
| Reward Loss         | -2.74e+06 |
| Running Env Steps   | 5730000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 99.7      |
| Running Update Time | 1146      |
-----------------------------------
--2024-08-13 06:39:47.038848 UTC---
| Itration            | 1147      |
| PAGAR Loss          | 4.22e+04  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -4.26e+05 |
| Running Env Steps   | 5735000   |
| Running Forward KL  | 4.63      |
| Running Reverse KL  | 2.07      |
| Running Update Time | 1147      |
-----------------------------------
--2024-08-13 06:41:34.861413 UTC---
| Itration            | 1148      |
| PAGAR Loss          | 2.35e+06  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -4.43e+05 |
| Running Env Steps   | 5740000   |
| Running Forward KL  | 5.16      |
| Running Reverse KL  | 23.3      |
| Running Update Time | 1148      |
-----------------------------------
--2024-08-13 06:43:23.243726 UTC---
| Itration            | 1149      |
| PAGAR Loss          | 5.64e+03  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -4.59e+05 |
| Running Env Steps   | 5745000   |
| Running Forward KL  | 4.57      |
| Running Reverse KL  | 2.3       |
| Running Update Time | 1149      |
-----------------------------------
--2024-08-13 06:45:09.927908 UTC---
| Itration            | 1150      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.94e+03  |
| Reward Loss         | -6.29e+05 |
| Running Env Steps   | 5750000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 2.48      |
| Running Update Time | 1150      |
-----------------------------------
--2024-08-13 06:46:56.532354 UTC---
| Itration            | 1151      |
| PAGAR Loss          | -6.85e+05 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -2.14e+06 |
| Running Env Steps   | 5755000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 75        |
| Running Update Time | 1151      |
-----------------------------------
--2024-08-13 06:48:43.341385 UTC---
| Itration            | 1152      |
| PAGAR Loss          | -3.68e+05 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.53e+03  |
| Reward Loss         | -7.36e+05 |
| Running Env Steps   | 5760000   |
| Running Forward KL  | 4.96      |
| Running Reverse KL  | 6.28      |
| Running Update Time | 1152      |
-----------------------------------
--2024-08-13 06:50:31.120517 UTC---
| Itration            | 1153      |
| PAGAR Loss          | -3.1e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -4.42e+05 |
| Running Env Steps   | 5765000   |
| Running Forward KL  | 4.67      |
| Running Reverse KL  | 1.89      |
| Running Update Time | 1153      |
-----------------------------------
--2024-08-13 06:52:26.940831 UTC---
| Itration            | 1154      |
| PAGAR Loss          | -3.36e+06 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -1.65e+06 |
| Running Env Steps   | 5770000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 35.7      |
| Running Update Time | 1154      |
-----------------------------------
--2024-08-13 06:54:46.850931 UTC---
| Itration            | 1155      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -8.46e+05 |
| Running Env Steps   | 5775000   |
| Running Forward KL  | 4.82      |
| Running Reverse KL  | 2.15      |
| Running Update Time | 1155      |
-----------------------------------
--2024-08-13 06:57:00.116652 UTC---
| Itration            | 1156      |
| PAGAR Loss          | -2.37e+05 |
| Real Det Return     | 3.8e+03   |
| Real Sto Return     | 3.53e+03  |
| Reward Loss         | -2.73e+06 |
| Running Env Steps   | 5780000   |
| Running Forward KL  | 5.37      |
| Running Reverse KL  | 96.8      |
| Running Update Time | 1156      |
-----------------------------------
--2024-08-13 06:59:17.821666 UTC---
| Itration            | 1157      |
| PAGAR Loss          | -8.38e+04 |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.29e+03  |
| Reward Loss         | -1.41e+06 |
| Running Env Steps   | 5785000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 35.6      |
| Running Update Time | 1157      |
-----------------------------------
--2024-08-13 07:01:37.233576 UTC---
| Itration            | 1158      |
| PAGAR Loss          | 4.44e+04  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -6.57e+05 |
| Running Env Steps   | 5790000   |
| Running Forward KL  | 4.65      |
| Running Reverse KL  | 1.99      |
| Running Update Time | 1158      |
-----------------------------------
--2024-08-13 07:03:57.106016 UTC---
| Itration            | 1159      |
| PAGAR Loss          | -5.91e+04 |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 5795000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 2.96      |
| Running Update Time | 1159      |
-----------------------------------
--2024-08-13 07:06:14.190759 UTC---
| Itration            | 1160      |
| PAGAR Loss          | 1.62e+04  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.65e+03  |
| Reward Loss         | -1.15e+06 |
| Running Env Steps   | 5800000   |
| Running Forward KL  | 4.98      |
| Running Reverse KL  | 2.86      |
| Running Update Time | 1160      |
-----------------------------------
--2024-08-13 07:08:02.047323 UTC---
| Itration            | 1161      |
| PAGAR Loss          | 4.67e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.22e+06 |
| Running Env Steps   | 5805000   |
| Running Forward KL  | 4.64      |
| Running Reverse KL  | 2.4       |
| Running Update Time | 1161      |
-----------------------------------
--2024-08-13 07:09:50.026327 UTC---
| Itration            | 1162      |
| PAGAR Loss          | -4.76e+04 |
| Real Det Return     | 5.11e+03  |
| Real Sto Return     | 5.08e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 5810000   |
| Running Forward KL  | 5.2       |
| Running Reverse KL  | 2.87      |
| Running Update Time | 1162      |
-----------------------------------
--2024-08-13 07:11:35.238021 UTC---
| Itration            | 1163      |
| PAGAR Loss          | 3.67e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4e+03     |
| Reward Loss         | -1.23e+06 |
| Running Env Steps   | 5815000   |
| Running Forward KL  | 5.12      |
| Running Reverse KL  | 55.3      |
| Running Update Time | 1163      |
-----------------------------------
--2024-08-13 07:13:22.614856 UTC---
| Itration            | 1164      |
| PAGAR Loss          | 2.87e+05  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -4.47e+05 |
| Running Env Steps   | 5820000   |
| Running Forward KL  | 4.81      |
| Running Reverse KL  | 12.1      |
| Running Update Time | 1164      |
-----------------------------------
--2024-08-13 07:15:10.746969 UTC---
| Itration            | 1165      |
| PAGAR Loss          | 2.21e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.08e+05 |
| Running Env Steps   | 5825000   |
| Running Forward KL  | 4.8       |
| Running Reverse KL  | 2.74      |
| Running Update Time | 1165      |
-----------------------------------
--2024-08-13 07:16:58.701284 UTC---
| Itration            | 1166      |
| PAGAR Loss          | -4.4e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -2.33e+05 |
| Running Env Steps   | 5830000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 2.49      |
| Running Update Time | 1166      |
-----------------------------------
--2024-08-13 07:18:46.136159 UTC---
| Itration            | 1167      |
| PAGAR Loss          | -1.33e+05 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -2.93e+05 |
| Running Env Steps   | 5835000   |
| Running Forward KL  | 5.11      |
| Running Reverse KL  | 2.71      |
| Running Update Time | 1167      |
-----------------------------------
--2024-08-13 07:20:33.941029 UTC---
| Itration            | 1168      |
| PAGAR Loss          | -2.34e+04 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -4.44e+05 |
| Running Env Steps   | 5840000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 2.05      |
| Running Update Time | 1168      |
-----------------------------------
--2024-08-13 07:22:21.742794 UTC---
| Itration            | 1169      |
| PAGAR Loss          | -2.71e+05 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -9.17e+05 |
| Running Env Steps   | 5845000   |
| Running Forward KL  | 4.64      |
| Running Reverse KL  | 2.08      |
| Running Update Time | 1169      |
-----------------------------------
--2024-08-13 07:24:08.657771 UTC---
| Itration            | 1170      |
| PAGAR Loss          | -3.39e+04 |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -7.17e+05 |
| Running Env Steps   | 5850000   |
| Running Forward KL  | 4.74      |
| Running Reverse KL  | 11.4      |
| Running Update Time | 1170      |
-----------------------------------
--2024-08-13 07:25:56.525622 UTC---
| Itration            | 1171      |
| PAGAR Loss          | -1.23e+07 |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -9.12e+05 |
| Running Env Steps   | 5855000   |
| Running Forward KL  | 4.8       |
| Running Reverse KL  | 2.49      |
| Running Update Time | 1171      |
-----------------------------------
--2024-08-13 07:27:43.092857 UTC---
| Itration            | 1172      |
| PAGAR Loss          | -6.42e+04 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.88e+05 |
| Running Env Steps   | 5860000   |
| Running Forward KL  | 4.88      |
| Running Reverse KL  | 2.94      |
| Running Update Time | 1172      |
-----------------------------------
--2024-08-13 07:29:30.672709 UTC---
| Itration            | 1173      |
| PAGAR Loss          | 9.32e+03  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -2.01e+05 |
| Running Env Steps   | 5865000   |
| Running Forward KL  | 4.41      |
| Running Reverse KL  | 2.04      |
| Running Update Time | 1173      |
-----------------------------------
--2024-08-13 07:31:18.695031 UTC---
| Itration            | 1174      |
| PAGAR Loss          | 1.86e+04  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -3.28e+05 |
| Running Env Steps   | 5870000   |
| Running Forward KL  | 4.58      |
| Running Reverse KL  | 2.14      |
| Running Update Time | 1174      |
-----------------------------------
--2024-08-13 07:33:06.236303 UTC---
| Itration            | 1175      |
| PAGAR Loss          | 1.12e+04  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -6.21e+05 |
| Running Env Steps   | 5875000   |
| Running Forward KL  | 4.92      |
| Running Reverse KL  | 2.42      |
| Running Update Time | 1175      |
-----------------------------------
--2024-08-13 07:34:51.811106 UTC---
| Itration            | 1176      |
| PAGAR Loss          | 8.32e+05  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -8.33e+05 |
| Running Env Steps   | 5880000   |
| Running Forward KL  | 4.78      |
| Running Reverse KL  | 2.58      |
| Running Update Time | 1176      |
-----------------------------------
--2024-08-13 07:36:35.916333 UTC---
| Itration            | 1177      |
| PAGAR Loss          | -4.03e+05 |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.19e+05 |
| Running Env Steps   | 5885000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 8.53      |
| Running Update Time | 1177      |
-----------------------------------
--2024-08-13 07:38:20.611001 UTC--
| Itration            | 1178     |
| PAGAR Loss          | 3.4e+04  |
| Real Det Return     | 5.33e+03 |
| Real Sto Return     | 5.21e+03 |
| Reward Loss         | -7.7e+05 |
| Running Env Steps   | 5890000  |
| Running Forward KL  | 4.57     |
| Running Reverse KL  | 2.44     |
| Running Update Time | 1178     |
----------------------------------
--2024-08-13 07:40:06.918177 UTC--
| Itration            | 1179     |
| PAGAR Loss          | 2.95e+04 |
| Real Det Return     | 5.22e+03 |
| Real Sto Return     | 5.27e+03 |
| Reward Loss         | -5.1e+05 |
| Running Env Steps   | 5895000  |
| Running Forward KL  | 4.08     |
| Running Reverse KL  | 2.2      |
| Running Update Time | 1179     |
----------------------------------
--2024-08-13 07:41:51.912849 UTC---
| Itration            | 1180      |
| PAGAR Loss          | -1.49e+06 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -1.9e+04  |
| Running Env Steps   | 5900000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 18.9      |
| Running Update Time | 1180      |
-----------------------------------
--2024-08-13 07:43:39.298992 UTC--
| Itration            | 1181     |
| PAGAR Loss          | 4.69e+04 |
| Real Det Return     | 5.43e+03 |
| Real Sto Return     | 5.33e+03 |
| Reward Loss         | -1.7e+05 |
| Running Env Steps   | 5905000  |
| Running Forward KL  | 4.79     |
| Running Reverse KL  | 2.74     |
| Running Update Time | 1181     |
----------------------------------
--2024-08-13 07:45:25.487665 UTC---
| Itration            | 1182      |
| PAGAR Loss          | -1.88e+05 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 5910000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 98.5      |
| Running Update Time | 1182      |
-----------------------------------
--2024-08-13 07:47:13.638062 UTC--
| Itration            | 1183     |
| PAGAR Loss          | 3.38e+04 |
| Real Det Return     | 5.34e+03 |
| Real Sto Return     | 5.12e+03 |
| Reward Loss         | -7.1e+05 |
| Running Env Steps   | 5915000  |
| Running Forward KL  | 4.58     |
| Running Reverse KL  | 1.95     |
| Running Update Time | 1183     |
----------------------------------
--2024-08-13 07:49:00.814460 UTC---
| Itration            | 1184      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.05e+05 |
| Running Env Steps   | 5920000   |
| Running Forward KL  | 4.9       |
| Running Reverse KL  | 8.25      |
| Running Update Time | 1184      |
-----------------------------------
--2024-08-13 07:50:48.532936 UTC---
| Itration            | 1185      |
| PAGAR Loss          | 8.72e+04  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -8.34e+05 |
| Running Env Steps   | 5925000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 2.94      |
| Running Update Time | 1185      |
-----------------------------------
--2024-08-13 07:52:36.246834 UTC---
| Itration            | 1186      |
| PAGAR Loss          | 1.17e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -5.73e+05 |
| Running Env Steps   | 5930000   |
| Running Forward KL  | 4.36      |
| Running Reverse KL  | 2.1       |
| Running Update Time | 1186      |
-----------------------------------
--2024-08-13 07:54:24.277094 UTC---
| Itration            | 1187      |
| PAGAR Loss          | -2.26e+05 |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -6.24e+05 |
| Running Env Steps   | 5935000   |
| Running Forward KL  | 4.66      |
| Running Reverse KL  | 2.02      |
| Running Update Time | 1187      |
-----------------------------------
--2024-08-13 07:56:11.944464 UTC---
| Itration            | 1188      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -2.04e+06 |
| Running Env Steps   | 5940000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 62.5      |
| Running Update Time | 1188      |
-----------------------------------
--2024-08-13 07:57:59.310897 UTC---
| Itration            | 1189      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -3.39e+05 |
| Running Env Steps   | 5945000   |
| Running Forward KL  | 4.67      |
| Running Reverse KL  | 17.3      |
| Running Update Time | 1189      |
-----------------------------------
--2024-08-13 07:59:47.235330 UTC---
| Itration            | 1190      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -7.53e+05 |
| Running Env Steps   | 5950000   |
| Running Forward KL  | 4.33      |
| Running Reverse KL  | 1.93      |
| Running Update Time | 1190      |
-----------------------------------
--2024-08-13 08:01:35.266571 UTC---
| Itration            | 1191      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -3.26e+05 |
| Running Env Steps   | 5955000   |
| Running Forward KL  | 4.46      |
| Running Reverse KL  | 2.72      |
| Running Update Time | 1191      |
-----------------------------------
--2024-08-13 08:03:23.403189 UTC---
| Itration            | 1192      |
| PAGAR Loss          | 2.29e+05  |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -1.01e+06 |
| Running Env Steps   | 5960000   |
| Running Forward KL  | 4.61      |
| Running Reverse KL  | 2.55      |
| Running Update Time | 1192      |
-----------------------------------
--2024-08-13 08:05:11.535793 UTC---
| Itration            | 1193      |
| PAGAR Loss          | 5.95e+04  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -7.19e+05 |
| Running Env Steps   | 5965000   |
| Running Forward KL  | 4.5       |
| Running Reverse KL  | 2.45      |
| Running Update Time | 1193      |
-----------------------------------
--2024-08-13 08:06:59.635489 UTC--
| Itration            | 1194     |
| PAGAR Loss          | 1.09e+05 |
| Real Det Return     | 5.3e+03  |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | -2.3e+05 |
| Running Env Steps   | 5970000  |
| Running Forward KL  | 4.52     |
| Running Reverse KL  | 2.86     |
| Running Update Time | 1194     |
----------------------------------
--2024-08-13 08:08:46.462288 UTC---
| Itration            | 1195      |
| PAGAR Loss          | 4.4e+04   |
| Real Det Return     | 5.24e+03  |
| Real Sto Return     | 4.55e+03  |
| Reward Loss         | -5.95e+05 |
| Running Env Steps   | 5975000   |
| Running Forward KL  | 4.3       |
| Running Reverse KL  | 2.14      |
| Running Update Time | 1195      |
-----------------------------------
--2024-08-13 08:10:34.118208 UTC---
| Itration            | 1196      |
| PAGAR Loss          | -7.69e+04 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.72e+05 |
| Running Env Steps   | 5980000   |
| Running Forward KL  | 4.83      |
| Running Reverse KL  | 2.39      |
| Running Update Time | 1196      |
-----------------------------------
--2024-08-13 08:12:21.566588 UTC---
| Itration            | 1197      |
| PAGAR Loss          | -3.02e+05 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 5985000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 74        |
| Running Update Time | 1197      |
-----------------------------------
--2024-08-13 08:14:09.691014 UTC---
| Itration            | 1198      |
| PAGAR Loss          | -2.78e+06 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.74e+06 |
| Running Env Steps   | 5990000   |
| Running Forward KL  | 4.65      |
| Running Reverse KL  | 41.9      |
| Running Update Time | 1198      |
-----------------------------------
--2024-08-13 08:15:57.771739 UTC---
| Itration            | 1199      |
| PAGAR Loss          | 9.45e+04  |
| Real Det Return     | 5.21e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -5.93e+05 |
| Running Env Steps   | 5995000   |
| Running Forward KL  | 4.15      |
| Running Reverse KL  | 2.14      |
| Running Update Time | 1199      |
-----------------------------------
--2024-08-13 08:17:45.894520 UTC---
| Itration            | 1200      |
| PAGAR Loss          | -6.45e+05 |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -9.52e+05 |
| Running Env Steps   | 6000000   |
| Running Forward KL  | 5.06      |
| Running Reverse KL  | 66.9      |
| Running Update Time | 1200      |
-----------------------------------
--2024-08-13 08:19:33.475793 UTC---
| Itration            | 1201      |
| PAGAR Loss          | 1.51e+05  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 6005000   |
| Running Forward KL  | 4.27      |
| Running Reverse KL  | 2.06      |
| Running Update Time | 1201      |
-----------------------------------
--2024-08-13 08:21:21.613918 UTC--
| Itration            | 1202     |
| PAGAR Loss          | -5.7e+04 |
| Real Det Return     | 5.47e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | -1.1e+05 |
| Running Env Steps   | 6010000  |
| Running Forward KL  | 4.5      |
| Running Reverse KL  | 2.3      |
| Running Update Time | 1202     |
----------------------------------
--2024-08-13 08:23:09.481747 UTC---
| Itration            | 1203      |
| PAGAR Loss          | -1.63e+05 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -7.24e+05 |
| Running Env Steps   | 6015000   |
| Running Forward KL  | 4.44      |
| Running Reverse KL  | 1.78      |
| Running Update Time | 1203      |
-----------------------------------
--2024-08-13 08:24:57.220206 UTC---
| Itration            | 1204      |
| PAGAR Loss          | 6.44e+05  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 4.99e+03  |
| Reward Loss         | -5.15e+05 |
| Running Env Steps   | 6020000   |
| Running Forward KL  | 4.73      |
| Running Reverse KL  | 3.06      |
| Running Update Time | 1204      |
-----------------------------------
--2024-08-13 08:26:45.445251 UTC---
| Itration            | 1205      |
| PAGAR Loss          | -3.84e+05 |
| Real Det Return     | 5.3e+03   |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.57e+05 |
| Running Env Steps   | 6025000   |
| Running Forward KL  | 4.6       |
| Running Reverse KL  | 2.16      |
| Running Update Time | 1205      |
-----------------------------------
--2024-08-13 08:28:33.407901 UTC---
| Itration            | 1206      |
| PAGAR Loss          | 3.41e+04  |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -6.77e+05 |
| Running Env Steps   | 6030000   |
| Running Forward KL  | 4.44      |
| Running Reverse KL  | 2.05      |
| Running Update Time | 1206      |
-----------------------------------
--2024-08-13 08:30:21.588628 UTC---
| Itration            | 1207      |
| PAGAR Loss          | 2.58e+05  |
| Real Det Return     | 5.27e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -8.77e+05 |
| Running Env Steps   | 6035000   |
| Running Forward KL  | 4.93      |
| Running Reverse KL  | 2.73      |
| Running Update Time | 1207      |
-----------------------------------
--2024-08-13 08:32:09.637206 UTC---
| Itration            | 1208      |
| PAGAR Loss          | 1.26e+05  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -4.85e+05 |
| Running Env Steps   | 6040000   |
| Running Forward KL  | 4.3       |
| Running Reverse KL  | 2.31      |
| Running Update Time | 1208      |
-----------------------------------
--2024-08-13 08:33:57.638104 UTC---
| Itration            | 1209      |
| PAGAR Loss          | 5.81e+05  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -4.52e+05 |
| Running Env Steps   | 6045000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 2.01      |
| Running Update Time | 1209      |
-----------------------------------
--2024-08-13 08:35:45.521438 UTC---
| Itration            | 1210      |
| PAGAR Loss          | 1.06e+06  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -3.16e+05 |
| Running Env Steps   | 6050000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 13.6      |
| Running Update Time | 1210      |
-----------------------------------
--2024-08-13 08:37:32.493987 UTC---
| Itration            | 1211      |
| PAGAR Loss          | 1.94e+04  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -7.55e+05 |
| Running Env Steps   | 6055000   |
| Running Forward KL  | 4.39      |
| Running Reverse KL  | 1.94      |
| Running Update Time | 1211      |
-----------------------------------
--2024-08-13 08:39:20.663963 UTC---
| Itration            | 1212      |
| PAGAR Loss          | 1.67e+04  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -5.31e+05 |
| Running Env Steps   | 6060000   |
| Running Forward KL  | 4.47      |
| Running Reverse KL  | 1.63      |
| Running Update Time | 1212      |
-----------------------------------
--2024-08-13 08:41:07.477302 UTC---
| Itration            | 1213      |
| PAGAR Loss          | -4.25e+05 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 4.73e+03  |
| Reward Loss         | -3.91e+05 |
| Running Env Steps   | 6065000   |
| Running Forward KL  | 4.9       |
| Running Reverse KL  | 20.7      |
| Running Update Time | 1213      |
-----------------------------------
--2024-08-13 08:42:55.592905 UTC---
| Itration            | 1214      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.23e+05 |
| Running Env Steps   | 6070000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 1214      |
-----------------------------------
--2024-08-13 08:44:42.580611 UTC---
| Itration            | 1215      |
| PAGAR Loss          | -4.9e+05  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -6.77e+05 |
| Running Env Steps   | 6075000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 33.3      |
| Running Update Time | 1215      |
-----------------------------------
--2024-08-13 08:46:29.859898 UTC---
| Itration            | 1216      |
| PAGAR Loss          | -5.68e+06 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.9e+03   |
| Reward Loss         | -1.88e+06 |
| Running Env Steps   | 6080000   |
| Running Forward KL  | 4.94      |
| Running Reverse KL  | 40.5      |
| Running Update Time | 1216      |
-----------------------------------
--2024-08-13 08:48:18.105637 UTC---
| Itration            | 1217      |
| PAGAR Loss          | -8.78e+04 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -4.25e+05 |
| Running Env Steps   | 6085000   |
| Running Forward KL  | 4.24      |
| Running Reverse KL  | 1.79      |
| Running Update Time | 1217      |
-----------------------------------
--2024-08-13 08:50:04.743800 UTC---
| Itration            | 1218      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -2.39e+05 |
| Running Env Steps   | 6090000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 33.7      |
| Running Update Time | 1218      |
-----------------------------------
--2024-08-13 08:51:50.700656 UTC---
| Itration            | 1219      |
| PAGAR Loss          | 6.32e+04  |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 4.3e+03   |
| Reward Loss         | -1.54e+06 |
| Running Env Steps   | 6095000   |
| Running Forward KL  | 5.65      |
| Running Reverse KL  | 86.6      |
| Running Update Time | 1219      |
-----------------------------------
--2024-08-13 08:53:37.810100 UTC---
| Itration            | 1220      |
| PAGAR Loss          | -2.75e+06 |
| Real Det Return     | 5.06e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.78e+05 |
| Running Env Steps   | 6100000   |
| Running Forward KL  | 5.19      |
| Running Reverse KL  | 43.4      |
| Running Update Time | 1220      |
-----------------------------------
--2024-08-13 08:55:25.639834 UTC---
| Itration            | 1221      |
| PAGAR Loss          | -1.87e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -5.32e+05 |
| Running Env Steps   | 6105000   |
| Running Forward KL  | 5.01      |
| Running Reverse KL  | 7.84      |
| Running Update Time | 1221      |
-----------------------------------
--2024-08-13 08:57:13.850610 UTC---
| Itration            | 1222      |
| PAGAR Loss          | 1.53e+06  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -6.68e+05 |
| Running Env Steps   | 6110000   |
| Running Forward KL  | 4.79      |
| Running Reverse KL  | 2.28      |
| Running Update Time | 1222      |
-----------------------------------
--2024-08-13 08:59:00.446759 UTC---
| Itration            | 1223      |
| PAGAR Loss          | -1.19e+06 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -4.09e+05 |
| Running Env Steps   | 6115000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 33.9      |
| Running Update Time | 1223      |
-----------------------------------
--2024-08-13 09:00:46.825568 UTC---
| Itration            | 1224      |
| PAGAR Loss          | -1.43e+06 |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.79e+06 |
| Running Env Steps   | 6120000   |
| Running Forward KL  | 5.05      |
| Running Reverse KL  | 37.6      |
| Running Update Time | 1224      |
-----------------------------------
--2024-08-13 09:02:34.903908 UTC---
| Itration            | 1225      |
| PAGAR Loss          | 7.31e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -1.44e+05 |
| Running Env Steps   | 6125000   |
| Running Forward KL  | 4.56      |
| Running Reverse KL  | 2.33      |
| Running Update Time | 1225      |
-----------------------------------
--2024-08-13 09:04:22.080897 UTC---
| Itration            | 1226      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 4.83e+03  |
| Reward Loss         | -7.12e+05 |
| Running Env Steps   | 6130000   |
| Running Forward KL  | 4.7       |
| Running Reverse KL  | 20.8      |
| Running Update Time | 1226      |
-----------------------------------
--2024-08-13 09:06:10.192444 UTC---
| Itration            | 1227      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -8.41e+05 |
| Running Env Steps   | 6135000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 18.8      |
| Running Update Time | 1227      |
-----------------------------------
--2024-08-13 09:07:56.452920 UTC---
| Itration            | 1228      |
| PAGAR Loss          | -1.74e+06 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.56e+03  |
| Reward Loss         | -2.48e+06 |
| Running Env Steps   | 6140000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 41.3      |
| Running Update Time | 1228      |
-----------------------------------
--2024-08-13 09:09:44.684459 UTC---
| Itration            | 1229      |
| PAGAR Loss          | -3.85e+05 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -4.43e+05 |
| Running Env Steps   | 6145000   |
| Running Forward KL  | 5.09      |
| Running Reverse KL  | 5.45      |
| Running Update Time | 1229      |
-----------------------------------
--2024-08-13 09:11:32.714326 UTC---
| Itration            | 1230      |
| PAGAR Loss          | 1.21e+08  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -3.93e+05 |
| Running Env Steps   | 6150000   |
| Running Forward KL  | 4.82      |
| Running Reverse KL  | 2.23      |
| Running Update Time | 1230      |
-----------------------------------
--2024-08-13 09:13:20.957109 UTC---
| Itration            | 1231      |
| PAGAR Loss          | 7.4e+04   |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -7.41e+05 |
| Running Env Steps   | 6155000   |
| Running Forward KL  | 4.83      |
| Running Reverse KL  | 2.56      |
| Running Update Time | 1231      |
-----------------------------------
--2024-08-13 09:15:08.479308 UTC---
| Itration            | 1232      |
| PAGAR Loss          | -5.2e+05  |
| Real Det Return     | 5.2e+03   |
| Real Sto Return     | 4.79e+03  |
| Reward Loss         | -8.69e+05 |
| Running Env Steps   | 6160000   |
| Running Forward KL  | 4.91      |
| Running Reverse KL  | 5.82      |
| Running Update Time | 1232      |
-----------------------------------
--2024-08-13 09:16:55.589143 UTC---
| Itration            | 1233      |
| PAGAR Loss          | 7.23e+04  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -8.64e+05 |
| Running Env Steps   | 6165000   |
| Running Forward KL  | 5.22      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 1233      |
-----------------------------------
--2024-08-13 09:18:43.550995 UTC--
| Itration            | 1234     |
| PAGAR Loss          | 4.05e+05 |
| Real Det Return     | 5.51e+03 |
| Real Sto Return     | 5.24e+03 |
| Reward Loss         | 9.71e+04 |
| Running Env Steps   | 6170000  |
| Running Forward KL  | 5.39     |
| Running Reverse KL  | 2.9      |
| Running Update Time | 1234     |
----------------------------------
--2024-08-13 09:20:31.247498 UTC---
| Itration            | 1235      |
| PAGAR Loss          | 4.76e+05  |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -2.12e+05 |
| Running Env Steps   | 6175000   |
| Running Forward KL  | 5.1       |
| Running Reverse KL  | 3.58      |
| Running Update Time | 1235      |
-----------------------------------
--2024-08-13 09:22:19.218890 UTC---
| Itration            | 1236      |
| PAGAR Loss          | -1.1e+05  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -4.15e+05 |
| Running Env Steps   | 6180000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 12.5      |
| Running Update Time | 1236      |
-----------------------------------
--2024-08-13 09:24:04.112710 UTC---
| Itration            | 1237      |
| PAGAR Loss          | -4.17e+05 |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 4.15e+03  |
| Reward Loss         | -1.46e+06 |
| Running Env Steps   | 6185000   |
| Running Forward KL  | 6.08      |
| Running Reverse KL  | 107       |
| Running Update Time | 1237      |
-----------------------------------
--2024-08-13 09:25:48.738321 UTC---
| Itration            | 1238      |
| PAGAR Loss          | -1.77e+06 |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 3.95e+03  |
| Reward Loss         | -3.83e+05 |
| Running Env Steps   | 6190000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 48        |
| Running Update Time | 1238      |
-----------------------------------
--2024-08-13 09:27:36.771943 UTC---
| Itration            | 1239      |
| PAGAR Loss          | -2.83e+06 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -6.41e+05 |
| Running Env Steps   | 6195000   |
| Running Forward KL  | 5.15      |
| Running Reverse KL  | 14.3      |
| Running Update Time | 1239      |
-----------------------------------
--2024-08-13 09:29:24.757398 UTC---
| Itration            | 1240      |
| PAGAR Loss          | 9.03e+04  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -5.38e+05 |
| Running Env Steps   | 6200000   |
| Running Forward KL  | 5.24      |
| Running Reverse KL  | 2.53      |
| Running Update Time | 1240      |
-----------------------------------
--2024-08-13 09:31:12.617458 UTC---
| Itration            | 1241      |
| PAGAR Loss          | -3.48e+05 |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -8.02e+05 |
| Running Env Steps   | 6205000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 48.6      |
| Running Update Time | 1241      |
-----------------------------------
--2024-08-13 09:33:00.726377 UTC---
| Itration            | 1242      |
| PAGAR Loss          | -1.48e+06 |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -5.7e+05  |
| Running Env Steps   | 6210000   |
| Running Forward KL  | 5.09      |
| Running Reverse KL  | 11.9      |
| Running Update Time | 1242      |
-----------------------------------
--2024-08-13 09:34:46.930525 UTC---
| Itration            | 1243      |
| PAGAR Loss          | 3.87e+05  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -9.45e+05 |
| Running Env Steps   | 6215000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 31.8      |
| Running Update Time | 1243      |
-----------------------------------
--2024-08-13 09:36:32.780287 UTC--
| Itration            | 1244     |
| PAGAR Loss          | 2.43e+05 |
| Real Det Return     | 5.44e+03 |
| Real Sto Return     | 5.15e+03 |
| Reward Loss         | -7.8e+05 |
| Running Env Steps   | 6220000  |
| Running Forward KL  | 5.09     |
| Running Reverse KL  | 22.7     |
| Running Update Time | 1244     |
----------------------------------
--2024-08-13 09:38:20.916432 UTC---
| Itration            | 1245      |
| PAGAR Loss          | -6.69e+04 |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -3.9e+05  |
| Running Env Steps   | 6225000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 3.73      |
| Running Update Time | 1245      |
-----------------------------------
--2024-08-13 09:40:08.883828 UTC---
| Itration            | 1246      |
| PAGAR Loss          | -1.76e+04 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -3.81e+05 |
| Running Env Steps   | 6230000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 33.8      |
| Running Update Time | 1246      |
-----------------------------------
--2024-08-13 09:41:56.948769 UTC--
| Itration            | 1247     |
| PAGAR Loss          | 3.28e+05 |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.37e+03 |
| Reward Loss         | -3e+03   |
| Running Env Steps   | 6235000  |
| Running Forward KL  | 5.03     |
| Running Reverse KL  | 2.75     |
| Running Update Time | 1247     |
----------------------------------
--2024-08-13 09:43:44.814640 UTC---
| Itration            | 1248      |
| PAGAR Loss          | -4.08e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -2.48e+05 |
| Running Env Steps   | 6240000   |
| Running Forward KL  | 5.58      |
| Running Reverse KL  | 16.9      |
| Running Update Time | 1248      |
-----------------------------------
--2024-08-13 09:45:33.020884 UTC---
| Itration            | 1249      |
| PAGAR Loss          | 1.94e+04  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -3.83e+05 |
| Running Env Steps   | 6245000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 3.6       |
| Running Update Time | 1249      |
-----------------------------------
--2024-08-13 09:47:21.214083 UTC---
| Itration            | 1250      |
| PAGAR Loss          | 4.17e+04  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.21e+03  |
| Reward Loss         | -8.45e+05 |
| Running Env Steps   | 6250000   |
| Running Forward KL  | 5.24      |
| Running Reverse KL  | 3.26      |
| Running Update Time | 1250      |
-----------------------------------
--2024-08-13 09:49:08.713623 UTC---
| Itration            | 1251      |
| PAGAR Loss          | 2.49e+04  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 4.92e+03  |
| Reward Loss         | -1.92e+05 |
| Running Env Steps   | 6255000   |
| Running Forward KL  | 5.16      |
| Running Reverse KL  | 2.82      |
| Running Update Time | 1251      |
-----------------------------------
--2024-08-13 09:50:56.682988 UTC---
| Itration            | 1252      |
| PAGAR Loss          | 2.91e+05  |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 6260000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 2.78      |
| Running Update Time | 1252      |
-----------------------------------
--2024-08-13 09:52:42.426829 UTC---
| Itration            | 1253      |
| PAGAR Loss          | 4.76e+06  |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -7.79e+05 |
| Running Env Steps   | 6265000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 59.5      |
| Running Update Time | 1253      |
-----------------------------------
--2024-08-13 09:54:30.274437 UTC---
| Itration            | 1254      |
| PAGAR Loss          | 4.83e+05  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -4.77e+04 |
| Running Env Steps   | 6270000   |
| Running Forward KL  | 4.84      |
| Running Reverse KL  | 2.63      |
| Running Update Time | 1254      |
-----------------------------------
--2024-08-13 09:56:18.100633 UTC---
| Itration            | 1255      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.17e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -2.55e+06 |
| Running Env Steps   | 6275000   |
| Running Forward KL  | 6.34      |
| Running Reverse KL  | 108       |
| Running Update Time | 1255      |
-----------------------------------
--2024-08-13 09:58:05.990361 UTC---
| Itration            | 1256      |
| PAGAR Loss          | 2.85e+04  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.21e+06 |
| Running Env Steps   | 6280000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 58.1      |
| Running Update Time | 1256      |
-----------------------------------
--2024-08-13 09:59:53.280913 UTC---
| Itration            | 1257      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.37e+05 |
| Running Env Steps   | 6285000   |
| Running Forward KL  | 5.46      |
| Running Reverse KL  | 34.9      |
| Running Update Time | 1257      |
-----------------------------------
--2024-08-13 10:01:36.731938 UTC---
| Itration            | 1258      |
| PAGAR Loss          | -1.13e+07 |
| Real Det Return     | 4.98e+03  |
| Real Sto Return     | 3.71e+03  |
| Reward Loss         | -1.88e+06 |
| Running Env Steps   | 6290000   |
| Running Forward KL  | 6.71      |
| Running Reverse KL  | 103       |
| Running Update Time | 1258      |
-----------------------------------
--2024-08-13 10:03:24.247716 UTC---
| Itration            | 1259      |
| PAGAR Loss          | 3.15e+06  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -2.51e+05 |
| Running Env Steps   | 6295000   |
| Running Forward KL  | 4.63      |
| Running Reverse KL  | 2.03      |
| Running Update Time | 1259      |
-----------------------------------
--2024-08-13 10:05:10.036248 UTC---
| Itration            | 1260      |
| PAGAR Loss          | -5.03e+04 |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 4.19e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 6300000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 68        |
| Running Update Time | 1260      |
-----------------------------------
--2024-08-13 10:06:57.913774 UTC---
| Itration            | 1261      |
| PAGAR Loss          | -1.11e+06 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.52e+06 |
| Running Env Steps   | 6305000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 27.5      |
| Running Update Time | 1261      |
-----------------------------------
--2024-08-13 10:08:46.158021 UTC---
| Itration            | 1262      |
| PAGAR Loss          | -2.21e+05 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -1.38e+06 |
| Running Env Steps   | 6310000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 43.4      |
| Running Update Time | 1262      |
-----------------------------------
--2024-08-13 10:10:34.334409 UTC---
| Itration            | 1263      |
| PAGAR Loss          | -1.07e+04 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | 9.02e+04  |
| Running Env Steps   | 6315000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 2.97      |
| Running Update Time | 1263      |
-----------------------------------
--2024-08-13 10:12:21.841460 UTC--
| Itration            | 1264     |
| PAGAR Loss          | 4.43e+04 |
| Real Det Return     | 5.41e+03 |
| Real Sto Return     | 4.92e+03 |
| Reward Loss         | -9.1e+05 |
| Running Env Steps   | 6320000  |
| Running Forward KL  | 5.43     |
| Running Reverse KL  | 13.7     |
| Running Update Time | 1264     |
----------------------------------
--2024-08-13 10:14:08.992830 UTC---
| Itration            | 1265      |
| PAGAR Loss          | -1.22e+05 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -3.26e+04 |
| Running Env Steps   | 6325000   |
| Running Forward KL  | 4.77      |
| Running Reverse KL  | 2.38      |
| Running Update Time | 1265      |
-----------------------------------
--2024-08-13 10:15:52.533730 UTC---
| Itration            | 1266      |
| PAGAR Loss          | -5.69e+06 |
| Real Det Return     | 4.35e+03  |
| Real Sto Return     | 3.92e+03  |
| Reward Loss         | -6.73e+05 |
| Running Env Steps   | 6330000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 54.5      |
| Running Update Time | 1266      |
-----------------------------------
--2024-08-13 10:17:38.866337 UTC---
| Itration            | 1267      |
| PAGAR Loss          | -3.92e+06 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -2.41e+05 |
| Running Env Steps   | 6335000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 8.71      |
| Running Update Time | 1267      |
-----------------------------------
--2024-08-13 10:19:23.912208 UTC---
| Itration            | 1268      |
| PAGAR Loss          | -8.37e+05 |
| Real Det Return     | 4.7e+03   |
| Real Sto Return     | 4.81e+03  |
| Reward Loss         | -2.3e+06  |
| Running Env Steps   | 6340000   |
| Running Forward KL  | 6.24      |
| Running Reverse KL  | 69.3      |
| Running Update Time | 1268      |
-----------------------------------
--2024-08-13 10:21:09.907572 UTC---
| Itration            | 1269      |
| PAGAR Loss          | -6.93e+05 |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 4.57e+03  |
| Reward Loss         | -5.69e+05 |
| Running Env Steps   | 6345000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 30.8      |
| Running Update Time | 1269      |
-----------------------------------
--2024-08-13 10:22:57.848671 UTC---
| Itration            | 1270      |
| PAGAR Loss          | 1.15e+05  |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -8.48e+05 |
| Running Env Steps   | 6350000   |
| Running Forward KL  | 5.57      |
| Running Reverse KL  | 38.5      |
| Running Update Time | 1270      |
-----------------------------------
--2024-08-13 10:24:44.938695 UTC--
| Itration            | 1271     |
| PAGAR Loss          | 1.74e+04 |
| Real Det Return     | 5.63e+03 |
| Real Sto Return     | 4.84e+03 |
| Reward Loss         | 1.82e+05 |
| Running Env Steps   | 6355000  |
| Running Forward KL  | 5.23     |
| Running Reverse KL  | 2.92     |
| Running Update Time | 1271     |
----------------------------------
--2024-08-13 10:26:32.128121 UTC---
| Itration            | 1272      |
| PAGAR Loss          | -3.7e+05  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.78e+03  |
| Reward Loss         | -1.62e+05 |
| Running Env Steps   | 6360000   |
| Running Forward KL  | 5.11      |
| Running Reverse KL  | 2.51      |
| Running Update Time | 1272      |
-----------------------------------
--2024-08-13 10:28:19.762383 UTC---
| Itration            | 1273      |
| PAGAR Loss          | -7.46e+06 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -8.88e+05 |
| Running Env Steps   | 6365000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 25.7      |
| Running Update Time | 1273      |
-----------------------------------
--2024-08-13 10:30:07.675661 UTC---
| Itration            | 1274      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -1.91e+05 |
| Running Env Steps   | 6370000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 2.69      |
| Running Update Time | 1274      |
-----------------------------------
--2024-08-13 10:31:55.471409 UTC---
| Itration            | 1275      |
| PAGAR Loss          | -1.82e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -2.83e+05 |
| Running Env Steps   | 6375000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 3.29      |
| Running Update Time | 1275      |
-----------------------------------
--2024-08-13 10:34:00.085634 UTC---
| Itration            | 1276      |
| PAGAR Loss          | -5.51e+03 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -1.52e+04 |
| Running Env Steps   | 6380000   |
| Running Forward KL  | 5.61      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 1276      |
-----------------------------------
--2024-08-13 10:36:20.077826 UTC---
| Itration            | 1277      |
| PAGAR Loss          | 5.71e+05  |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.25e+05 |
| Running Env Steps   | 6385000   |
| Running Forward KL  | 5.16      |
| Running Reverse KL  | 3.42      |
| Running Update Time | 1277      |
-----------------------------------
--2024-08-13 10:38:38.032463 UTC---
| Itration            | 1278      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.67e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -1.77e+06 |
| Running Env Steps   | 6390000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 64.5      |
| Running Update Time | 1278      |
-----------------------------------
--2024-08-13 10:40:54.579934 UTC---
| Itration            | 1279      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.19e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -1.14e+06 |
| Running Env Steps   | 6395000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 66.4      |
| Running Update Time | 1279      |
-----------------------------------
--2024-08-13 10:43:13.380339 UTC---
| Itration            | 1280      |
| PAGAR Loss          | 9.34e+04  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -2.46e+05 |
| Running Env Steps   | 6400000   |
| Running Forward KL  | 4.68      |
| Running Reverse KL  | 2.39      |
| Running Update Time | 1280      |
-----------------------------------
--2024-08-13 10:45:33.434830 UTC---
| Itration            | 1281      |
| PAGAR Loss          | -2.08e+04 |
| Real Det Return     | 5.23e+03  |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 6405000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 32.7      |
| Running Update Time | 1281      |
-----------------------------------
--2024-08-13 10:47:50.812839 UTC---
| Itration            | 1282      |
| PAGAR Loss          | 1.62e+05  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -3.45e+05 |
| Running Env Steps   | 6410000   |
| Running Forward KL  | 5.13      |
| Running Reverse KL  | 3.35      |
| Running Update Time | 1282      |
-----------------------------------
--2024-08-13 10:49:57.864833 UTC---
| Itration            | 1283      |
| PAGAR Loss          | 2.34e+05  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.45e+04 |
| Running Env Steps   | 6415000   |
| Running Forward KL  | 5.04      |
| Running Reverse KL  | 3.15      |
| Running Update Time | 1283      |
-----------------------------------
--2024-08-13 10:51:46.167706 UTC---
| Itration            | 1284      |
| PAGAR Loss          | 4.99e+05  |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -1.26e+05 |
| Running Env Steps   | 6420000   |
| Running Forward KL  | 4.78      |
| Running Reverse KL  | 3.13      |
| Running Update Time | 1284      |
-----------------------------------
--2024-08-13 10:53:33.856797 UTC--
| Itration            | 1285     |
| PAGAR Loss          | 1.13e+05 |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.17e+03 |
| Reward Loss         | -1.3e+05 |
| Running Env Steps   | 6425000  |
| Running Forward KL  | 5.31     |
| Running Reverse KL  | 3.16     |
| Running Update Time | 1285     |
----------------------------------
--2024-08-13 10:55:21.910501 UTC---
| Itration            | 1286      |
| PAGAR Loss          | -3.29e+07 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.28e+06 |
| Running Env Steps   | 6430000   |
| Running Forward KL  | 5.31      |
| Running Reverse KL  | 37.1      |
| Running Update Time | 1286      |
-----------------------------------
--2024-08-13 10:57:08.670175 UTC---
| Itration            | 1287      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.8e+03   |
| Reward Loss         | -3.21e+06 |
| Running Env Steps   | 6435000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 116       |
| Running Update Time | 1287      |
-----------------------------------
--2024-08-13 10:58:56.851856 UTC---
| Itration            | 1288      |
| PAGAR Loss          | -5.72e+05 |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -4.04e+05 |
| Running Env Steps   | 6440000   |
| Running Forward KL  | 4.86      |
| Running Reverse KL  | 14.3      |
| Running Update Time | 1288      |
-----------------------------------
--2024-08-13 11:00:44.760884 UTC---
| Itration            | 1289      |
| PAGAR Loss          | -3.65e+06 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -1.43e+04 |
| Running Env Steps   | 6445000   |
| Running Forward KL  | 5.21      |
| Running Reverse KL  | 16.4      |
| Running Update Time | 1289      |
-----------------------------------
--2024-08-13 11:02:32.782727 UTC---
| Itration            | 1290      |
| PAGAR Loss          | 8.82e+04  |
| Real Det Return     | 5.26e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 6450000   |
| Running Forward KL  | 5.6       |
| Running Reverse KL  | 11.1      |
| Running Update Time | 1290      |
-----------------------------------
--2024-08-13 11:04:20.946066 UTC---
| Itration            | 1291      |
| PAGAR Loss          | -1.35e+05 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.54e+03  |
| Reward Loss         | -1.25e+06 |
| Running Env Steps   | 6455000   |
| Running Forward KL  | 5.67      |
| Running Reverse KL  | 65.4      |
| Running Update Time | 1291      |
-----------------------------------
--2024-08-13 11:06:40.503324 UTC---
| Itration            | 1292      |
| PAGAR Loss          | -7.57e+05 |
| Real Det Return     | 5.1e+03   |
| Real Sto Return     | 4.42e+03  |
| Reward Loss         | -2.09e+06 |
| Running Env Steps   | 6460000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 5.23      |
| Running Update Time | 1292      |
-----------------------------------
--2024-08-13 11:08:54.121551 UTC---
| Itration            | 1293      |
| PAGAR Loss          | -3.65e+06 |
| Real Det Return     | 2.92e+03  |
| Real Sto Return     | 4.82e+03  |
| Reward Loss         | -1e+06    |
| Running Env Steps   | 6465000   |
| Running Forward KL  | 5.73      |
| Running Reverse KL  | 41.6      |
| Running Update Time | 1293      |
-----------------------------------
--2024-08-13 11:11:02.946709 UTC---
| Itration            | 1294      |
| PAGAR Loss          | -4.11e+04 |
| Real Det Return     | 2.55e+03  |
| Real Sto Return     | 3.56e+03  |
| Reward Loss         | -4.78e+06 |
| Running Env Steps   | 6470000   |
| Running Forward KL  | 6.96      |
| Running Reverse KL  | 158       |
| Running Update Time | 1294      |
-----------------------------------
--2024-08-13 11:13:21.836515 UTC---
| Itration            | 1295      |
| PAGAR Loss          | 4.88e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -4.63e+05 |
| Running Env Steps   | 6475000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 2.93      |
| Running Update Time | 1295      |
-----------------------------------
--2024-08-13 11:15:41.583740 UTC---
| Itration            | 1296      |
| PAGAR Loss          | -1.4e+05  |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -4.97e+05 |
| Running Env Steps   | 6480000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 34.7      |
| Running Update Time | 1296      |
-----------------------------------
--2024-08-13 11:17:51.415116 UTC---
| Itration            | 1297      |
| PAGAR Loss          | 3.63e+04  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -8.24e+04 |
| Running Env Steps   | 6485000   |
| Running Forward KL  | 5.24      |
| Running Reverse KL  | 3.3       |
| Running Update Time | 1297      |
-----------------------------------
--2024-08-13 11:19:38.985902 UTC---
| Itration            | 1298      |
| PAGAR Loss          | -1.42e+05 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -8.93e+05 |
| Running Env Steps   | 6490000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 37        |
| Running Update Time | 1298      |
-----------------------------------
--2024-08-13 11:21:24.930019 UTC---
| Itration            | 1299      |
| PAGAR Loss          | 1.87e+05  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -1.43e+05 |
| Running Env Steps   | 6495000   |
| Running Forward KL  | 5.29      |
| Running Reverse KL  | 4.4       |
| Running Update Time | 1299      |
-----------------------------------
--2024-08-13 11:23:10.392420 UTC---
| Itration            | 1300      |
| PAGAR Loss          | -1.79e+04 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.15e+05 |
| Running Env Steps   | 6500000   |
| Running Forward KL  | 5.43      |
| Running Reverse KL  | 3.93      |
| Running Update Time | 1300      |
-----------------------------------
--2024-08-13 11:24:56.973644 UTC---
| Itration            | 1301      |
| PAGAR Loss          | 3.04e+05  |
| Real Det Return     | 5.32e+03  |
| Real Sto Return     | 5.01e+03  |
| Reward Loss         | -5.06e+05 |
| Running Env Steps   | 6505000   |
| Running Forward KL  | 4.86      |
| Running Reverse KL  | 3.18      |
| Running Update Time | 1301      |
-----------------------------------
--2024-08-13 11:26:44.400213 UTC---
| Itration            | 1302      |
| PAGAR Loss          | 3.54e+03  |
| Real Det Return     | 5.28e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.51e+06 |
| Running Env Steps   | 6510000   |
| Running Forward KL  | 5.75      |
| Running Reverse KL  | 55.8      |
| Running Update Time | 1302      |
-----------------------------------
--2024-08-13 11:28:31.620628 UTC--
| Itration            | 1303     |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.5e+03  |
| Real Sto Return     | 5e+03    |
| Reward Loss         | -5.6e+05 |
| Running Env Steps   | 6515000  |
| Running Forward KL  | 5.41     |
| Running Reverse KL  | 36.3     |
| Running Update Time | 1303     |
----------------------------------
--2024-08-13 11:30:18.753784 UTC---
| Itration            | 1304      |
| PAGAR Loss          | 1.75e+05  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -2.72e+05 |
| Running Env Steps   | 6520000   |
| Running Forward KL  | 4.92      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 1304      |
-----------------------------------
--2024-08-13 11:32:06.541910 UTC---
| Itration            | 1305      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -1.24e+06 |
| Running Env Steps   | 6525000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 18        |
| Running Update Time | 1305      |
-----------------------------------
--2024-08-13 11:33:53.168933 UTC--
| Itration            | 1306     |
| PAGAR Loss          | 1.47e+06 |
| Real Det Return     | 5.22e+03 |
| Real Sto Return     | 5.03e+03 |
| Reward Loss         | 4.4e+04  |
| Running Env Steps   | 6530000  |
| Running Forward KL  | 5.32     |
| Running Reverse KL  | 2.84     |
| Running Update Time | 1306     |
----------------------------------
--2024-08-13 11:35:41.416642 UTC---
| Itration            | 1307      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -6.26e+05 |
| Running Env Steps   | 6535000   |
| Running Forward KL  | 5.33      |
| Running Reverse KL  | 19.2      |
| Running Update Time | 1307      |
-----------------------------------
--2024-08-13 11:37:29.115776 UTC---
| Itration            | 1308      |
| PAGAR Loss          | 1.61e+05  |
| Real Det Return     | 5.33e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -2.53e+06 |
| Running Env Steps   | 6540000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 68.9      |
| Running Update Time | 1308      |
-----------------------------------
--2024-08-13 11:39:16.833319 UTC---
| Itration            | 1309      |
| PAGAR Loss          | 9.73e+04  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -3.88e+05 |
| Running Env Steps   | 6545000   |
| Running Forward KL  | 5.26      |
| Running Reverse KL  | 3.82      |
| Running Update Time | 1309      |
-----------------------------------
--2024-08-13 11:41:04.709069 UTC---
| Itration            | 1310      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.15e+03  |
| Reward Loss         | -2.19e+05 |
| Running Env Steps   | 6550000   |
| Running Forward KL  | 5.25      |
| Running Reverse KL  | 3.37      |
| Running Update Time | 1310      |
-----------------------------------
--2024-08-13 11:42:52.882080 UTC---
| Itration            | 1311      |
| PAGAR Loss          | 4.17e+04  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -2.37e+05 |
| Running Env Steps   | 6555000   |
| Running Forward KL  | 4.76      |
| Running Reverse KL  | 2.59      |
| Running Update Time | 1311      |
-----------------------------------
--2024-08-13 11:44:41.158648 UTC---
| Itration            | 1312      |
| PAGAR Loss          | 5.45e+04  |
| Real Det Return     | 5.35e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -4.37e+05 |
| Running Env Steps   | 6560000   |
| Running Forward KL  | 5.12      |
| Running Reverse KL  | 3.02      |
| Running Update Time | 1312      |
-----------------------------------
--2024-08-13 11:46:29.060909 UTC---
| Itration            | 1313      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.49e+04 |
| Running Env Steps   | 6565000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 23        |
| Running Update Time | 1313      |
-----------------------------------
--2024-08-13 11:48:16.786295 UTC---
| Itration            | 1314      |
| PAGAR Loss          | 1.49e+06  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -9.59e+04 |
| Running Env Steps   | 6570000   |
| Running Forward KL  | 5.3       |
| Running Reverse KL  | 3.46      |
| Running Update Time | 1314      |
-----------------------------------
--2024-08-13 11:50:04.743554 UTC---
| Itration            | 1315      |
| PAGAR Loss          | 1.2e+04   |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -5.23e+05 |
| Running Env Steps   | 6575000   |
| Running Forward KL  | 5.14      |
| Running Reverse KL  | 3.09      |
| Running Update Time | 1315      |
-----------------------------------
--2024-08-13 11:51:52.710405 UTC---
| Itration            | 1316      |
| PAGAR Loss          | 7.8e+05   |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.48e+06 |
| Running Env Steps   | 6580000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 40.6      |
| Running Update Time | 1316      |
-----------------------------------
--2024-08-13 11:53:38.646291 UTC---
| Itration            | 1317      |
| PAGAR Loss          | 6.19e+04  |
| Real Det Return     | 4.46e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -5.73e+05 |
| Running Env Steps   | 6585000   |
| Running Forward KL  | 5.34      |
| Running Reverse KL  | 3.67      |
| Running Update Time | 1317      |
-----------------------------------
--2024-08-13 11:55:26.836283 UTC---
| Itration            | 1318      |
| PAGAR Loss          | -1.1e+03  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -3.59e+05 |
| Running Env Steps   | 6590000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 3.73      |
| Running Update Time | 1318      |
-----------------------------------
--2024-08-13 11:57:14.954485 UTC---
| Itration            | 1319      |
| PAGAR Loss          | 6.7e+08   |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.09e+06 |
| Running Env Steps   | 6595000   |
| Running Forward KL  | 5.5       |
| Running Reverse KL  | 3.48      |
| Running Update Time | 1319      |
-----------------------------------
--2024-08-13 11:59:03.081187 UTC---
| Itration            | 1320      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.23e+05 |
| Running Env Steps   | 6600000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 3.87      |
| Running Update Time | 1320      |
-----------------------------------
--2024-08-13 12:00:50.980524 UTC---
| Itration            | 1321      |
| PAGAR Loss          | 5.52e+05  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -1.76e+05 |
| Running Env Steps   | 6605000   |
| Running Forward KL  | 5.02      |
| Running Reverse KL  | 3.22      |
| Running Update Time | 1321      |
-----------------------------------
--2024-08-13 12:02:38.969546 UTC---
| Itration            | 1322      |
| PAGAR Loss          | 4.11e+06  |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -5.55e+04 |
| Running Env Steps   | 6610000   |
| Running Forward KL  | 5.17      |
| Running Reverse KL  | 3.39      |
| Running Update Time | 1322      |
-----------------------------------
--2024-08-13 12:04:26.126160 UTC---
| Itration            | 1323      |
| PAGAR Loss          | 1.73e+06  |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 5.11e+03  |
| Reward Loss         | -1.46e+05 |
| Running Env Steps   | 6615000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 3.85      |
| Running Update Time | 1323      |
-----------------------------------
--2024-08-13 12:06:14.198735 UTC---
| Itration            | 1324      |
| PAGAR Loss          | -1.6e+06  |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 6620000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 71.9      |
| Running Update Time | 1324      |
-----------------------------------
--2024-08-13 12:08:02.483318 UTC---
| Itration            | 1325      |
| PAGAR Loss          | -2.99e+09 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -3.72e+05 |
| Running Env Steps   | 6625000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 15.8      |
| Running Update Time | 1325      |
-----------------------------------
--2024-08-13 12:09:50.734747 UTC---
| Itration            | 1326      |
| PAGAR Loss          | 5.15e+05  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -3.89e+05 |
| Running Env Steps   | 6630000   |
| Running Forward KL  | 5.39      |
| Running Reverse KL  | 4.42      |
| Running Update Time | 1326      |
-----------------------------------
--2024-08-13 12:11:38.952226 UTC---
| Itration            | 1327      |
| PAGAR Loss          | 3.63e+05  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -3.74e+05 |
| Running Env Steps   | 6635000   |
| Running Forward KL  | 5.42      |
| Running Reverse KL  | 3.11      |
| Running Update Time | 1327      |
-----------------------------------
--2024-08-13 12:13:27.040367 UTC---
| Itration            | 1328      |
| PAGAR Loss          | 6.77e+04  |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -2.54e+05 |
| Running Env Steps   | 6640000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 4.68      |
| Running Update Time | 1328      |
-----------------------------------
--2024-08-13 12:15:15.908904 UTC---
| Itration            | 1329      |
| PAGAR Loss          | 1.4e+06   |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.08e+04 |
| Running Env Steps   | 6645000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 1329      |
-----------------------------------
--2024-08-13 12:17:04.142950 UTC---
| Itration            | 1330      |
| PAGAR Loss          | 5.92e+04  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -4.82e+05 |
| Running Env Steps   | 6650000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 2.99      |
| Running Update Time | 1330      |
-----------------------------------
--2024-08-13 12:18:51.807640 UTC---
| Itration            | 1331      |
| PAGAR Loss          | -1.67e+05 |
| Real Det Return     | 5.64e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -9.11e+04 |
| Running Env Steps   | 6655000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 16.9      |
| Running Update Time | 1331      |
-----------------------------------
--2024-08-13 12:20:40.062815 UTC--
| Itration            | 1332     |
| PAGAR Loss          | 1.6e+05  |
| Real Det Return     | 5.66e+03 |
| Real Sto Return     | 5.49e+03 |
| Reward Loss         | 1.02e+05 |
| Running Env Steps   | 6660000  |
| Running Forward KL  | 5.63     |
| Running Reverse KL  | 3.57     |
| Running Update Time | 1332     |
----------------------------------
--2024-08-13 12:22:28.065020 UTC---
| Itration            | 1333      |
| PAGAR Loss          | 4.03e+05  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -5.06e+05 |
| Running Env Steps   | 6665000   |
| Running Forward KL  | 5.75      |
| Running Reverse KL  | 4.09      |
| Running Update Time | 1333      |
-----------------------------------
--2024-08-13 12:24:15.210076 UTC---
| Itration            | 1334      |
| PAGAR Loss          | -1.13e+06 |
| Real Det Return     | 5.03e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -4.66e+05 |
| Running Env Steps   | 6670000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 12.9      |
| Running Update Time | 1334      |
-----------------------------------
--2024-08-13 12:26:03.150919 UTC--
| Itration            | 1335     |
| PAGAR Loss          | 3.37e+04 |
| Real Det Return     | 5.56e+03 |
| Real Sto Return     | 5.35e+03 |
| Reward Loss         | 2.25e+05 |
| Running Env Steps   | 6675000  |
| Running Forward KL  | 5.15     |
| Running Reverse KL  | 3.58     |
| Running Update Time | 1335     |
----------------------------------
--2024-08-13 12:27:51.057741 UTC--
| Itration            | 1336     |
| PAGAR Loss          | 7.95e+07 |
| Real Det Return     | 5.51e+03 |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -4.2e+05 |
| Running Env Steps   | 6680000  |
| Running Forward KL  | 5.75     |
| Running Reverse KL  | 4.49     |
| Running Update Time | 1336     |
----------------------------------
--2024-08-13 12:29:39.138586 UTC---
| Itration            | 1337      |
| PAGAR Loss          | -4.51e+04 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -3.14e+06 |
| Running Env Steps   | 6685000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 63.6      |
| Running Update Time | 1337      |
-----------------------------------
--2024-08-13 12:31:26.905527 UTC---
| Itration            | 1338      |
| PAGAR Loss          | 3.06e+03  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.93e+05 |
| Running Env Steps   | 6690000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 4.41      |
| Running Update Time | 1338      |
-----------------------------------
--2024-08-13 12:33:17.023623 UTC---
| Itration            | 1339      |
| PAGAR Loss          | 1.16e+05  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.42e+05 |
| Running Env Steps   | 6695000   |
| Running Forward KL  | 5.45      |
| Running Reverse KL  | 3.94      |
| Running Update Time | 1339      |
-----------------------------------
--2024-08-13 12:35:36.523703 UTC---
| Itration            | 1340      |
| PAGAR Loss          | 1.86e+04  |
| Real Det Return     | 5.29e+03  |
| Real Sto Return     | 5.3e+03   |
| Reward Loss         | -7.38e+05 |
| Running Env Steps   | 6700000   |
| Running Forward KL  | 5.89      |
| Running Reverse KL  | 7.69      |
| Running Update Time | 1340      |
-----------------------------------
--2024-08-13 12:37:55.725706 UTC--
| Itration            | 1341     |
| PAGAR Loss          | 3.9e+05  |
| Real Det Return     | 5.4e+03  |
| Real Sto Return     | 5.23e+03 |
| Reward Loss         | -9.1e+05 |
| Running Env Steps   | 6705000  |
| Running Forward KL  | 5.41     |
| Running Reverse KL  | 4.31     |
| Running Update Time | 1341     |
----------------------------------
--2024-08-13 12:40:15.122325 UTC---
| Itration            | 1342      |
| PAGAR Loss          | -1.5e+04  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -3.72e+05 |
| Running Env Steps   | 6710000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 4.18      |
| Running Update Time | 1342      |
-----------------------------------
--2024-08-13 12:42:29.568877 UTC---
| Itration            | 1343      |
| PAGAR Loss          | 1.04e+06  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -9.76e+05 |
| Running Env Steps   | 6715000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 4.21      |
| Running Update Time | 1343      |
-----------------------------------
--2024-08-13 12:44:46.990437 UTC---
| Itration            | 1344      |
| PAGAR Loss          | 2.83e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -6.43e+05 |
| Running Env Steps   | 6720000   |
| Running Forward KL  | 5.63      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 1344      |
-----------------------------------
--2024-08-13 12:47:07.039251 UTC---
| Itration            | 1345      |
| PAGAR Loss          | 1.12e+05  |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -3.41e+05 |
| Running Env Steps   | 6725000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 1345      |
-----------------------------------
--2024-08-13 12:49:26.075256 UTC---
| Itration            | 1346      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.69e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -3.47e+05 |
| Running Env Steps   | 6730000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 8.17      |
| Running Update Time | 1346      |
-----------------------------------
--2024-08-13 12:51:32.367191 UTC---
| Itration            | 1347      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -1.72e+06 |
| Running Env Steps   | 6735000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 35.3      |
| Running Update Time | 1347      |
-----------------------------------
--2024-08-13 12:53:17.225178 UTC---
| Itration            | 1348      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 3.97e+03  |
| Reward Loss         | -7.31e+06 |
| Running Env Steps   | 6740000   |
| Running Forward KL  | 7.9       |
| Running Reverse KL  | 195       |
| Running Update Time | 1348      |
-----------------------------------
--2024-08-13 12:54:49.229330 UTC---
| Itration            | 1349      |
| PAGAR Loss          | nan       |
| Real Det Return     | 419       |
| Real Sto Return     | 369       |
| Reward Loss         | -2.67e+07 |
| Running Env Steps   | 6745000   |
| Running Forward KL  | 18.2      |
| Running Reverse KL  | 357       |
| Running Update Time | 1349      |
-----------------------------------
--2024-08-13 12:56:44.873148 UTC---
| Itration            | 1350      |
| PAGAR Loss          | nan       |
| Real Det Return     | 995       |
| Real Sto Return     | 594       |
| Reward Loss         | -2.46e+07 |
| Running Env Steps   | 6750000   |
| Running Forward KL  | 21.4      |
| Running Reverse KL  | 249       |
| Running Update Time | 1350      |
-----------------------------------
--2024-08-13 12:58:15.405064 UTC---
| Itration            | 1351      |
| PAGAR Loss          | nan       |
| Real Det Return     | 567       |
| Real Sto Return     | 476       |
| Reward Loss         | -3.25e+07 |
| Running Env Steps   | 6755000   |
| Running Forward KL  | 18.4      |
| Running Reverse KL  | 335       |
| Running Update Time | 1351      |
-----------------------------------
--2024-08-13 12:59:42.600242 UTC---
| Itration            | 1352      |
| PAGAR Loss          | -4.97e+09 |
| Real Det Return     | 195       |
| Real Sto Return     | 213       |
| Reward Loss         | -2.82e+07 |
| Running Env Steps   | 6760000   |
| Running Forward KL  | 22.3      |
| Running Reverse KL  | 361       |
| Running Update Time | 1352      |
-----------------------------------
--2024-08-13 13:01:18.373642 UTC---
| Itration            | 1353      |
| PAGAR Loss          | nan       |
| Real Det Return     | 2.46e+03  |
| Real Sto Return     | 2.26e+03  |
| Reward Loss         | -7.99e+06 |
| Running Env Steps   | 6765000   |
| Running Forward KL  | 11.7      |
| Running Reverse KL  | 234       |
| Running Update Time | 1353      |
-----------------------------------
--2024-08-13 13:03:35.411433 UTC---
| Itration            | 1354      |
| PAGAR Loss          | 2.08e+06  |
| Real Det Return     | 4.13e+03  |
| Real Sto Return     | 4.77e+03  |
| Reward Loss         | -1.87e+06 |
| Running Env Steps   | 6770000   |
| Running Forward KL  | 5.56      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 1354      |
-----------------------------------
--2024-08-13 13:05:53.234109 UTC---
| Itration            | 1355      |
| PAGAR Loss          | -1.4e+06  |
| Real Det Return     | 4.82e+03  |
| Real Sto Return     | 4.47e+03  |
| Reward Loss         | -1.06e+06 |
| Running Env Steps   | 6775000   |
| Running Forward KL  | 5.84      |
| Running Reverse KL  | 14.6      |
| Running Update Time | 1355      |
-----------------------------------
--2024-08-13 13:08:08.027867 UTC---
| Itration            | 1356      |
| PAGAR Loss          | -8.42e+05 |
| Real Det Return     | 4e+03     |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -1.75e+06 |
| Running Env Steps   | 6780000   |
| Running Forward KL  | 6.87      |
| Running Reverse KL  | 101       |
| Running Update Time | 1356      |
-----------------------------------
--2024-08-13 13:10:24.844020 UTC--
| Itration            | 1357     |
| PAGAR Loss          | 3.4e+05  |
| Real Det Return     | 5.23e+03 |
| Real Sto Return     | 4.61e+03 |
| Reward Loss         | -1.6e+06 |
| Running Env Steps   | 6785000  |
| Running Forward KL  | 5.97     |
| Running Reverse KL  | 33.6     |
| Running Update Time | 1357     |
----------------------------------
--2024-08-13 13:12:44.194409 UTC---
| Itration            | 1358      |
| PAGAR Loss          | 6.45e+04  |
| Real Det Return     | 5.4e+03   |
| Real Sto Return     | 4.89e+03  |
| Reward Loss         | -1.37e+06 |
| Running Env Steps   | 6790000   |
| Running Forward KL  | 5.7       |
| Running Reverse KL  | 29.8      |
| Running Update Time | 1358      |
-----------------------------------
--2024-08-13 13:15:02.777382 UTC---
| Itration            | 1359      |
| PAGAR Loss          | -3.08e+05 |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -1.73e+06 |
| Running Env Steps   | 6795000   |
| Running Forward KL  | 5.71      |
| Running Reverse KL  | 37        |
| Running Update Time | 1359      |
-----------------------------------
--2024-08-13 13:17:12.442424 UTC---
| Itration            | 1360      |
| PAGAR Loss          | -2.66e+06 |
| Real Det Return     | 1.85e+03  |
| Real Sto Return     | 3.12e+03  |
| Reward Loss         | -9.61e+05 |
| Running Env Steps   | 6800000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 50.4      |
| Running Update Time | 1360      |
-----------------------------------
--2024-08-13 13:19:31.278178 UTC---
| Itration            | 1361      |
| PAGAR Loss          | 7.69e+05  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.22e+03  |
| Reward Loss         | -2.02e+05 |
| Running Env Steps   | 6805000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 42.9      |
| Running Update Time | 1361      |
-----------------------------------
--2024-08-13 13:21:45.869729 UTC---
| Itration            | 1362      |
| PAGAR Loss          | 1.31e+06  |
| Real Det Return     | 3.29e+03  |
| Real Sto Return     | 4.39e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 6810000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 35.3      |
| Running Update Time | 1362      |
-----------------------------------
--2024-08-13 13:23:51.792001 UTC---
| Itration            | 1363      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.15e+03  |
| Real Sto Return     | 4.66e+03  |
| Reward Loss         | -2.67e+06 |
| Running Env Steps   | 6815000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 68.3      |
| Running Update Time | 1363      |
-----------------------------------
--2024-08-13 13:25:38.646704 UTC---
| Itration            | 1364      |
| PAGAR Loss          | -1.75e+06 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.68e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 6820000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 34.3      |
| Running Update Time | 1364      |
-----------------------------------
--2024-08-13 13:27:25.893376 UTC--
| Itration            | 1365     |
| PAGAR Loss          | 1.4e+06  |
| Real Det Return     | 5.52e+03 |
| Real Sto Return     | 4.88e+03 |
| Reward Loss         | -2.8e+06 |
| Running Env Steps   | 6825000  |
| Running Forward KL  | 6.08     |
| Running Reverse KL  | 73.2     |
| Running Update Time | 1365     |
----------------------------------
--2024-08-13 13:29:14.184384 UTC---
| Itration            | 1366      |
| PAGAR Loss          | 1.97e+06  |
| Real Det Return     | 5.25e+03  |
| Real Sto Return     | 5.03e+03  |
| Reward Loss         | -7.96e+05 |
| Running Env Steps   | 6830000   |
| Running Forward KL  | 5.83      |
| Running Reverse KL  | 4.05      |
| Running Update Time | 1366      |
-----------------------------------
--2024-08-13 13:31:02.313545 UTC---
| Itration            | 1367      |
| PAGAR Loss          | 1.95e+05  |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -2.41e+05 |
| Running Env Steps   | 6835000   |
| Running Forward KL  | 5.61      |
| Running Reverse KL  | 4.29      |
| Running Update Time | 1367      |
-----------------------------------
--2024-08-13 13:32:49.720242 UTC---
| Itration            | 1368      |
| PAGAR Loss          | -2.67e+05 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -1.68e+05 |
| Running Env Steps   | 6840000   |
| Running Forward KL  | 5.68      |
| Running Reverse KL  | 4.19      |
| Running Update Time | 1368      |
-----------------------------------
--2024-08-13 13:34:37.363563 UTC---
| Itration            | 1369      |
| PAGAR Loss          | 7.5e+04   |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -1.44e+06 |
| Running Env Steps   | 6845000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 41.6      |
| Running Update Time | 1369      |
-----------------------------------
--2024-08-13 13:36:13.639018 UTC---
| Itration            | 1370      |
| PAGAR Loss          | -7.99e+05 |
| Real Det Return     | 1.21e+03  |
| Real Sto Return     | 3.09e+03  |
| Reward Loss         | -5.51e+06 |
| Running Env Steps   | 6850000   |
| Running Forward KL  | 7.97      |
| Running Reverse KL  | 196       |
| Running Update Time | 1370      |
-----------------------------------
--2024-08-13 13:37:58.981072 UTC---
| Itration            | 1371      |
| PAGAR Loss          | -8.35e+06 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 4.34e+03  |
| Reward Loss         | -2.47e+06 |
| Running Env Steps   | 6855000   |
| Running Forward KL  | 6.42      |
| Running Reverse KL  | 68.4      |
| Running Update Time | 1371      |
-----------------------------------
--2024-08-13 13:39:46.280112 UTC---
| Itration            | 1372      |
| PAGAR Loss          | -3.33e+05 |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -6.1e+04  |
| Running Env Steps   | 6860000   |
| Running Forward KL  | 6.25      |
| Running Reverse KL  | 5.51      |
| Running Update Time | 1372      |
-----------------------------------
--2024-08-13 13:41:32.945288 UTC---
| Itration            | 1373      |
| PAGAR Loss          | 3.81e+08  |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.7e+03   |
| Reward Loss         | -2.21e+06 |
| Running Env Steps   | 6865000   |
| Running Forward KL  | 6.27      |
| Running Reverse KL  | 68.2      |
| Running Update Time | 1373      |
-----------------------------------
--2024-08-13 13:43:21.065395 UTC---
| Itration            | 1374      |
| PAGAR Loss          | -5.6e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -8.68e+05 |
| Running Env Steps   | 6870000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 1374      |
-----------------------------------
--2024-08-13 13:45:31.867279 UTC---
| Itration            | 1375      |
| PAGAR Loss          | -3.43e+04 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | -9.19e+04 |
| Running Env Steps   | 6875000   |
| Running Forward KL  | 6.37      |
| Running Reverse KL  | 5.26      |
| Running Update Time | 1375      |
-----------------------------------
--2024-08-13 13:47:18.821680 UTC---
| Itration            | 1376      |
| PAGAR Loss          | -3.75e+05 |
| Real Det Return     | 5.63e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -1.46e+05 |
| Running Env Steps   | 6880000   |
| Running Forward KL  | 6.1       |
| Running Reverse KL  | 30.2      |
| Running Update Time | 1376      |
-----------------------------------
--2024-08-13 13:49:06.954883 UTC---
| Itration            | 1377      |
| PAGAR Loss          | -6.22e+04 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -4.84e+05 |
| Running Env Steps   | 6885000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 3.79      |
| Running Update Time | 1377      |
-----------------------------------
--2024-08-13 13:50:55.196369 UTC---
| Itration            | 1378      |
| PAGAR Loss          | -5.18e+04 |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -4.03e+05 |
| Running Env Steps   | 6890000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 4.25      |
| Running Update Time | 1378      |
-----------------------------------
--2024-08-13 13:52:43.351544 UTC--
| Itration            | 1379     |
| PAGAR Loss          | 4.5e+05  |
| Real Det Return     | 5.57e+03 |
| Real Sto Return     | 5.48e+03 |
| Reward Loss         | 1.03e+05 |
| Running Env Steps   | 6895000  |
| Running Forward KL  | 6.18     |
| Running Reverse KL  | 4.94     |
| Running Update Time | 1379     |
----------------------------------
--2024-08-13 13:54:30.528153 UTC--
| Itration            | 1380     |
| PAGAR Loss          | -6.5e+04 |
| Real Det Return     | 5.53e+03 |
| Real Sto Return     | 4.97e+03 |
| Reward Loss         | -1.1e+06 |
| Running Env Steps   | 6900000  |
| Running Forward KL  | 6.01     |
| Running Reverse KL  | 66.5     |
| Running Update Time | 1380     |
----------------------------------
--2024-08-13 13:56:18.250969 UTC---
| Itration            | 1381      |
| PAGAR Loss          | -1.49e+05 |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -5.11e+05 |
| Running Env Steps   | 6905000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 40.7      |
| Running Update Time | 1381      |
-----------------------------------
--2024-08-13 13:58:06.380112 UTC---
| Itration            | 1382      |
| PAGAR Loss          | -9e+04    |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -5.54e+05 |
| Running Env Steps   | 6910000   |
| Running Forward KL  | 5.44      |
| Running Reverse KL  | 4.33      |
| Running Update Time | 1382      |
-----------------------------------
--2024-08-13 13:59:54.027782 UTC---
| Itration            | 1383      |
| PAGAR Loss          | -3.81e+05 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | 9.77e+05  |
| Running Env Steps   | 6915000   |
| Running Forward KL  | 6.41      |
| Running Reverse KL  | 35.4      |
| Running Update Time | 1383      |
-----------------------------------
--2024-08-13 14:01:42.155477 UTC---
| Itration            | 1384      |
| PAGAR Loss          | -4.44e+04 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -5.25e+05 |
| Running Env Steps   | 6920000   |
| Running Forward KL  | 5.69      |
| Running Reverse KL  | 4.81      |
| Running Update Time | 1384      |
-----------------------------------
--2024-08-13 14:03:29.430233 UTC---
| Itration            | 1385      |
| PAGAR Loss          | -2.72e+05 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 4.96e+03  |
| Reward Loss         | -1.1e+04  |
| Running Env Steps   | 6925000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 4.86      |
| Running Update Time | 1385      |
-----------------------------------
--2024-08-13 14:05:17.697119 UTC---
| Itration            | 1386      |
| PAGAR Loss          | -6.38e+03 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -6.63e+05 |
| Running Env Steps   | 6930000   |
| Running Forward KL  | 5.1       |
| Running Reverse KL  | 3.25      |
| Running Update Time | 1386      |
-----------------------------------
--2024-08-13 14:07:05.945644 UTC---
| Itration            | 1387      |
| PAGAR Loss          | -2.6e+05  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 6935000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 1387      |
-----------------------------------
--2024-08-13 14:08:54.027834 UTC---
| Itration            | 1388      |
| PAGAR Loss          | -2.89e+05 |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.4e+03   |
| Reward Loss         | -2.67e+05 |
| Running Env Steps   | 6940000   |
| Running Forward KL  | 6.6       |
| Running Reverse KL  | 5.54      |
| Running Update Time | 1388      |
-----------------------------------
--2024-08-13 14:10:42.313701 UTC---
| Itration            | 1389      |
| PAGAR Loss          | -1.67e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -7.85e+05 |
| Running Env Steps   | 6945000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 4.18      |
| Running Update Time | 1389      |
-----------------------------------
--2024-08-13 14:12:30.175513 UTC---
| Itration            | 1390      |
| PAGAR Loss          | -1.18e+06 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -1.05e+06 |
| Running Env Steps   | 6950000   |
| Running Forward KL  | 5.52      |
| Running Reverse KL  | 20.2      |
| Running Update Time | 1390      |
-----------------------------------
--2024-08-13 14:14:18.431575 UTC---
| Itration            | 1391      |
| PAGAR Loss          | -6.18e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -5.88e+05 |
| Running Env Steps   | 6955000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.73      |
| Running Update Time | 1391      |
-----------------------------------
--2024-08-13 14:16:06.569224 UTC---
| Itration            | 1392      |
| PAGAR Loss          | 1.73e+06  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.2e+03   |
| Reward Loss         | -9.24e+05 |
| Running Env Steps   | 6960000   |
| Running Forward KL  | 5.72      |
| Running Reverse KL  | 4.87      |
| Running Update Time | 1392      |
-----------------------------------
--2024-08-13 14:17:54.819373 UTC---
| Itration            | 1393      |
| PAGAR Loss          | -1.96e+04 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -4.23e+05 |
| Running Env Steps   | 6965000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 3.66      |
| Running Update Time | 1393      |
-----------------------------------
--2024-08-13 14:19:43.086522 UTC---
| Itration            | 1394      |
| PAGAR Loss          | 9.16e+04  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -4.41e+05 |
| Running Env Steps   | 6970000   |
| Running Forward KL  | 6.06      |
| Running Reverse KL  | 4.34      |
| Running Update Time | 1394      |
-----------------------------------
--2024-08-13 14:21:31.260890 UTC---
| Itration            | 1395      |
| PAGAR Loss          | -1.22e+06 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -4.88e+05 |
| Running Env Steps   | 6975000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.82      |
| Running Update Time | 1395      |
-----------------------------------
--2024-08-13 14:23:19.631637 UTC---
| Itration            | 1396      |
| PAGAR Loss          | 2.78e+05  |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -2.49e+05 |
| Running Env Steps   | 6980000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 4.81      |
| Running Update Time | 1396      |
-----------------------------------
--2024-08-13 14:25:16.668579 UTC---
| Itration            | 1397      |
| PAGAR Loss          | -4.96e+04 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.12e+03  |
| Reward Loss         | -2.98e+05 |
| Running Env Steps   | 6985000   |
| Running Forward KL  | 5.8       |
| Running Reverse KL  | 4.91      |
| Running Update Time | 1397      |
-----------------------------------
--2024-08-13 14:27:30.692851 UTC---
| Itration            | 1398      |
| PAGAR Loss          | 4.32e+05  |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -9.47e+05 |
| Running Env Steps   | 6990000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 4.64      |
| Running Update Time | 1398      |
-----------------------------------
--2024-08-13 14:29:50.068490 UTC---
| Itration            | 1399      |
| PAGAR Loss          | 5.91e+04  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | -3.19e+05 |
| Running Env Steps   | 6995000   |
| Running Forward KL  | 5.82      |
| Running Reverse KL  | 4.27      |
| Running Update Time | 1399      |
-----------------------------------
--2024-08-13 14:32:07.321084 UTC---
| Itration            | 1400      |
| PAGAR Loss          | -4.03e+03 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -6.09e+05 |
| Running Env Steps   | 7000000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 4.42      |
| Running Update Time | 1400      |
-----------------------------------
--2024-08-13 14:34:21.551926 UTC---
| Itration            | 1401      |
| PAGAR Loss          | 1.72e+04  |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -6.04e+05 |
| Running Env Steps   | 7005000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 4.14      |
| Running Update Time | 1401      |
-----------------------------------
--2024-08-13 14:36:08.777101 UTC---
| Itration            | 1402      |
| PAGAR Loss          | -1.57e+05 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.63e+03  |
| Reward Loss         | -2.5e+06  |
| Running Env Steps   | 7010000   |
| Running Forward KL  | 6.12      |
| Running Reverse KL  | 80.1      |
| Running Update Time | 1402      |
-----------------------------------
--2024-08-13 14:37:56.986431 UTC---
| Itration            | 1403      |
| PAGAR Loss          | -1.91e+05 |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.9e+05  |
| Running Env Steps   | 7015000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 4.51      |
| Running Update Time | 1403      |
-----------------------------------
--2024-08-13 14:39:44.572879 UTC---
| Itration            | 1404      |
| PAGAR Loss          | -3.46e+04 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.45e+03  |
| Reward Loss         | 1.31e+04  |
| Running Env Steps   | 7020000   |
| Running Forward KL  | 6.01      |
| Running Reverse KL  | 4.9       |
| Running Update Time | 1404      |
-----------------------------------
--2024-08-13 14:41:32.460358 UTC---
| Itration            | 1405      |
| PAGAR Loss          | -5.87e+05 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.47e+03  |
| Reward Loss         | -4.14e+05 |
| Running Env Steps   | 7025000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 5.92      |
| Running Update Time | 1405      |
-----------------------------------
--2024-08-13 14:43:20.613426 UTC---
| Itration            | 1406      |
| PAGAR Loss          | -8.26e+04 |
| Real Det Return     | 5.58e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -1.36e+05 |
| Running Env Steps   | 7030000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 4.72      |
| Running Update Time | 1406      |
-----------------------------------
--2024-08-13 14:45:08.894805 UTC---
| Itration            | 1407      |
| PAGAR Loss          | -7.38e+04 |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 5.13e+03  |
| Reward Loss         | -1.39e+06 |
| Running Env Steps   | 7035000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 3.92      |
| Running Update Time | 1407      |
-----------------------------------
--2024-08-13 14:46:56.841909 UTC---
| Itration            | 1408      |
| PAGAR Loss          | -8.63e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.28e+03  |
| Reward Loss         | -1.08e+06 |
| Running Env Steps   | 7040000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 14.1      |
| Running Update Time | 1408      |
-----------------------------------
--2024-08-13 14:48:45.243604 UTC---
| Itration            | 1409      |
| PAGAR Loss          | -8.85e+04 |
| Real Det Return     | 5.62e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | 2.33e+05  |
| Running Env Steps   | 7045000   |
| Running Forward KL  | 5.79      |
| Running Reverse KL  | 4.36      |
| Running Update Time | 1409      |
-----------------------------------
--2024-08-13 14:50:33.472077 UTC---
| Itration            | 1410      |
| PAGAR Loss          | -2.09e+04 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.43e+03  |
| Reward Loss         | 7.72e+03  |
| Running Env Steps   | 7050000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 4.53      |
| Running Update Time | 1410      |
-----------------------------------
--2024-08-13 14:52:21.656993 UTC---
| Itration            | 1411      |
| PAGAR Loss          | 6.01e+05  |
| Real Det Return     | 5.36e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -1.03e+06 |
| Running Env Steps   | 7055000   |
| Running Forward KL  | 5.47      |
| Running Reverse KL  | 4.06      |
| Running Update Time | 1411      |
-----------------------------------
--2024-08-13 14:54:09.739483 UTC--
| Itration            | 1412     |
| PAGAR Loss          | 1.91e+07 |
| Real Det Return     | 5.58e+03 |
| Real Sto Return     | 5.5e+03  |
| Reward Loss         | 4.03e+04 |
| Running Env Steps   | 7060000  |
| Running Forward KL  | 6.11     |
| Running Reverse KL  | 5.35     |
| Running Update Time | 1412     |
----------------------------------
--2024-08-13 14:55:57.862054 UTC---
| Itration            | 1413      |
| PAGAR Loss          | -3.55e+05 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -2.92e+05 |
| Running Env Steps   | 7065000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 4.54      |
| Running Update Time | 1413      |
-----------------------------------
--2024-08-13 14:57:46.024344 UTC---
| Itration            | 1414      |
| PAGAR Loss          | -1.87e+05 |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -6.93e+05 |
| Running Env Steps   | 7070000   |
| Running Forward KL  | 5.55      |
| Running Reverse KL  | 3.88      |
| Running Update Time | 1414      |
-----------------------------------
--2024-08-13 14:59:34.265213 UTC---
| Itration            | 1415      |
| PAGAR Loss          | -2.89e+05 |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.37e+03  |
| Reward Loss         | -2.41e+05 |
| Running Env Steps   | 7075000   |
| Running Forward KL  | 6.15      |
| Running Reverse KL  | 4.52      |
| Running Update Time | 1415      |
-----------------------------------
--2024-08-13 15:01:22.112615 UTC---
| Itration            | 1416      |
| PAGAR Loss          | -1.2e+05  |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.16e+03  |
| Reward Loss         | -3.97e+05 |
| Running Env Steps   | 7080000   |
| Running Forward KL  | 5.64      |
| Running Reverse KL  | 3.91      |
| Running Update Time | 1416      |
-----------------------------------
--2024-08-13 15:03:10.547354 UTC--
| Itration            | 1417     |
| PAGAR Loss          | -1e+05   |
| Real Det Return     | 5.62e+03 |
| Real Sto Return     | 5.53e+03 |
| Reward Loss         | 1.83e+05 |
| Running Env Steps   | 7085000  |
| Running Forward KL  | 5.99     |
| Running Reverse KL  | 4.51     |
| Running Update Time | 1417     |
----------------------------------
--2024-08-13 15:04:58.008307 UTC---
| Itration            | 1418      |
| PAGAR Loss          | 3.6e+04   |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.23e+03  |
| Reward Loss         | -4.91e+05 |
| Running Env Steps   | 7090000   |
| Running Forward KL  | 5.74      |
| Running Reverse KL  | 4.83      |
| Running Update Time | 1418      |
-----------------------------------
--2024-08-13 15:06:46.323139 UTC---
| Itration            | 1419      |
| PAGAR Loss          | 1.25e+05  |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -5.83e+05 |
| Running Env Steps   | 7095000   |
| Running Forward KL  | 5.77      |
| Running Reverse KL  | 3.84      |
| Running Update Time | 1419      |
-----------------------------------
--2024-08-13 15:08:34.652931 UTC---
| Itration            | 1420      |
| PAGAR Loss          | 6.77e+04  |
| Real Det Return     | 5.46e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -4.07e+05 |
| Running Env Steps   | 7100000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.26      |
| Running Update Time | 1420      |
-----------------------------------
--2024-08-13 15:10:22.293622 UTC---
| Itration            | 1421      |
| PAGAR Loss          | 3.51e+04  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.05e+03  |
| Reward Loss         | -1.68e+06 |
| Running Env Steps   | 7105000   |
| Running Forward KL  | 5.94      |
| Running Reverse KL  | 36.9      |
| Running Update Time | 1421      |
-----------------------------------
--2024-08-13 15:12:10.624549 UTC---
| Itration            | 1422      |
| PAGAR Loss          | 7.64e+04  |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.42e+03  |
| Reward Loss         | -3.97e+05 |
| Running Env Steps   | 7110000   |
| Running Forward KL  | 6.09      |
| Running Reverse KL  | 4.56      |
| Running Update Time | 1422      |
-----------------------------------
--2024-08-13 15:13:58.623374 UTC---
| Itration            | 1423      |
| PAGAR Loss          | 2.53e+05  |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 5.17e+03  |
| Reward Loss         | -8.88e+05 |
| Running Env Steps   | 7115000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 4.41      |
| Running Update Time | 1423      |
-----------------------------------
--2024-08-13 15:15:46.799246 UTC---
| Itration            | 1424      |
| PAGAR Loss          | -1.65e+05 |
| Real Det Return     | 5.56e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.55e+05 |
| Running Env Steps   | 7120000   |
| Running Forward KL  | 5.59      |
| Running Reverse KL  | 4.28      |
| Running Update Time | 1424      |
-----------------------------------
--2024-08-13 15:17:34.655348 UTC---
| Itration            | 1425      |
| PAGAR Loss          | 2.35e+05  |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -6.05e+05 |
| Running Env Steps   | 7125000   |
| Running Forward KL  | 5.9       |
| Running Reverse KL  | 4.49      |
| Running Update Time | 1425      |
-----------------------------------
--2024-08-13 15:19:22.581959 UTC---
| Itration            | 1426      |
| PAGAR Loss          | -4.71e+04 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.24e+03  |
| Reward Loss         | -4.11e+05 |
| Running Env Steps   | 7130000   |
| Running Forward KL  | 5.54      |
| Running Reverse KL  | 4         |
| Running Update Time | 1426      |
-----------------------------------
--2024-08-13 15:21:10.874499 UTC---
| Itration            | 1427      |
| PAGAR Loss          | -1.43e+05 |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -5.16e+05 |
| Running Env Steps   | 7135000   |
| Running Forward KL  | 5.41      |
| Running Reverse KL  | 3.86      |
| Running Update Time | 1427      |
-----------------------------------
--2024-08-13 15:22:59.139091 UTC--
| Itration            | 1428     |
| PAGAR Loss          | 8.07e+04 |
| Real Det Return     | 5.46e+03 |
| Real Sto Return     | 5.31e+03 |
| Reward Loss         | -7.7e+05 |
| Running Env Steps   | 7140000  |
| Running Forward KL  | 5.83     |
| Running Reverse KL  | 4.2      |
| Running Update Time | 1428     |
----------------------------------
--2024-08-13 15:24:47.446635 UTC---
| Itration            | 1429      |
| PAGAR Loss          | 1.46e+05  |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -3.41e+05 |
| Running Env Steps   | 7145000   |
| Running Forward KL  | 5.96      |
| Running Reverse KL  | 4.65      |
| Running Update Time | 1429      |
-----------------------------------
--2024-08-13 15:26:35.755617 UTC---
| Itration            | 1430      |
| PAGAR Loss          | -6.08e+03 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.35e+03  |
| Reward Loss         | -3.97e+05 |
| Running Env Steps   | 7150000   |
| Running Forward KL  | 5.51      |
| Running Reverse KL  | 4.28      |
| Running Update Time | 1430      |
-----------------------------------
--2024-08-13 15:28:21.583333 UTC---
| Itration            | 1431      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.28e+03  |
| Reward Loss         | -1.11e+07 |
| Running Env Steps   | 7155000   |
| Running Forward KL  | 6.97      |
| Running Reverse KL  | 122       |
| Running Update Time | 1431      |
-----------------------------------
--2024-08-13 15:30:00.253267 UTC--
| Itration            | 1432     |
| PAGAR Loss          | nan      |
| Real Det Return     | 4.12e+03 |
| Real Sto Return     | 1.95e+03 |
| Reward Loss         | -1.8e+07 |
| Running Env Steps   | 7160000  |
| Running Forward KL  | 7.15     |
| Running Reverse KL  | 164      |
| Running Update Time | 1432     |
----------------------------------
--2024-08-13 15:31:46.856774 UTC---
| Itration            | 1433      |
| PAGAR Loss          | -2e+05    |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 4.86e+03  |
| Reward Loss         | -5.34e+06 |
| Running Env Steps   | 7165000   |
| Running Forward KL  | 6.44      |
| Running Reverse KL  | 104       |
| Running Update Time | 1433      |
-----------------------------------
--2024-08-13 15:33:34.961178 UTC---
| Itration            | 1434      |
| PAGAR Loss          | 4.48e+05  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.26e+03  |
| Reward Loss         | -6.17e+05 |
| Running Env Steps   | 7170000   |
| Running Forward KL  | 5.93      |
| Running Reverse KL  | 4.75      |
| Running Update Time | 1434      |
-----------------------------------
--2024-08-13 15:35:22.964935 UTC---
| Itration            | 1435      |
| PAGAR Loss          | -1.22e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5e+03     |
| Reward Loss         | -9.19e+05 |
| Running Env Steps   | 7175000   |
| Running Forward KL  | 5.76      |
| Running Reverse KL  | 4.74      |
| Running Update Time | 1435      |
-----------------------------------
--2024-08-13 15:37:11.493066 UTC---
| Itration            | 1436      |
| PAGAR Loss          | 6.11e+04  |
| Real Det Return     | 5.39e+03  |
| Real Sto Return     | 5.29e+03  |
| Reward Loss         | -5.69e+05 |
| Running Env Steps   | 7180000   |
| Running Forward KL  | 5.98      |
| Running Reverse KL  | 4.61      |
| Running Update Time | 1436      |
-----------------------------------
--2024-08-13 15:38:59.521822 UTC---
| Itration            | 1437      |
| PAGAR Loss          | -2.92e+04 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -3.3e+05  |
| Running Env Steps   | 7185000   |
| Running Forward KL  | 6.58      |
| Running Reverse KL  | 5.31      |
| Running Update Time | 1437      |
-----------------------------------
--2024-08-13 15:40:47.885683 UTC---
| Itration            | 1438      |
| PAGAR Loss          | -2.25e+05 |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 5.27e+03  |
| Reward Loss         | -7.56e+05 |
| Running Env Steps   | 7190000   |
| Running Forward KL  | 5.49      |
| Running Reverse KL  | 4.15      |
| Running Update Time | 1438      |
-----------------------------------
--2024-08-13 15:42:36.280534 UTC---
| Itration            | 1439      |
| PAGAR Loss          | -1.61e+05 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -7.64e+05 |
| Running Env Steps   | 7195000   |
| Running Forward KL  | 6.3       |
| Running Reverse KL  | 4.61      |
| Running Update Time | 1439      |
-----------------------------------
--2024-08-13 15:44:24.744909 UTC---
| Itration            | 1440      |
| PAGAR Loss          | -2.27e+05 |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -7.7e+05  |
| Running Env Steps   | 7200000   |
| Running Forward KL  | 6.03      |
| Running Reverse KL  | 4.55      |
| Running Update Time | 1440      |
-----------------------------------
--2024-08-13 15:46:13.023590 UTC---
| Itration            | 1441      |
| PAGAR Loss          | -1.85e+04 |
| Real Det Return     | 5.6e+03   |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -2.69e+06 |
| Running Env Steps   | 7205000   |
| Running Forward KL  | 6.94      |
| Running Reverse KL  | 42.1      |
| Running Update Time | 1441      |
-----------------------------------
--2024-08-13 15:48:00.189672 UTC---
| Itration            | 1442      |
| PAGAR Loss          | 1.13e+08  |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 7210000   |
| Running Forward KL  | 5.66      |
| Running Reverse KL  | 3.82      |
| Running Update Time | 1442      |
-----------------------------------
--2024-08-13 15:49:26.270574 UTC---
| Itration            | 1443      |
| PAGAR Loss          | -2.61e+07 |
| Real Det Return     | -11.6     |
| Real Sto Return     | -11.4     |
| Reward Loss         | -3.55e+07 |
| Running Env Steps   | 7215000   |
| Running Forward KL  | 31.5      |
| Running Reverse KL  | 403       |
| Running Update Time | 1443      |
-----------------------------------
--2024-08-13 15:50:59.831937 UTC---
| Itration            | 1444      |
| PAGAR Loss          | -1.35e+07 |
| Real Det Return     | 979       |
| Real Sto Return     | 695       |
| Reward Loss         | -2.38e+07 |
| Running Env Steps   | 7220000   |
| Running Forward KL  | 13.5      |
| Running Reverse KL  | 217       |
| Running Update Time | 1444      |
-----------------------------------
--2024-08-13 15:52:29.679656 UTC---
| Itration            | 1445      |
| PAGAR Loss          | nan       |
| Real Det Return     | 24.6      |
| Real Sto Return     | 963       |
| Reward Loss         | -2.16e+07 |
| Running Env Steps   | 7225000   |
| Running Forward KL  | 10.3      |
| Running Reverse KL  | 235       |
| Running Update Time | 1445      |
-----------------------------------
--2024-08-13 15:54:16.027344 UTC---
| Itration            | 1446      |
| PAGAR Loss          | -4.87e+05 |
| Real Det Return     | 5.59e+03  |
| Real Sto Return     | 4.97e+03  |
| Reward Loss         | -1.86e+06 |
| Running Env Steps   | 7230000   |
| Running Forward KL  | 6.66      |
| Running Reverse KL  | 40.9      |
| Running Update Time | 1446      |
-----------------------------------
--2024-08-13 15:56:03.202652 UTC---
| Itration            | 1447      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.34e+03  |
| Real Sto Return     | 4.76e+03  |
| Reward Loss         | -2.62e+06 |
| Running Env Steps   | 7235000   |
| Running Forward KL  | 6.18      |
| Running Reverse KL  | 48.6      |
| Running Update Time | 1447      |
-----------------------------------
--2024-08-13 15:57:51.249356 UTC--
| Itration            | 1448     |
| PAGAR Loss          | 5.04e+06 |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.47e+03 |
| Reward Loss         | 9.01e+04 |
| Running Env Steps   | 7240000  |
| Running Forward KL  | 6.36     |
| Running Reverse KL  | 5.05     |
| Running Update Time | 1448     |
----------------------------------
--2024-08-13 15:59:37.707778 UTC---
| Itration            | 1449      |
| PAGAR Loss          | -8.67e+06 |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.58e+03  |
| Reward Loss         | -5.01e+06 |
| Running Env Steps   | 7245000   |
| Running Forward KL  | 7.16      |
| Running Reverse KL  | 109       |
| Running Update Time | 1449      |
-----------------------------------
--2024-08-13 16:01:25.648360 UTC---
| Itration            | 1450      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.02e+03  |
| Reward Loss         | -8.48e+05 |
| Running Env Steps   | 7250000   |
| Running Forward KL  | 5.89      |
| Running Reverse KL  | 3.95      |
| Running Update Time | 1450      |
-----------------------------------
--2024-08-13 16:03:13.761324 UTC---
| Itration            | 1451      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.44e+03  |
| Reward Loss         | -1.69e+06 |
| Running Env Steps   | 7255000   |
| Running Forward KL  | 6.89      |
| Running Reverse KL  | 44.2      |
| Running Update Time | 1451      |
-----------------------------------
--2024-08-13 16:05:02.299285 UTC--
| Itration            | 1452     |
| PAGAR Loss          | 1.22e+06 |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.56e+03 |
| Reward Loss         | 4.17e+05 |
| Running Env Steps   | 7260000  |
| Running Forward KL  | 7.1      |
| Running Reverse KL  | 5.61     |
| Running Update Time | 1452     |
----------------------------------
--2024-08-13 16:06:50.714437 UTC--
| Itration            | 1453     |
| PAGAR Loss          | 1.58e+05 |
| Real Det Return     | 5.51e+03 |
| Real Sto Return     | 5.33e+03 |
| Reward Loss         | -2.6e+05 |
| Running Env Steps   | 7265000  |
| Running Forward KL  | 6.05     |
| Running Reverse KL  | 4.78     |
| Running Update Time | 1453     |
----------------------------------
--2024-08-13 16:08:38.291108 UTC---
| Itration            | 1454      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.06e+03  |
| Reward Loss         | -2.65e+05 |
| Running Env Steps   | 7270000   |
| Running Forward KL  | 6.2       |
| Running Reverse KL  | 5.03      |
| Running Update Time | 1454      |
-----------------------------------
--2024-08-13 16:10:26.743102 UTC---
| Itration            | 1455      |
| PAGAR Loss          | 1.19e+06  |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -3.59e+05 |
| Running Env Steps   | 7275000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 5.14      |
| Running Update Time | 1455      |
-----------------------------------
--2024-08-13 16:12:15.076789 UTC--
| Itration            | 1456     |
| PAGAR Loss          | 3.12e+05 |
| Real Det Return     | 5.55e+03 |
| Real Sto Return     | 5.43e+03 |
| Reward Loss         | 1.33e+05 |
| Running Env Steps   | 7280000  |
| Running Forward KL  | 6.95     |
| Running Reverse KL  | 6.53     |
| Running Update Time | 1456     |
----------------------------------
--2024-08-13 16:14:00.939509 UTC---
| Itration            | 1457      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.31e+03  |
| Reward Loss         | -6.87e+05 |
| Running Env Steps   | 7285000   |
| Running Forward KL  | 6.23      |
| Running Reverse KL  | 4.69      |
| Running Update Time | 1457      |
-----------------------------------
--2024-08-13 16:15:48.834029 UTC--
| Itration            | 1458     |
| PAGAR Loss          | 9.16e+05 |
| Real Det Return     | 5.39e+03 |
| Real Sto Return     | 5.44e+03 |
| Reward Loss         | 2.94e+04 |
| Running Env Steps   | 7290000  |
| Running Forward KL  | 6.43     |
| Running Reverse KL  | 5.07     |
| Running Update Time | 1458     |
----------------------------------
--2024-08-13 16:17:36.228461 UTC---
| Itration            | 1459      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.18e+03  |
| Reward Loss         | -1.57e+05 |
| Running Env Steps   | 7295000   |
| Running Forward KL  | 6.28      |
| Running Reverse KL  | 5.12      |
| Running Update Time | 1459      |
-----------------------------------
--2024-08-13 16:19:24.865668 UTC--
| Itration            | 1460     |
| PAGAR Loss          | 6.28e+05 |
| Real Det Return     | 5.64e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 5.86e+03 |
| Running Env Steps   | 7300000  |
| Running Forward KL  | 6.43     |
| Running Reverse KL  | 4.93     |
| Running Update Time | 1460     |
----------------------------------
--2024-08-13 16:21:13.335407 UTC---
| Itration            | 1461      |
| PAGAR Loss          | -8.75e+04 |
| Real Det Return     | 5.66e+03  |
| Real Sto Return     | 5.53e+03  |
| Reward Loss         | 2.27e+05  |
| Running Env Steps   | 7305000   |
| Running Forward KL  | 6.48      |
| Running Reverse KL  | 5.63      |
| Running Update Time | 1461      |
-----------------------------------
--2024-08-13 16:23:04.460392 UTC---
| Itration            | 1462      |
| PAGAR Loss          | -1.69e+04 |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.51e+03  |
| Reward Loss         | -2.79e+05 |
| Running Env Steps   | 7310000   |
| Running Forward KL  | 6.29      |
| Running Reverse KL  | 31.9      |
| Running Update Time | 1462      |
-----------------------------------
--2024-08-13 16:25:10.321088 UTC---
| Itration            | 1463      |
| PAGAR Loss          | 7.31e+07  |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.34e+03  |
| Reward Loss         | -5.64e+05 |
| Running Env Steps   | 7315000   |
| Running Forward KL  | 5.92      |
| Running Reverse KL  | 4.78      |
| Running Update Time | 1463      |
-----------------------------------
--2024-08-13 16:27:14.790964 UTC---
| Itration            | 1464      |
| PAGAR Loss          | -2.07e+05 |
| Real Det Return     | 5.76e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -6.6e+04  |
| Running Env Steps   | 7320000   |
| Running Forward KL  | 6.63      |
| Running Reverse KL  | 31.3      |
| Running Update Time | 1464      |
-----------------------------------
--2024-08-13 16:29:02.899169 UTC---
| Itration            | 1465      |
| PAGAR Loss          | 2.08e+04  |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.41e+03  |
| Reward Loss         | -6.88e+05 |
| Running Env Steps   | 7325000   |
| Running Forward KL  | 6.04      |
| Running Reverse KL  | 20.9      |
| Running Update Time | 1465      |
-----------------------------------
--2024-08-13 16:30:49.501845 UTC---
| Itration            | 1466      |
| PAGAR Loss          | -3.91e+05 |
| Real Det Return     | 5.43e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -2.77e+05 |
| Running Env Steps   | 7330000   |
| Running Forward KL  | 6.64      |
| Running Reverse KL  | 5.25      |
| Running Update Time | 1466      |
-----------------------------------
--2024-08-13 16:32:36.936597 UTC--
| Itration            | 1467     |
| PAGAR Loss          | 7.07e+05 |
| Real Det Return     | 5.65e+03 |
| Real Sto Return     | 5.52e+03 |
| Reward Loss         | 2.54e+05 |
| Running Env Steps   | 7335000  |
| Running Forward KL  | 6.54     |
| Running Reverse KL  | 5.39     |
| Running Update Time | 1467     |
----------------------------------
--2024-08-13 16:34:25.148535 UTC---
| Itration            | 1468      |
| PAGAR Loss          | 1.33e+05  |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.38e+03  |
| Reward Loss         | -1.32e+04 |
| Running Env Steps   | 7340000   |
| Running Forward KL  | 6.11      |
| Running Reverse KL  | 5.5       |
| Running Update Time | 1468      |
-----------------------------------
--2024-08-13 16:36:13.230002 UTC---
| Itration            | 1469      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.61e+03  |
| Real Sto Return     | 5.49e+03  |
| Reward Loss         | -4.64e+06 |
| Running Env Steps   | 7345000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 38        |
| Running Update Time | 1469      |
-----------------------------------
--2024-08-13 16:38:00.374858 UTC--
| Itration            | 1470     |
| PAGAR Loss          | nan      |
| Real Det Return     | 5.71e+03 |
| Real Sto Return     | 5.21e+03 |
| Reward Loss         | -1.4e+06 |
| Running Env Steps   | 7350000  |
| Running Forward KL  | 7.08     |
| Running Reverse KL  | 29.4     |
| Running Update Time | 1470     |
----------------------------------
--2024-08-13 16:39:44.960307 UTC---
| Itration            | 1471      |
| PAGAR Loss          | 1.7e+06   |
| Real Det Return     | 5.41e+03  |
| Real Sto Return     | 4.91e+03  |
| Reward Loss         | -7.05e+05 |
| Running Env Steps   | 7355000   |
| Running Forward KL  | 6.65      |
| Running Reverse KL  | 31.4      |
| Running Update Time | 1471      |
-----------------------------------
--2024-08-13 16:41:27.680131 UTC---
| Itration            | 1472      |
| PAGAR Loss          | 1.51e+07  |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 4.85e+03  |
| Reward Loss         | -3.51e+05 |
| Running Env Steps   | 7360000   |
| Running Forward KL  | 6.5       |
| Running Reverse KL  | 5         |
| Running Update Time | 1472      |
-----------------------------------
--2024-08-13 16:43:14.540576 UTC---
| Itration            | 1473      |
| PAGAR Loss          | 1.91e+05  |
| Real Det Return     | 5.47e+03  |
| Real Sto Return     | 5.32e+03  |
| Reward Loss         | -1.65e+06 |
| Running Env Steps   | 7365000   |
| Running Forward KL  | 7.72      |
| Running Reverse KL  | 59.1      |
| Running Update Time | 1473      |
-----------------------------------
--2024-08-13 16:45:03.391580 UTC---
| Itration            | 1474      |
| PAGAR Loss          | -8.63e+04 |
| Real Det Return     | 5.65e+03  |
| Real Sto Return     | 5.46e+03  |
| Reward Loss         | -6.92e+04 |
| Running Env Steps   | 7370000   |
| Running Forward KL  | 6.77      |
| Running Reverse KL  | 4.88      |
| Running Update Time | 1474      |
-----------------------------------
--2024-08-13 16:47:19.292901 UTC---
| Itration            | 1475      |
| PAGAR Loss          | -3.65e+06 |
| Real Det Return     | 4.61e+03  |
| Real Sto Return     | 5.07e+03  |
| Reward Loss         | -3.82e+04 |
| Running Env Steps   | 7375000   |
| Running Forward KL  | 6.05      |
| Running Reverse KL  | 17.1      |
| Running Update Time | 1475      |
-----------------------------------
--2024-08-13 16:49:36.160004 UTC---
| Itration            | 1476      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.48e+06 |
| Running Env Steps   | 7380000   |
| Running Forward KL  | 6.68      |
| Running Reverse KL  | 36.5      |
| Running Update Time | 1476      |
-----------------------------------
--2024-08-13 16:51:55.092797 UTC---
| Itration            | 1477      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -9.19e+05 |
| Running Env Steps   | 7385000   |
| Running Forward KL  | 6.32      |
| Running Reverse KL  | 5.03      |
| Running Update Time | 1477      |
-----------------------------------
--2024-08-13 16:53:44.042409 UTC---
| Itration            | 1478      |
| PAGAR Loss          | -1.05e+05 |
| Real Det Return     | 5.53e+03  |
| Real Sto Return     | 5.39e+03  |
| Reward Loss         | -1.88e+05 |
| Running Env Steps   | 7390000   |
| Running Forward KL  | 6.47      |
| Running Reverse KL  | 4.94      |
| Running Update Time | 1478      |
-----------------------------------
--2024-08-13 16:55:31.594557 UTC---
| Itration            | 1479      |
| PAGAR Loss          | -6.23e+04 |
| Real Det Return     | 5.48e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -5.56e+05 |
| Running Env Steps   | 7395000   |
| Running Forward KL  | 6.76      |
| Running Reverse KL  | 4.93      |
| Running Update Time | 1479      |
-----------------------------------
--2024-08-13 16:57:18.820176 UTC---
| Itration            | 1480      |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.74e+03  |
| Real Sto Return     | 4.46e+03  |
| Reward Loss         | -1.18e+06 |
| Running Env Steps   | 7400000   |
| Running Forward KL  | 6.55      |
| Running Reverse KL  | 13.3      |
| Running Update Time | 1480      |
-----------------------------------
--2024-08-13 16:59:29.548253 UTC---
| Itration            | 1481      |
| PAGAR Loss          | nan       |
| Real Det Return     | 4.95e+03  |
| Real Sto Return     | 4.98e+03  |
| Reward Loss         | -6.96e+06 |
| Running Env Steps   | 7405000   |
| Running Forward KL  | 8.37      |
| Running Reverse KL  | 146       |
| Running Update Time | 1481      |
-----------------------------------
--2024-08-13 17:01:43.712972 UTC---
| Itration            | 1482      |
| PAGAR Loss          | -5.24e+05 |
| Real Det Return     | 4.54e+03  |
| Real Sto Return     | 4.69e+03  |
| Reward Loss         | -1.35e+06 |
| Running Env Steps   | 7410000   |
| Running Forward KL  | 5.85      |
| Running Reverse KL  | 34.1      |
| Running Update Time | 1482      |
-----------------------------------
--2024-08-13 17:04:02.723188 UTC---
| Itration            | 1483      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.57e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -3.07e+05 |
| Running Env Steps   | 7415000   |
| Running Forward KL  | 7.09      |
| Running Reverse KL  | 5.37      |
| Running Update Time | 1483      |
-----------------------------------
--2024-08-13 17:06:21.727901 UTC---
| Itration            | 1484      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.52e+03  |
| Real Sto Return     | 5.1e+03   |
| Reward Loss         | -3.86e+06 |
| Running Env Steps   | 7420000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 61        |
| Running Update Time | 1484      |
-----------------------------------
--2024-08-13 17:08:40.126955 UTC---
| Itration            | 1485      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.31e+03  |
| Real Sto Return     | 4.93e+03  |
| Reward Loss         | -9.09e+05 |
| Running Env Steps   | 7425000   |
| Running Forward KL  | 6.55      |
| Running Reverse KL  | 37.6      |
| Running Update Time | 1485      |
-----------------------------------
--2024-08-13 17:10:58.030558 UTC---
| Itration            | 1486      |
| PAGAR Loss          | -1.89e+08 |
| Real Det Return     | 5.22e+03  |
| Real Sto Return     | 4.45e+03  |
| Reward Loss         | -2.21e+05 |
| Running Env Steps   | 7430000   |
| Running Forward KL  | 6.38      |
| Running Reverse KL  | 31.2      |
| Running Update Time | 1486      |
-----------------------------------
--2024-08-13 17:13:16.956412 UTC---
| Itration            | 1487      |
| PAGAR Loss          | -2.74e+05 |
| Real Det Return     | 5.55e+03  |
| Real Sto Return     | 5.04e+03  |
| Reward Loss         | -1.3e+06  |
| Running Env Steps   | 7435000   |
| Running Forward KL  | 6.69      |
| Running Reverse KL  | 30.2      |
| Running Update Time | 1487      |
-----------------------------------
--2024-08-13 17:15:16.212226 UTC---
| Itration            | 1488      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.49e+03  |
| Real Sto Return     | 4.95e+03  |
| Reward Loss         | -2.78e+06 |
| Running Env Steps   | 7440000   |
| Running Forward KL  | 6.54      |
| Running Reverse KL  | 37.3      |
| Running Update Time | 1488      |
-----------------------------------
--2024-08-13 17:17:02.775499 UTC---
| Itration            | 1489      |
| PAGAR Loss          | -2.79e+07 |
| Real Det Return     | 5.45e+03  |
| Real Sto Return     | 4.71e+03  |
| Reward Loss         | -6.54e+06 |
| Running Env Steps   | 7445000   |
| Running Forward KL  | 7.6       |
| Running Reverse KL  | 101       |
| Running Update Time | 1489      |
-----------------------------------
--2024-08-13 17:18:49.427081 UTC---
| Itration            | 1490      |
| PAGAR Loss          | -1.28e+08 |
| Real Det Return     | 5.54e+03  |
| Real Sto Return     | 5.14e+03  |
| Reward Loss         | -9.51e+05 |
| Running Env Steps   | 7450000   |
| Running Forward KL  | 7.46      |
| Running Reverse KL  | 91.5      |
| Running Update Time | 1490      |
-----------------------------------
--2024-08-13 17:20:34.642301 UTC---
| Itration            | 1491      |
| PAGAR Loss          | -1.51e+07 |
| Real Det Return     | 5.5e+03   |
| Real Sto Return     | 4.13e+03  |
| Reward Loss         | -1.78e+06 |
| Running Env Steps   | 7455000   |
| Running Forward KL  | 7         |
| Running Reverse KL  | 67.1      |
| Running Update Time | 1491      |
-----------------------------------
--2024-08-13 17:22:22.401945 UTC---
| Itration            | 1492      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.44e+03  |
| Real Sto Return     | 5.09e+03  |
| Reward Loss         | -1.43e+06 |
| Running Env Steps   | 7460000   |
| Running Forward KL  | 6.17      |
| Running Reverse KL  | 34.4      |
| Running Update Time | 1492      |
-----------------------------------
--2024-08-13 17:24:09.131228 UTC---
| Itration            | 1493      |
| PAGAR Loss          | -3.16e+04 |
| Real Det Return     | 4.56e+03  |
| Real Sto Return     | 5.48e+03  |
| Reward Loss         | -2.91e+05 |
| Running Env Steps   | 7465000   |
| Running Forward KL  | 6.53      |
| Running Reverse KL  | 5.42      |
| Running Update Time | 1493      |
-----------------------------------
--2024-08-13 17:25:56.400702 UTC--
| Itration            | 1494     |
| PAGAR Loss          | 2.93e+05 |
| Real Det Return     | 5.52e+03 |
| Real Sto Return     | 5.3e+03  |
| Reward Loss         | -7.7e+05 |
| Running Env Steps   | 7470000  |
| Running Forward KL  | 6.34     |
| Running Reverse KL  | 4.58     |
| Running Update Time | 1494     |
----------------------------------
--2024-08-13 17:27:44.579064 UTC---
| Itration            | 1495      |
| PAGAR Loss          | 6.6e+04   |
| Real Det Return     | 5.16e+03  |
| Real Sto Return     | 5.19e+03  |
| Reward Loss         | -1.59e+06 |
| Running Env Steps   | 7475000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 4.13      |
| Running Update Time | 1495      |
-----------------------------------
--2024-08-13 17:29:28.762005 UTC---
| Itration            | 1496      |
| PAGAR Loss          | -1.52e+07 |
| Real Det Return     | 5.38e+03  |
| Real Sto Return     | 5.25e+03  |
| Reward Loss         | -6.24e+05 |
| Running Env Steps   | 7480000   |
| Running Forward KL  | 6.8       |
| Running Reverse KL  | 62.5      |
| Running Update Time | 1496      |
-----------------------------------
--2024-08-13 17:31:15.047011 UTC---
| Itration            | 1497      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.51e+03  |
| Real Sto Return     | 5.36e+03  |
| Reward Loss         | -1.76e+06 |
| Running Env Steps   | 7485000   |
| Running Forward KL  | 6.26      |
| Running Reverse KL  | 34.3      |
| Running Update Time | 1497      |
-----------------------------------
--2024-08-13 17:32:59.207169 UTC---
| Itration            | 1498      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.42e+03  |
| Real Sto Return     | 5.33e+03  |
| Reward Loss         | -3.53e+06 |
| Running Env Steps   | 7490000   |
| Running Forward KL  | 6.56      |
| Running Reverse KL  | 34.3      |
| Running Update Time | 1498      |
-----------------------------------
--2024-08-13 17:34:42.831432 UTC---
| Itration            | 1499      |
| PAGAR Loss          | nan       |
| Real Det Return     | 5.37e+03  |
| Real Sto Return     | 4.87e+03  |
| Reward Loss         | -5.46e+05 |
| Running Env Steps   | 7495000   |
| Running Forward KL  | 6.13      |
| Running Reverse KL  | 4.49      |
| Running Update Time | 1499      |
-----------------------------------
