====> Epoch:   1 Average train loss: -16401.0078 Average bpd: 7.702
====> Epoch:   2 Average train loss: -14358.8730 Average bpd: 6.743
====> [eval] Epoch:   2 Average bpd: 6.577
====> [test] Epoch:   2 Average bpd: 6.583
Best val_bpd: 6.577468170478991
Best test_bpd: 6.5832316037294705
====> Epoch:   3 Average train loss: -13999.6023 Average bpd: 6.575
====> Epoch:   4 Average train loss: -13602.6297 Average bpd: 6.388
====> [eval] Epoch:   4 Average bpd: 6.176
====> [test] Epoch:   4 Average bpd: 6.181
Best val_bpd: 6.176322972465354
Best test_bpd: 6.181409312196386
====> Epoch:   5 Average train loss: -13161.2401 Average bpd: 6.181
====> Epoch:   6 Average train loss: -12795.5691 Average bpd: 6.009
====> [eval] Epoch:   6 Average bpd: 5.907
====> [test] Epoch:   6 Average bpd: 5.913
Best val_bpd: 5.907325250764808
Best test_bpd: 5.912639087985334
====> Epoch:   7 Average train loss: -12295.1969 Average bpd: 5.774
====> Epoch:   8 Average train loss: -12033.8393 Average bpd: 5.651
====> [eval] Epoch:   8 Average bpd: 5.491
====> [test] Epoch:   8 Average bpd: 5.496
Best val_bpd: 5.491061783579017
Best test_bpd: 5.495866952720175
====> Epoch:   9 Average train loss: -11617.7969 Average bpd: 5.456
====> Epoch:  10 Average train loss: -11263.7419 Average bpd: 5.290
====> [eval] Epoch:  10 Average bpd: 5.373
====> [test] Epoch:  10 Average bpd: 5.378
Best val_bpd: 5.372515150288236
Best test_bpd: 5.378270919062943
====> Epoch:  11 Average train loss: -10778.1397 Average bpd: 5.062
====> Epoch:  12 Average train loss: -10320.2065 Average bpd: 4.847
====> [eval] Epoch:  12 Average bpd: 4.698
====> [test] Epoch:  12 Average bpd: 4.705
Best val_bpd: 4.698092664360882
Best test_bpd: 4.704691018956412
====> Epoch:  13 Average train loss: -10006.7602 Average bpd: 4.699
====> Epoch:  14 Average train loss: -9780.1408 Average bpd: 4.593
====> [eval] Epoch:  14 Average bpd: 4.439
====> [test] Epoch:  14 Average bpd: 4.447
Best val_bpd: 4.438914617209352
Best test_bpd: 4.446522017964447
====> Epoch:  15 Average train loss: -9608.7206 Average bpd: 4.513
====> Epoch:  16 Average train loss: -9343.6927 Average bpd: 4.388
====> [eval] Epoch:  16 Average bpd: 4.312
====> [test] Epoch:  16 Average bpd: 4.320
Best val_bpd: 4.312324691320704
Best test_bpd: 4.320006888564287
====> Epoch:  17 Average train loss: -9174.8956 Average bpd: 4.309
====> Epoch:  18 Average train loss: -9087.1163 Average bpd: 4.268
====> [eval] Epoch:  18 Average bpd: 4.256
====> [test] Epoch:  18 Average bpd: 4.265
Best val_bpd: 4.256174957323714
Best test_bpd: 4.264637960156208
====> Epoch:  19 Average train loss: -8976.1800 Average bpd: 4.215
====> Epoch:  20 Average train loss: -8865.9205 Average bpd: 4.164
====> [eval] Epoch:  20 Average bpd: 4.220
====> [test] Epoch:  20 Average bpd: 4.228
Best val_bpd: 4.220259771724468
Best test_bpd: 4.228255603641685
====> Epoch:  21 Average train loss: -8784.2767 Average bpd: 4.125
====> Epoch:  22 Average train loss: -8703.4771 Average bpd: 4.087
====> [eval] Epoch:  22 Average bpd: 4.087
====> [test] Epoch:  22 Average bpd: 4.096
Best val_bpd: 4.087248568060809
Best test_bpd: 4.095810038580406
====> Epoch:  23 Average train loss: -8655.0071 Average bpd: 4.065
====> Epoch:  24 Average train loss: -8610.3965 Average bpd: 4.044
====> [eval] Epoch:  24 Average bpd: 4.020
====> [test] Epoch:  24 Average bpd: 4.029
Best val_bpd: 4.0198981134300915
Best test_bpd: 4.029357575033702
====> Epoch:  25 Average train loss: -8511.4200 Average bpd: 3.997
====> Epoch:  26 Average train loss: -8475.9662 Average bpd: 3.981
====> [eval] Epoch:  26 Average bpd: 3.978
====> [test] Epoch:  26 Average bpd: 3.988
Best val_bpd: 3.978265962551028
Best test_bpd: 3.987736502159483
====> Epoch:  27 Average train loss: -8427.4337 Average bpd: 3.958
====> Epoch:  28 Average train loss: -8350.6718 Average bpd: 3.922
====> [eval] Epoch:  28 Average bpd: 3.933
====> [test] Epoch:  28 Average bpd: 3.943
Best val_bpd: 3.9329678614154426
Best test_bpd: 3.9430642582266944
====> Epoch:  29 Average train loss: -8321.8626 Average bpd: 3.908
====> Epoch:  30 Average train loss: -8282.5602 Average bpd: 3.890
====> [eval] Epoch:  30 Average bpd: 3.886
====> [test] Epoch:  30 Average bpd: 3.895
Best val_bpd: 3.885519451762471
Best test_bpd: 3.8954165286636897
====> Epoch:  31 Average train loss: -8255.7617 Average bpd: 3.877
====> Epoch:  32 Average train loss: -8196.6736 Average bpd: 3.849
====> [eval] Epoch:  32 Average bpd: 3.851
====> [test] Epoch:  32 Average bpd: 3.861
Best val_bpd: 3.8514526072519564
Best test_bpd: 3.8614751306208444
====> Epoch:  33 Average train loss: -8171.1871 Average bpd: 3.837
====> Epoch:  34 Average train loss: -8132.8335 Average bpd: 3.819
====> [eval] Epoch:  34 Average bpd: 3.891
====> [test] Epoch:  34 Average bpd: 3.900
Best val_bpd: 3.8514526072519564
Best test_bpd: 3.8614751306208444
====> Epoch:  35 Average train loss: -8101.1661 Average bpd: 3.805
====> Epoch:  36 Average train loss: -8070.3045 Average bpd: 3.790
====> [eval] Epoch:  36 Average bpd: 3.822
====> [test] Epoch:  36 Average bpd: 3.832
Best val_bpd: 3.8218888860821107
Best test_bpd: 3.83192694304013
====> Epoch:  37 Average train loss: -8048.4144 Average bpd: 3.780
====> Epoch:  38 Average train loss: -8021.8122 Average bpd: 3.767
====> [eval] Epoch:  38 Average bpd: 3.781
====> [test] Epoch:  38 Average bpd: 3.791
Best val_bpd: 3.780735151026162
Best test_bpd: 3.7910141781736626
====> Epoch:  39 Average train loss: -8004.4775 Average bpd: 3.759
====> Epoch:  40 Average train loss: -7980.8943 Average bpd: 3.748
====> [eval] Epoch:  40 Average bpd: 3.797
====> [test] Epoch:  40 Average bpd: 3.808
Best val_bpd: 3.780735151026162
Best test_bpd: 3.7910141781736626
====> Epoch:  41 Average train loss: -7954.6960 Average bpd: 3.736
====> Epoch:  42 Average train loss: -7928.4139 Average bpd: 3.723
====> [eval] Epoch:  42 Average bpd: 3.778
====> [test] Epoch:  42 Average bpd: 3.788
Best val_bpd: 3.777905271568355
Best test_bpd: 3.7882579065647586
====> Epoch:  43 Average train loss: -7908.6262 Average bpd: 3.714
====> Epoch:  44 Average train loss: -7889.3851 Average bpd: 3.705
====> [eval] Epoch:  44 Average bpd: 3.745
====> [test] Epoch:  44 Average bpd: 3.755
Best val_bpd: 3.7451410541334145
Best test_bpd: 3.7552060884206684
====> Epoch:  45 Average train loss: -7870.9054 Average bpd: 3.696
====> Epoch:  46 Average train loss: -7854.0560 Average bpd: 3.688
====> [eval] Epoch:  46 Average bpd: 3.726
====> [test] Epoch:  46 Average bpd: 3.736
Best val_bpd: 3.725620534887809
Best test_bpd: 3.7357103578980753
====> Epoch:  47 Average train loss: -7826.9522 Average bpd: 3.676
====> Epoch:  48 Average train loss: -7813.9656 Average bpd: 3.670
====> [eval] Epoch:  48 Average bpd: 3.698
====> [test] Epoch:  48 Average bpd: 3.709
Best val_bpd: 3.698431235948049
Best test_bpd: 3.7088635206817573
====> Epoch:  49 Average train loss: -7795.6913 Average bpd: 3.661
====> Epoch:  50 Average train loss: -7775.9172 Average bpd: 3.652
====> [eval] Epoch:  50 Average bpd: 3.696
====> [test] Epoch:  50 Average bpd: 3.707
Best val_bpd: 3.6964814439626017
Best test_bpd: 3.7069920032980734
====> Epoch:  51 Average train loss: -7761.7909 Average bpd: 3.645
====> Epoch:  52 Average train loss: -7742.4926 Average bpd: 3.636
====> [eval] Epoch:  52 Average bpd: 3.738
====> [test] Epoch:  52 Average bpd: 3.748
Best val_bpd: 3.6964814439626017
Best test_bpd: 3.7069920032980734
====> Epoch:  53 Average train loss: -7737.5072 Average bpd: 3.634
====> Epoch:  54 Average train loss: -7716.0441 Average bpd: 3.624
====> [eval] Epoch:  54 Average bpd: 3.684
====> [test] Epoch:  54 Average bpd: 3.694
Best val_bpd: 3.683599549376777
Best test_bpd: 3.694183950951012
====> Epoch:  55 Average train loss: -7708.3004 Average bpd: 3.620
====> Epoch:  56 Average train loss: -7691.4836 Average bpd: 3.612
====> [eval] Epoch:  56 Average bpd: 3.682
====> [test] Epoch:  56 Average bpd: 3.692
Best val_bpd: 3.681576937614824
Best test_bpd: 3.6922107715020407
====> Epoch:  57 Average train loss: -7681.8833 Average bpd: 3.608
====> Epoch:  58 Average train loss: -7671.2719 Average bpd: 3.603
====> [eval] Epoch:  58 Average bpd: 3.670
====> [test] Epoch:  58 Average bpd: 3.681
Best val_bpd: 3.6703741771362366
Best test_bpd: 3.680796041317767
====> Epoch:  59 Average train loss: -7664.3566 Average bpd: 3.599
====> Epoch:  60 Average train loss: -7645.3540 Average bpd: 3.590
====> [eval] Epoch:  60 Average bpd: 3.662
====> [test] Epoch:  60 Average bpd: 3.673
Best val_bpd: 3.6620184769645765
Best test_bpd: 3.67253183373937
====> Epoch:  61 Average train loss: -7637.8359 Average bpd: 3.587
====> Epoch:  62 Average train loss: -7622.4443 Average bpd: 3.580
====> [eval] Epoch:  62 Average bpd: 3.662
====> [test] Epoch:  62 Average bpd: 3.673
Best val_bpd: 3.6620184769645765
Best test_bpd: 3.67253183373937
====> Epoch:  63 Average train loss: -7614.6919 Average bpd: 3.576
====> Epoch:  64 Average train loss: -7608.7825 Average bpd: 3.573
====> [eval] Epoch:  64 Average bpd: 3.625
====> [test] Epoch:  64 Average bpd: 3.636
Best val_bpd: 3.6251253834992183
Best test_bpd: 3.6360562800440435
====> Epoch:  65 Average train loss: -7597.7014 Average bpd: 3.568
====> Epoch:  66 Average train loss: -7585.4331 Average bpd: 3.562
====> [eval] Epoch:  66 Average bpd: 3.649
====> [test] Epoch:  66 Average bpd: 3.659
Best val_bpd: 3.6251253834992183
Best test_bpd: 3.6360562800440435
====> Epoch:  67 Average train loss: -7569.5941 Average bpd: 3.555
====> Epoch:  68 Average train loss: -7565.4622 Average bpd: 3.553
====> [eval] Epoch:  68 Average bpd: 3.625
====> [test] Epoch:  68 Average bpd: 3.636
Best val_bpd: 3.6251253834992183
Best test_bpd: 3.6360562800440435
====> Epoch:  69 Average train loss: -7560.6750 Average bpd: 3.551
====> Epoch:  70 Average train loss: -7545.8854 Average bpd: 3.544
====> [eval] Epoch:  70 Average bpd: 3.616
====> [test] Epoch:  70 Average bpd: 3.626
Best val_bpd: 3.6155025151560034
Best test_bpd: 3.62581206378154
====> Epoch:  71 Average train loss: -7534.5708 Average bpd: 3.538
====> Epoch:  72 Average train loss: -7531.5574 Average bpd: 3.537
====> [eval] Epoch:  72 Average bpd: 3.605
====> [test] Epoch:  72 Average bpd: 3.616
Best val_bpd: 3.6052860592154787
Best test_bpd: 3.6159730125205267
====> Epoch:  73 Average train loss: -7527.8965 Average bpd: 3.535
====> Epoch:  74 Average train loss: -7520.2479 Average bpd: 3.532
====> [eval] Epoch:  74 Average bpd: 3.634
====> [test] Epoch:  74 Average bpd: 3.645
Best val_bpd: 3.6052860592154787
Best test_bpd: 3.6159730125205267
====> Epoch:  75 Average train loss: -7510.2444 Average bpd: 3.527
====> Epoch:  76 Average train loss: -7503.9035 Average bpd: 3.524
====> [eval] Epoch:  76 Average bpd: 3.602
====> [test] Epoch:  76 Average bpd: 3.612
Best val_bpd: 3.6018714184261134
Best test_bpd: 3.6124719232366562
====> Epoch:  77 Average train loss: -7501.5404 Average bpd: 3.523
====> Epoch:  78 Average train loss: -7486.2406 Average bpd: 3.516
====> [eval] Epoch:  78 Average bpd: 3.595
====> [test] Epoch:  78 Average bpd: 3.605
Best val_bpd: 3.59476951459349
Best test_bpd: 3.605487503147533
====> Epoch:  79 Average train loss: -7477.7433 Average bpd: 3.512
====> Epoch:  80 Average train loss: -7477.0725 Average bpd: 3.511
====> [eval] Epoch:  80 Average bpd: 3.594
====> [test] Epoch:  80 Average bpd: 3.605
Best val_bpd: 3.594271060977386
Best test_bpd: 3.6048968145826725
====> Epoch:  81 Average train loss: -7469.1142 Average bpd: 3.508
====> Epoch:  82 Average train loss: -7458.0398 Average bpd: 3.502
====> [eval] Epoch:  82 Average bpd: 3.600
====> [test] Epoch:  82 Average bpd: 3.611
Best val_bpd: 3.594271060977386
Best test_bpd: 3.6048968145826725
====> Epoch:  83 Average train loss: -7456.1030 Average bpd: 3.502
====> Epoch:  84 Average train loss: -7447.2413 Average bpd: 3.497
====> [eval] Epoch:  84 Average bpd: 3.586
====> [test] Epoch:  84 Average bpd: 3.597
Best val_bpd: 3.5864828433233646
Best test_bpd: 3.5973498298089295
====> Epoch:  85 Average train loss: -7443.3904 Average bpd: 3.496
====> Epoch:  86 Average train loss: -7436.2061 Average bpd: 3.492
====> [eval] Epoch:  86 Average bpd: 3.574
====> [test] Epoch:  86 Average bpd: 3.584
Best val_bpd: 3.573517143841336
Best test_bpd: 3.584312129100738
====> Epoch:  87 Average train loss: -7438.1491 Average bpd: 3.493
====> Epoch:  88 Average train loss: -7424.6682 Average bpd: 3.487
====> [eval] Epoch:  88 Average bpd: 3.568
====> [test] Epoch:  88 Average bpd: 3.579
Best val_bpd: 3.5681596164965033
Best test_bpd: 3.5785047169112025
====> Epoch:  89 Average train loss: -7418.5163 Average bpd: 3.484
====> Epoch:  90 Average train loss: -7413.1314 Average bpd: 3.481
====> [eval] Epoch:  90 Average bpd: 3.598
====> [test] Epoch:  90 Average bpd: 3.609
Best val_bpd: 3.5681596164965033
Best test_bpd: 3.5785047169112025
====> Epoch:  91 Average train loss: -7417.7368 Average bpd: 3.484
====> Epoch:  92 Average train loss: -7410.4369 Average bpd: 3.480
====> [eval] Epoch:  92 Average bpd: 3.561
====> [test] Epoch:  92 Average bpd: 3.572
Best val_bpd: 3.561268121463736
Best test_bpd: 3.571866558854142
====> Epoch:  93 Average train loss: -7403.4601 Average bpd: 3.477
====> Epoch:  94 Average train loss: -7398.5729 Average bpd: 3.475
====> [eval] Epoch:  94 Average bpd: 3.565
====> [test] Epoch:  94 Average bpd: 3.576
Best val_bpd: 3.561268121463736
Best test_bpd: 3.571866558854142
====> Epoch:  95 Average train loss: -7395.2881 Average bpd: 3.473
====> Epoch:  96 Average train loss: -7390.2137 Average bpd: 3.471
====> [eval] Epoch:  96 Average bpd: 3.561
====> [test] Epoch:  96 Average bpd: 3.572
Best val_bpd: 3.560900868114621
Best test_bpd: 3.5715929359079492
====> Epoch:  97 Average train loss: -7386.9904 Average bpd: 3.469
====> Epoch:  98 Average train loss: -7378.5028 Average bpd: 3.465
====> [eval] Epoch:  98 Average bpd: 3.548
====> [test] Epoch:  98 Average bpd: 3.559
Best val_bpd: 3.5483572626773774
Best test_bpd: 3.5589872142558088
====> Epoch:  99 Average train loss: -7377.7439 Average bpd: 3.465
====> Epoch: 100 Average train loss: -7368.7583 Average bpd: 3.461
====> [eval] Epoch: 100 Average bpd: 3.565
====> [test] Epoch: 100 Average bpd: 3.576
Best val_bpd: 3.5483572626773774
Best test_bpd: 3.5589872142558088
====> Epoch: 101 Average train loss: -7370.6512 Average bpd: 3.461
====> Epoch: 102 Average train loss: -7359.9354 Average bpd: 3.456
====> [eval] Epoch: 102 Average bpd: 3.543
====> [test] Epoch: 102 Average bpd: 3.554
Best val_bpd: 3.5434982143469282
Best test_bpd: 3.5542435691867507
====> Epoch: 103 Average train loss: -7359.0954 Average bpd: 3.456
====> Epoch: 104 Average train loss: -7349.5132 Average bpd: 3.452
====> [eval] Epoch: 104 Average bpd: 3.545
====> [test] Epoch: 104 Average bpd: 3.556
Best val_bpd: 3.5434982143469282
Best test_bpd: 3.5542435691867507
====> Epoch: 105 Average train loss: -7348.2447 Average bpd: 3.451
====> Epoch: 106 Average train loss: -7346.0025 Average bpd: 3.450
====> [eval] Epoch: 106 Average bpd: 3.530
====> [test] Epoch: 106 Average bpd: 3.541
Best val_bpd: 3.5298312661281543
Best test_bpd: 3.5408218675024394
====> Epoch: 107 Average train loss: -7340.6238 Average bpd: 3.447
====> Epoch: 108 Average train loss: -7339.8570 Average bpd: 3.447
====> [eval] Epoch: 108 Average bpd: 3.537
====> [test] Epoch: 108 Average bpd: 3.548
Best val_bpd: 3.5298312661281543
Best test_bpd: 3.5408218675024394
====> Epoch: 109 Average train loss: -7334.6852 Average bpd: 3.445
====> Epoch: 110 Average train loss: -7328.4430 Average bpd: 3.442
====> [eval] Epoch: 110 Average bpd: 3.522
====> [test] Epoch: 110 Average bpd: 3.533
Best val_bpd: 3.5216392390152262
Best test_bpd: 3.532560283929385
====> Epoch: 111 Average train loss: -7328.0774 Average bpd: 3.441
====> Epoch: 112 Average train loss: -7324.7347 Average bpd: 3.440
====> [eval] Epoch: 112 Average bpd: 3.522
====> [test] Epoch: 112 Average bpd: 3.533
Best val_bpd: 3.5216392390152262
Best test_bpd: 3.532560283929385
====> Epoch: 113 Average train loss: -7317.6365 Average bpd: 3.437
====> Epoch: 114 Average train loss: -7318.4634 Average bpd: 3.437
====> [eval] Epoch: 114 Average bpd: 3.539
====> [test] Epoch: 114 Average bpd: 3.550
Best val_bpd: 3.5216392390152262
Best test_bpd: 3.532560283929385
====> Epoch: 115 Average train loss: -7309.4921 Average bpd: 3.433
====> Epoch: 116 Average train loss: -7310.2524 Average bpd: 3.433
====> [eval] Epoch: 116 Average bpd: 3.528
====> [test] Epoch: 116 Average bpd: 3.539
Best val_bpd: 3.5216392390152262
Best test_bpd: 3.532560283929385
====> Epoch: 117 Average train loss: -7306.0284 Average bpd: 3.431
====> Epoch: 118 Average train loss: -7305.9469 Average bpd: 3.431
====> [eval] Epoch: 118 Average bpd: 3.511
====> [test] Epoch: 118 Average bpd: 3.522
Best val_bpd: 3.510867372723225
Best test_bpd: 3.5217327670993575
====> Epoch: 119 Average train loss: -7301.9492 Average bpd: 3.429
====> Epoch: 120 Average train loss: -7298.8198 Average bpd: 3.428
====> [eval] Epoch: 120 Average bpd: 3.511
====> [test] Epoch: 120 Average bpd: 3.522
Best val_bpd: 3.510867372723225
Best test_bpd: 3.5217327670993575
====> Epoch: 121 Average train loss: -7294.5593 Average bpd: 3.426
====> Epoch: 122 Average train loss: -7290.1473 Average bpd: 3.424
====> [eval] Epoch: 122 Average bpd: 3.526
====> [test] Epoch: 122 Average bpd: 3.537
Best val_bpd: 3.510867372723225
Best test_bpd: 3.5217327670993575
====> Epoch: 123 Average train loss: -7287.3464 Average bpd: 3.422
====> Epoch: 124 Average train loss: -7285.5323 Average bpd: 3.421
====> [eval] Epoch: 124 Average bpd: 3.511
====> [test] Epoch: 124 Average bpd: 3.522
Best val_bpd: 3.510867372723225
Best test_bpd: 3.5217327670993575
====> Epoch: 125 Average train loss: -7283.9682 Average bpd: 3.421
====> Epoch: 126 Average train loss: -7282.4357 Average bpd: 3.420
====> [eval] Epoch: 126 Average bpd: 3.521
====> [test] Epoch: 126 Average bpd: 3.532
Best val_bpd: 3.510867372723225
Best test_bpd: 3.5217327670993575
====> Epoch: 127 Average train loss: -7274.5715 Average bpd: 3.416
====> Epoch: 128 Average train loss: -7273.6265 Average bpd: 3.416
====> [eval] Epoch: 128 Average bpd: 3.503
====> [test] Epoch: 128 Average bpd: 3.514
Best val_bpd: 3.5030914492889114
Best test_bpd: 3.5142677454835574
====> Epoch: 129 Average train loss: -7270.4681 Average bpd: 3.414
====> Epoch: 130 Average train loss: -7267.5528 Average bpd: 3.413
====> [eval] Epoch: 130 Average bpd: 3.504
====> [test] Epoch: 130 Average bpd: 3.515
Best val_bpd: 3.5030914492889114
Best test_bpd: 3.5142677454835574
====> Epoch: 131 Average train loss: -7266.9488 Average bpd: 3.413
====> Epoch: 132 Average train loss: -7258.6613 Average bpd: 3.409
====> [eval] Epoch: 132 Average bpd: 3.503
====> [test] Epoch: 132 Average bpd: 3.514
Best val_bpd: 3.502868721596731
Best test_bpd: 3.5137276706146157
====> Epoch: 133 Average train loss: -7262.5024 Average bpd: 3.411
====> Epoch: 134 Average train loss: -7254.2724 Average bpd: 3.407
====> [eval] Epoch: 134 Average bpd: 3.503
====> [test] Epoch: 134 Average bpd: 3.514
Best val_bpd: 3.502868721596731
Best test_bpd: 3.5137276706146157
====> Epoch: 135 Average train loss: -7250.4371 Average bpd: 3.405
====> Epoch: 136 Average train loss: -7250.1055 Average bpd: 3.405
====> [eval] Epoch: 136 Average bpd: 3.499
====> [test] Epoch: 136 Average bpd: 3.510
Best val_bpd: 3.499289432454613
Best test_bpd: 3.5102198195824723
====> Epoch: 137 Average train loss: -7251.0237 Average bpd: 3.405
====> Epoch: 138 Average train loss: -7244.7700 Average bpd: 3.402
====> [eval] Epoch: 138 Average bpd: 3.503
====> [test] Epoch: 138 Average bpd: 3.514
Best val_bpd: 3.499289432454613
Best test_bpd: 3.5102198195824723
====> Epoch: 139 Average train loss: -7245.0610 Average bpd: 3.402
====> Epoch: 140 Average train loss: -7238.9314 Average bpd: 3.400
====> [eval] Epoch: 140 Average bpd: 3.495
====> [test] Epoch: 140 Average bpd: 3.506
Best val_bpd: 3.495164186201494
Best test_bpd: 3.5063349949159464
====> Epoch: 141 Average train loss: -7237.8539 Average bpd: 3.399
====> Epoch: 142 Average train loss: -7237.4555 Average bpd: 3.399
====> [eval] Epoch: 142 Average bpd: 3.504
====> [test] Epoch: 142 Average bpd: 3.515
Best val_bpd: 3.495164186201494
Best test_bpd: 3.5063349949159464
====> Epoch: 143 Average train loss: -7232.3265 Average bpd: 3.396
====> Epoch: 144 Average train loss: -7230.6760 Average bpd: 3.396
====> [eval] Epoch: 144 Average bpd: 3.512
====> [test] Epoch: 144 Average bpd: 3.523
Best val_bpd: 3.495164186201494
Best test_bpd: 3.5063349949159464
====> Epoch: 145 Average train loss: -7229.6664 Average bpd: 3.395
====> Epoch: 146 Average train loss: -7224.2013 Average bpd: 3.393
====> [eval] Epoch: 146 Average bpd: 3.489
====> [test] Epoch: 146 Average bpd: 3.499
Best val_bpd: 3.488512066346111
Best test_bpd: 3.4993280581615327
====> Epoch: 147 Average train loss: -7220.4586 Average bpd: 3.391
====> Epoch: 148 Average train loss: -7220.8058 Average bpd: 3.391
====> [eval] Epoch: 148 Average bpd: 3.488
====> [test] Epoch: 148 Average bpd: 3.499
Best val_bpd: 3.488060256311651
Best test_bpd: 3.4992910739092817
====> Epoch: 149 Average train loss: -7217.2721 Average bpd: 3.389
====> Epoch: 150 Average train loss: -7216.1488 Average bpd: 3.389
====> [eval] Epoch: 150 Average bpd: 3.479
====> [test] Epoch: 150 Average bpd: 3.490
Best val_bpd: 3.4788832446456466
Best test_bpd: 3.490018219287178
====> Epoch: 151 Average train loss: -7210.6818 Average bpd: 3.386
====> Epoch: 152 Average train loss: -7208.6035 Average bpd: 3.385
====> [eval] Epoch: 152 Average bpd: 3.479
====> [test] Epoch: 152 Average bpd: 3.490
Best val_bpd: 3.4788043314585133
Best test_bpd: 3.489909164144902
====> Epoch: 153 Average train loss: -7207.2272 Average bpd: 3.385
====> Epoch: 154 Average train loss: -7206.0320 Average bpd: 3.384
====> [eval] Epoch: 154 Average bpd: 3.489
====> [test] Epoch: 154 Average bpd: 3.500
Best val_bpd: 3.4788043314585133
Best test_bpd: 3.489909164144902
====> Epoch: 155 Average train loss: -7204.9067 Average bpd: 3.384
====> Epoch: 156 Average train loss: -7199.0096 Average bpd: 3.381
====> [eval] Epoch: 156 Average bpd: 3.487
====> [test] Epoch: 156 Average bpd: 3.498
Best val_bpd: 3.4788043314585133
Best test_bpd: 3.489909164144902
====> Epoch: 157 Average train loss: -7197.2554 Average bpd: 3.380
====> Epoch: 158 Average train loss: -7196.8736 Average bpd: 3.380
====> [eval] Epoch: 158 Average bpd: 3.470
====> [test] Epoch: 158 Average bpd: 3.481
Best val_bpd: 3.4699463664771537
Best test_bpd: 3.480726214360619
====> Epoch: 159 Average train loss: -7192.8463 Average bpd: 3.378
====> Epoch: 160 Average train loss: -7194.5724 Average bpd: 3.379
====> [eval] Epoch: 160 Average bpd: 3.477
====> [test] Epoch: 160 Average bpd: 3.488
Best val_bpd: 3.4699463664771537
Best test_bpd: 3.480726214360619
====> Epoch: 161 Average train loss: -7188.7692 Average bpd: 3.376
====> Epoch: 162 Average train loss: -7191.8547 Average bpd: 3.377
====> [eval] Epoch: 162 Average bpd: 3.477
====> [test] Epoch: 162 Average bpd: 3.488
Best val_bpd: 3.4699463664771537
Best test_bpd: 3.480726214360619
====> Epoch: 163 Average train loss: -7188.7261 Average bpd: 3.376
====> Epoch: 164 Average train loss: -7180.8013 Average bpd: 3.372
====> [eval] Epoch: 164 Average bpd: 3.468
====> [test] Epoch: 164 Average bpd: 3.479
Best val_bpd: 3.4680186673685225
Best test_bpd: 3.4792214996879656
====> Epoch: 165 Average train loss: -7184.2824 Average bpd: 3.374
====> Epoch: 166 Average train loss: -7181.7343 Average bpd: 3.373
====> [eval] Epoch: 166 Average bpd: 3.471
====> [test] Epoch: 166 Average bpd: 3.482
Best val_bpd: 3.4680186673685225
Best test_bpd: 3.4792214996879656
====> Epoch: 167 Average train loss: -7181.2079 Average bpd: 3.372
====> Epoch: 168 Average train loss: -7175.1393 Average bpd: 3.370
====> [eval] Epoch: 168 Average bpd: 3.479
====> [test] Epoch: 168 Average bpd: 3.490
Best val_bpd: 3.4680186673685225
Best test_bpd: 3.4792214996879656
====> Epoch: 169 Average train loss: -7180.0768 Average bpd: 3.372
====> Epoch: 170 Average train loss: -7173.2022 Average bpd: 3.369
====> [eval] Epoch: 170 Average bpd: 3.465
====> [test] Epoch: 170 Average bpd: 3.476
Best val_bpd: 3.464911414001138
Best test_bpd: 3.476025321130771
====> Epoch: 171 Average train loss: -7174.0605 Average bpd: 3.369
====> Epoch: 172 Average train loss: -7169.2551 Average bpd: 3.367
====> [eval] Epoch: 172 Average bpd: 3.470
====> [test] Epoch: 172 Average bpd: 3.481
Best val_bpd: 3.464911414001138
Best test_bpd: 3.476025321130771
====> Epoch: 173 Average train loss: -7167.8783 Average bpd: 3.366
====> Epoch: 174 Average train loss: -7166.7615 Average bpd: 3.366
====> [eval] Epoch: 174 Average bpd: 3.465
====> [test] Epoch: 174 Average bpd: 3.476
Best val_bpd: 3.464911414001138
Best test_bpd: 3.476025321130771
====> Epoch: 175 Average train loss: -7162.9282 Average bpd: 3.364
====> Epoch: 176 Average train loss: -7164.5490 Average bpd: 3.365
====> [eval] Epoch: 176 Average bpd: 3.463
====> [test] Epoch: 176 Average bpd: 3.474
Best val_bpd: 3.4631949531276773
Best test_bpd: 3.4742203682896227
====> Epoch: 177 Average train loss: -7161.2415 Average bpd: 3.363
====> Epoch: 178 Average train loss: -7160.4876 Average bpd: 3.363
====> [eval] Epoch: 178 Average bpd: 3.467
====> [test] Epoch: 178 Average bpd: 3.478
Best val_bpd: 3.4631949531276773
Best test_bpd: 3.4742203682896227
====> Epoch: 179 Average train loss: -7158.7767 Average bpd: 3.362
====> Epoch: 180 Average train loss: -7154.9953 Average bpd: 3.360
====> [eval] Epoch: 180 Average bpd: 3.450
====> [test] Epoch: 180 Average bpd: 3.461
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 181 Average train loss: -7154.1948 Average bpd: 3.360
====> Epoch: 182 Average train loss: -7154.1772 Average bpd: 3.360
====> [eval] Epoch: 182 Average bpd: 3.453
====> [test] Epoch: 182 Average bpd: 3.464
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 183 Average train loss: -7150.1180 Average bpd: 3.358
====> Epoch: 184 Average train loss: -7152.2793 Average bpd: 3.359
====> [eval] Epoch: 184 Average bpd: 3.462
====> [test] Epoch: 184 Average bpd: 3.474
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 185 Average train loss: -7149.5352 Average bpd: 3.358
====> Epoch: 186 Average train loss: -7144.7423 Average bpd: 3.355
====> [eval] Epoch: 186 Average bpd: 3.455
====> [test] Epoch: 186 Average bpd: 3.467
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 187 Average train loss: -7143.5475 Average bpd: 3.355
====> Epoch: 188 Average train loss: -7144.2949 Average bpd: 3.355
====> [eval] Epoch: 188 Average bpd: 3.458
====> [test] Epoch: 188 Average bpd: 3.469
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 189 Average train loss: -7141.7491 Average bpd: 3.354
====> Epoch: 190 Average train loss: -7138.1865 Average bpd: 3.352
====> [eval] Epoch: 190 Average bpd: 3.458
====> [test] Epoch: 190 Average bpd: 3.469
Best val_bpd: 3.4501926960845313
Best test_bpd: 3.4614500062670457
====> Epoch: 191 Average train loss: -7138.7761 Average bpd: 3.353
====> Epoch: 192 Average train loss: -7137.1826 Average bpd: 3.352
====> [eval] Epoch: 192 Average bpd: 3.445
====> [test] Epoch: 192 Average bpd: 3.456
Best val_bpd: 3.4451769323732266
Best test_bpd: 3.456332529693892
====> Epoch: 193 Average train loss: -7135.5643 Average bpd: 3.351
====> Epoch: 194 Average train loss: -7133.5556 Average bpd: 3.350
====> [eval] Epoch: 194 Average bpd: 3.456
====> [test] Epoch: 194 Average bpd: 3.467
Best val_bpd: 3.4451769323732266
Best test_bpd: 3.456332529693892
====> Epoch: 195 Average train loss: -7132.5487 Average bpd: 3.350
====> Epoch: 196 Average train loss: -7129.3648 Average bpd: 3.348
====> [eval] Epoch: 196 Average bpd: 3.445
====> [test] Epoch: 196 Average bpd: 3.456
Best val_bpd: 3.444959732072825
Best test_bpd: 3.4561821152639105
====> Epoch: 197 Average train loss: -7129.2027 Average bpd: 3.348
====> Epoch: 198 Average train loss: -7128.2101 Average bpd: 3.348
====> [eval] Epoch: 198 Average bpd: 3.458
====> [test] Epoch: 198 Average bpd: 3.470
Best val_bpd: 3.444959732072825
Best test_bpd: 3.4561821152639105
====> Epoch: 199 Average train loss: -7126.4127 Average bpd: 3.347
====> Epoch: 200 Average train loss: -7126.4740 Average bpd: 3.347
====> [eval] Epoch: 200 Average bpd: 3.450
====> [test] Epoch: 200 Average bpd: 3.461
Best val_bpd: 3.444959732072825
Best test_bpd: 3.4561821152639105
====> Epoch: 201 Average train loss: -7122.5502 Average bpd: 3.345
====> Epoch: 202 Average train loss: -7122.1585 Average bpd: 3.345
====> [eval] Epoch: 202 Average bpd: 3.444
====> [test] Epoch: 202 Average bpd: 3.456
Best val_bpd: 3.444393960673716
Best test_bpd: 3.4557105359325844
====> Epoch: 203 Average train loss: -7120.0183 Average bpd: 3.344
====> Epoch: 204 Average train loss: -7120.8260 Average bpd: 3.344
====> [eval] Epoch: 204 Average bpd: 3.447
====> [test] Epoch: 204 Average bpd: 3.458
Best val_bpd: 3.444393960673716
Best test_bpd: 3.4557105359325844
====> Epoch: 205 Average train loss: -7113.1644 Average bpd: 3.341
====> Epoch: 206 Average train loss: -7118.1049 Average bpd: 3.343
====> [eval] Epoch: 206 Average bpd: 3.449
====> [test] Epoch: 206 Average bpd: 3.460
Best val_bpd: 3.444393960673716
Best test_bpd: 3.4557105359325844
====> Epoch: 207 Average train loss: -7116.1070 Average bpd: 3.342
====> Epoch: 208 Average train loss: -7115.6178 Average bpd: 3.342
====> [eval] Epoch: 208 Average bpd: 3.445
====> [test] Epoch: 208 Average bpd: 3.456
Best val_bpd: 3.444393960673716
Best test_bpd: 3.4557105359325844
====> Epoch: 209 Average train loss: -7111.7608 Average bpd: 3.340
====> Epoch: 210 Average train loss: -7109.3072 Average bpd: 3.339
====> [eval] Epoch: 210 Average bpd: 3.436
====> [test] Epoch: 210 Average bpd: 3.448
Best val_bpd: 3.4363690404672154
Best test_bpd: 3.4478054339345126
====> Epoch: 211 Average train loss: -7111.0588 Average bpd: 3.340
====> Epoch: 212 Average train loss: -7108.6481 Average bpd: 3.338
====> [eval] Epoch: 212 Average bpd: 3.443
====> [test] Epoch: 212 Average bpd: 3.454
Best val_bpd: 3.4363690404672154
Best test_bpd: 3.4478054339345126
====> Epoch: 213 Average train loss: -7103.3531 Average bpd: 3.336
====> Epoch: 214 Average train loss: -7106.4017 Average bpd: 3.337
====> [eval] Epoch: 214 Average bpd: 3.446
====> [test] Epoch: 214 Average bpd: 3.457
Best val_bpd: 3.4363690404672154
Best test_bpd: 3.4478054339345126
====> Epoch: 215 Average train loss: -7104.9176 Average bpd: 3.337
====> Epoch: 216 Average train loss: -7102.4072 Average bpd: 3.335
====> [eval] Epoch: 216 Average bpd: 3.435
====> [test] Epoch: 216 Average bpd: 3.446
Best val_bpd: 3.435442987184504
Best test_bpd: 3.4464450954317645
====> Epoch: 217 Average train loss: -7101.0778 Average bpd: 3.335
====> Epoch: 218 Average train loss: -7099.3755 Average bpd: 3.334
====> [eval] Epoch: 218 Average bpd: 3.434
====> [test] Epoch: 218 Average bpd: 3.445
Best val_bpd: 3.434015849023699
Best test_bpd: 3.4452826138980193
====> Epoch: 219 Average train loss: -7097.5250 Average bpd: 3.333
====> Epoch: 220 Average train loss: -7095.6168 Average bpd: 3.332
====> [eval] Epoch: 220 Average bpd: 3.435
====> [test] Epoch: 220 Average bpd: 3.446
Best val_bpd: 3.434015849023699
Best test_bpd: 3.4452826138980193
====> Epoch: 221 Average train loss: -7098.1478 Average bpd: 3.333
====> Epoch: 222 Average train loss: -7095.9386 Average bpd: 3.332
====> [eval] Epoch: 222 Average bpd: 3.441
====> [test] Epoch: 222 Average bpd: 3.452
Best val_bpd: 3.434015849023699
Best test_bpd: 3.4452826138980193
====> Epoch: 223 Average train loss: -7094.3262 Average bpd: 3.332
====> Epoch: 224 Average train loss: -7091.4159 Average bpd: 3.330
====> [eval] Epoch: 224 Average bpd: 3.433
====> [test] Epoch: 224 Average bpd: 3.444
Best val_bpd: 3.4328156401565413
Best test_bpd: 3.443941061415801
====> Epoch: 225 Average train loss: -7092.5098 Average bpd: 3.331
====> Epoch: 226 Average train loss: -7089.8183 Average bpd: 3.330
====> [eval] Epoch: 226 Average bpd: 3.440
====> [test] Epoch: 226 Average bpd: 3.451
Best val_bpd: 3.4328156401565413
Best test_bpd: 3.443941061415801
====> Epoch: 227 Average train loss: -7089.0425 Average bpd: 3.329
====> Epoch: 228 Average train loss: -7088.2814 Average bpd: 3.329
====> [eval] Epoch: 228 Average bpd: 3.434
====> [test] Epoch: 228 Average bpd: 3.445
Best val_bpd: 3.4328156401565413
Best test_bpd: 3.443941061415801
====> Epoch: 229 Average train loss: -7086.7875 Average bpd: 3.328
====> Epoch: 230 Average train loss: -7087.7477 Average bpd: 3.329
====> [eval] Epoch: 230 Average bpd: 3.431
====> [test] Epoch: 230 Average bpd: 3.442
Best val_bpd: 3.4306584022622237
Best test_bpd: 3.441884556009655
====> Epoch: 231 Average train loss: -7085.6480 Average bpd: 3.328
====> Epoch: 232 Average train loss: -7082.8447 Average bpd: 3.326
====> [eval] Epoch: 232 Average bpd: 3.429
====> [test] Epoch: 232 Average bpd: 3.440
Best val_bpd: 3.42858383069542
Best test_bpd: 3.4398521289002946
====> Epoch: 233 Average train loss: -7083.4539 Average bpd: 3.327
====> Epoch: 234 Average train loss: -7081.7533 Average bpd: 3.326
====> [eval] Epoch: 234 Average bpd: 3.425
====> [test] Epoch: 234 Average bpd: 3.436
Best val_bpd: 3.424780094760278
Best test_bpd: 3.4359300572907174
====> Epoch: 235 Average train loss: -7080.1630 Average bpd: 3.325
====> Epoch: 236 Average train loss: -7079.3531 Average bpd: 3.325
====> [eval] Epoch: 236 Average bpd: 3.432
====> [test] Epoch: 236 Average bpd: 3.443
Best val_bpd: 3.424780094760278
Best test_bpd: 3.4359300572907174
====> Epoch: 237 Average train loss: -7076.4176 Average bpd: 3.323
====> Epoch: 238 Average train loss: -7075.1428 Average bpd: 3.323
====> [eval] Epoch: 238 Average bpd: 3.424
====> [test] Epoch: 238 Average bpd: 3.435
Best val_bpd: 3.4236344624466626
Best test_bpd: 3.434948770001801
====> Epoch: 239 Average train loss: -7075.1637 Average bpd: 3.323
====> Epoch: 240 Average train loss: -7073.7926 Average bpd: 3.322
====> [eval] Epoch: 240 Average bpd: 3.436
====> [test] Epoch: 240 Average bpd: 3.447
Best val_bpd: 3.4236344624466626
Best test_bpd: 3.434948770001801
====> Epoch: 241 Average train loss: -7070.3168 Average bpd: 3.320
====> Epoch: 242 Average train loss: -7070.1179 Average bpd: 3.320
====> [eval] Epoch: 242 Average bpd: 3.433
====> [test] Epoch: 242 Average bpd: 3.444
Best val_bpd: 3.4236344624466626
Best test_bpd: 3.434948770001801
====> Epoch: 243 Average train loss: -7069.1263 Average bpd: 3.320
====> Epoch: 244 Average train loss: -7070.8220 Average bpd: 3.321
====> [eval] Epoch: 244 Average bpd: 3.431
====> [test] Epoch: 244 Average bpd: 3.442
Best val_bpd: 3.4236344624466626
Best test_bpd: 3.434948770001801
====> Epoch: 245 Average train loss: -7068.5552 Average bpd: 3.320
====> Epoch: 246 Average train loss: -7069.3627 Average bpd: 3.320
====> [eval] Epoch: 246 Average bpd: 3.424
====> [test] Epoch: 246 Average bpd: 3.436
Best val_bpd: 3.4236344624466626
Best test_bpd: 3.434948770001801
====> Epoch: 247 Average train loss: -7066.6698 Average bpd: 3.319
====> Epoch: 248 Average train loss: -7065.9436 Average bpd: 3.318
====> [eval] Epoch: 248 Average bpd: 3.423
====> [test] Epoch: 248 Average bpd: 3.434
Best val_bpd: 3.4225490385741493
Best test_bpd: 3.4338251594498113
====> Epoch: 249 Average train loss: -7062.8191 Average bpd: 3.317
====> Epoch: 250 Average train loss: -7063.7289 Average bpd: 3.317
====> [eval] Epoch: 250 Average bpd: 3.424
====> [test] Epoch: 250 Average bpd: 3.435
Best val_bpd: 3.4225490385741493
Best test_bpd: 3.4338251594498113
====> Epoch: 251 Average train loss: -7061.2631 Average bpd: 3.316
====> Epoch: 252 Average train loss: -7061.5913 Average bpd: 3.316
====> [eval] Epoch: 252 Average bpd: 3.429
====> [test] Epoch: 252 Average bpd: 3.440
Best val_bpd: 3.4225490385741493
Best test_bpd: 3.4338251594498113
====> Epoch: 253 Average train loss: -7060.1919 Average bpd: 3.316
====> Epoch: 254 Average train loss: -7057.6686 Average bpd: 3.314
====> [eval] Epoch: 254 Average bpd: 3.417
====> [test] Epoch: 254 Average bpd: 3.428
Best val_bpd: 3.416725230705234
Best test_bpd: 3.427798849628989
====> Epoch: 255 Average train loss: -7056.0063 Average bpd: 3.314
====> Epoch: 256 Average train loss: -7056.2699 Average bpd: 3.314
====> [eval] Epoch: 256 Average bpd: 3.420
====> [test] Epoch: 256 Average bpd: 3.432
Best val_bpd: 3.416725230705234
Best test_bpd: 3.427798849628989
====> Epoch: 257 Average train loss: -7057.1776 Average bpd: 3.314
====> Epoch: 258 Average train loss: -7052.4407 Average bpd: 3.312
====> [eval] Epoch: 258 Average bpd: 3.426
====> [test] Epoch: 258 Average bpd: 3.437
Best val_bpd: 3.416725230705234
Best test_bpd: 3.427798849628989
====> Epoch: 259 Average train loss: -7052.6130 Average bpd: 3.312
====> Epoch: 260 Average train loss: -7051.4660 Average bpd: 3.312
====> [eval] Epoch: 260 Average bpd: 3.415
====> [test] Epoch: 260 Average bpd: 3.426
Best val_bpd: 3.4148210598391726
Best test_bpd: 3.4258728340636058
====> Epoch: 261 Average train loss: -7052.2350 Average bpd: 3.312
====> Epoch: 262 Average train loss: -7049.4860 Average bpd: 3.311
====> [eval] Epoch: 262 Average bpd: 3.418
====> [test] Epoch: 262 Average bpd: 3.429
Best val_bpd: 3.4148210598391726
Best test_bpd: 3.4258728340636058
====> Epoch: 263 Average train loss: -7050.5396 Average bpd: 3.311
====> Epoch: 264 Average train loss: -7048.8981 Average bpd: 3.310
====> [eval] Epoch: 264 Average bpd: 3.419
====> [test] Epoch: 264 Average bpd: 3.431
Best val_bpd: 3.4148210598391726
Best test_bpd: 3.4258728340636058
====> Epoch: 265 Average train loss: -7048.3380 Average bpd: 3.310
====> Epoch: 266 Average train loss: -7048.2657 Average bpd: 3.310
====> [eval] Epoch: 266 Average bpd: 3.415
====> [test] Epoch: 266 Average bpd: 3.426
Best val_bpd: 3.4148210598391726
Best test_bpd: 3.4258728340636058
====> Epoch: 267 Average train loss: -7045.5341 Average bpd: 3.309
====> Epoch: 268 Average train loss: -7045.6139 Average bpd: 3.309
====> [eval] Epoch: 268 Average bpd: 3.422
====> [test] Epoch: 268 Average bpd: 3.434
Best val_bpd: 3.4148210598391726
Best test_bpd: 3.4258728340636058
====> Epoch: 269 Average train loss: -7047.0444 Average bpd: 3.309
====> Epoch: 270 Average train loss: -7043.1467 Average bpd: 3.308
====> [eval] Epoch: 270 Average bpd: 3.414
====> [test] Epoch: 270 Average bpd: 3.425
Best val_bpd: 3.413584534331741
Best test_bpd: 3.4246052566168794
====> Epoch: 271 Average train loss: -7040.3431 Average bpd: 3.306
====> Epoch: 272 Average train loss: -7039.6690 Average bpd: 3.306
====> [eval] Epoch: 272 Average bpd: 3.417
====> [test] Epoch: 272 Average bpd: 3.428
Best val_bpd: 3.413584534331741
Best test_bpd: 3.4246052566168794
====> Epoch: 273 Average train loss: -7041.9124 Average bpd: 3.307
====> Epoch: 274 Average train loss: -7041.9663 Average bpd: 3.307
====> [eval] Epoch: 274 Average bpd: 3.417
====> [test] Epoch: 274 Average bpd: 3.428
Best val_bpd: 3.413584534331741
Best test_bpd: 3.4246052566168794
====> Epoch: 275 Average train loss: -7040.9278 Average bpd: 3.307
====> Epoch: 276 Average train loss: -7039.9374 Average bpd: 3.306
====> [eval] Epoch: 276 Average bpd: 3.409
====> [test] Epoch: 276 Average bpd: 3.420
Best val_bpd: 3.409216382540071
Best test_bpd: 3.420239850742302
====> Epoch: 277 Average train loss: -7037.0515 Average bpd: 3.305
====> Epoch: 278 Average train loss: -7033.7863 Average bpd: 3.303
====> [eval] Epoch: 278 Average bpd: 3.409
====> [test] Epoch: 278 Average bpd: 3.420
Best val_bpd: 3.409216382540071
Best test_bpd: 3.420239850742302
====> Epoch: 279 Average train loss: -7036.7618 Average bpd: 3.305
====> Epoch: 280 Average train loss: -7033.6604 Average bpd: 3.303
====> [eval] Epoch: 280 Average bpd: 3.416
====> [test] Epoch: 280 Average bpd: 3.428
Best val_bpd: 3.409216382540071
Best test_bpd: 3.420239850742302
====> Epoch: 281 Average train loss: -7035.4296 Average bpd: 3.304
====> Epoch: 282 Average train loss: -7031.9612 Average bpd: 3.302
====> [eval] Epoch: 282 Average bpd: 3.415
====> [test] Epoch: 282 Average bpd: 3.426
Best val_bpd: 3.409216382540071
Best test_bpd: 3.420239850742302
====> Epoch: 283 Average train loss: -7031.2792 Average bpd: 3.302
====> Epoch: 284 Average train loss: -7030.2193 Average bpd: 3.302
====> [eval] Epoch: 284 Average bpd: 3.405
====> [test] Epoch: 284 Average bpd: 3.416
Best val_bpd: 3.405145613424359
Best test_bpd: 3.416374704664494
====> Epoch: 285 Average train loss: -7029.3281 Average bpd: 3.301
====> Epoch: 286 Average train loss: -7029.5026 Average bpd: 3.301
====> [eval] Epoch: 286 Average bpd: 3.412
====> [test] Epoch: 286 Average bpd: 3.423
Best val_bpd: 3.405145613424359
Best test_bpd: 3.416374704664494
====> Epoch: 287 Average train loss: -7027.3477 Average bpd: 3.300
====> Epoch: 288 Average train loss: -7027.5728 Average bpd: 3.300
====> [eval] Epoch: 288 Average bpd: 3.410
====> [test] Epoch: 288 Average bpd: 3.421
Best val_bpd: 3.405145613424359
Best test_bpd: 3.416374704664494
====> Epoch: 289 Average train loss: -7027.6456 Average bpd: 3.300
====> Epoch: 290 Average train loss: -7026.3641 Average bpd: 3.300
====> [eval] Epoch: 290 Average bpd: 3.402
====> [test] Epoch: 290 Average bpd: 3.414
Best val_bpd: 3.4024674154079615
Best test_bpd: 3.4136933971727412
====> Epoch: 291 Average train loss: -7025.5862 Average bpd: 3.299
====> Epoch: 292 Average train loss: -7025.1628 Average bpd: 3.299
====> [eval] Epoch: 292 Average bpd: 3.409
====> [test] Epoch: 292 Average bpd: 3.420
Best val_bpd: 3.4024674154079615
Best test_bpd: 3.4136933971727412
====> Epoch: 293 Average train loss: -7022.1723 Average bpd: 3.298
====> Epoch: 294 Average train loss: -7022.1985 Average bpd: 3.298
====> [eval] Epoch: 294 Average bpd: 3.408
====> [test] Epoch: 294 Average bpd: 3.420
Best val_bpd: 3.4024674154079615
Best test_bpd: 3.4136933971727412
====> Epoch: 295 Average train loss: -7022.6371 Average bpd: 3.298
====> Epoch: 296 Average train loss: -7020.8801 Average bpd: 3.297
====> [eval] Epoch: 296 Average bpd: 3.410
====> [test] Epoch: 296 Average bpd: 3.421
Best val_bpd: 3.4024674154079615
Best test_bpd: 3.4136933971727412
====> Epoch: 297 Average train loss: -7020.9249 Average bpd: 3.297
====> Epoch: 298 Average train loss: -7019.7283 Average bpd: 3.297
====> [eval] Epoch: 298 Average bpd: 3.408
====> [test] Epoch: 298 Average bpd: 3.419
Best val_bpd: 3.4024674154079615
Best test_bpd: 3.4136933971727412
====> Epoch: 299 Average train loss: -7018.3087 Average bpd: 3.296
====> Epoch: 300 Average train loss: -7016.7037 Average bpd: 3.295
====> [eval] Epoch: 300 Average bpd: 3.401
====> [test] Epoch: 300 Average bpd: 3.412
Best val_bpd: 3.4012117178261185
Best test_bpd: 3.4124231906410185
====> Epoch: 301 Average train loss: -7018.1967 Average bpd: 3.296
====> Epoch: 302 Average train loss: -7016.7481 Average bpd: 3.295
====> [eval] Epoch: 302 Average bpd: 3.409
====> [test] Epoch: 302 Average bpd: 3.420
Best val_bpd: 3.4012117178261185
Best test_bpd: 3.4124231906410185
====> Epoch: 303 Average train loss: -7015.3557 Average bpd: 3.295
====> Epoch: 304 Average train loss: -7015.9422 Average bpd: 3.295
====> [eval] Epoch: 304 Average bpd: 3.412
====> [test] Epoch: 304 Average bpd: 3.423
Best val_bpd: 3.4012117178261185
Best test_bpd: 3.4124231906410185
====> Epoch: 305 Average train loss: -7013.4137 Average bpd: 3.294
====> Epoch: 306 Average train loss: -7013.6545 Average bpd: 3.294
====> [eval] Epoch: 306 Average bpd: 3.402
====> [test] Epoch: 306 Average bpd: 3.413
Best val_bpd: 3.4012117178261185
Best test_bpd: 3.4124231906410185
====> Epoch: 307 Average train loss: -7013.1393 Average bpd: 3.294
====> Epoch: 308 Average train loss: -7009.6175 Average bpd: 3.292
====> [eval] Epoch: 308 Average bpd: 3.415
====> [test] Epoch: 308 Average bpd: 3.426
Best val_bpd: 3.4012117178261185
Best test_bpd: 3.4124231906410185
====> Epoch: 309 Average train loss: -7011.0090 Average bpd: 3.293
====> Epoch: 310 Average train loss: -7009.1124 Average bpd: 3.292
====> [eval] Epoch: 310 Average bpd: 3.399
====> [test] Epoch: 310 Average bpd: 3.410
Best val_bpd: 3.3988100628326516
Best test_bpd: 3.409864554673235
====> Epoch: 311 Average train loss: -7010.8844 Average bpd: 3.293
====> Epoch: 312 Average train loss: -7009.1860 Average bpd: 3.292
====> [eval] Epoch: 312 Average bpd: 3.398
====> [test] Epoch: 312 Average bpd: 3.409
Best val_bpd: 3.398059526190295
Best test_bpd: 3.408995086585319
====> Epoch: 313 Average train loss: -7009.1484 Average bpd: 3.292
====> Epoch: 314 Average train loss: -7008.2780 Average bpd: 3.291
====> [eval] Epoch: 314 Average bpd: 3.398
====> [test] Epoch: 314 Average bpd: 3.409
Best val_bpd: 3.3980578172487634
Best test_bpd: 3.4092668177223504
====> Epoch: 315 Average train loss: -7006.6023 Average bpd: 3.290
====> Epoch: 316 Average train loss: -7006.9809 Average bpd: 3.291
====> [eval] Epoch: 316 Average bpd: 3.400
====> [test] Epoch: 316 Average bpd: 3.411
Best val_bpd: 3.3980578172487634
Best test_bpd: 3.4092668177223504
====> Epoch: 317 Average train loss: -7004.5845 Average bpd: 3.290
====> Epoch: 318 Average train loss: -7006.1418 Average bpd: 3.290
====> [eval] Epoch: 318 Average bpd: 3.399
====> [test] Epoch: 318 Average bpd: 3.411
Best val_bpd: 3.3980578172487634
Best test_bpd: 3.4092668177223504
====> Epoch: 319 Average train loss: -7002.2859 Average bpd: 3.288
====> Epoch: 320 Average train loss: -7000.8483 Average bpd: 3.288
====> [eval] Epoch: 320 Average bpd: 3.401
====> [test] Epoch: 320 Average bpd: 3.413
Best val_bpd: 3.3980578172487634
Best test_bpd: 3.4092668177223504
====> Epoch: 321 Average train loss: -7002.3469 Average bpd: 3.288
====> Epoch: 322 Average train loss: -7001.3991 Average bpd: 3.288
====> [eval] Epoch: 322 Average bpd: 3.396
====> [test] Epoch: 322 Average bpd: 3.407
Best val_bpd: 3.3959662818067318
Best test_bpd: 3.4073834444207347
====> Epoch: 323 Average train loss: -6999.6491 Average bpd: 3.287
====> Epoch: 324 Average train loss: -7000.1731 Average bpd: 3.287
====> [eval] Epoch: 324 Average bpd: 3.402
====> [test] Epoch: 324 Average bpd: 3.414
Best val_bpd: 3.3959662818067318
Best test_bpd: 3.4073834444207347
====> Epoch: 325 Average train loss: -7000.3023 Average bpd: 3.288
====> Epoch: 326 Average train loss: -6996.7168 Average bpd: 3.286
====> [eval] Epoch: 326 Average bpd: 3.395
====> [test] Epoch: 326 Average bpd: 3.407
Best val_bpd: 3.395455090589486
Best test_bpd: 3.406854647114564
====> Epoch: 327 Average train loss: -6995.0029 Average bpd: 3.285
====> Epoch: 328 Average train loss: -6995.5576 Average bpd: 3.285
====> [eval] Epoch: 328 Average bpd: 3.395
====> [test] Epoch: 328 Average bpd: 3.405
Best val_bpd: 3.3945828141772707
Best test_bpd: 3.405420033150933
====> Epoch: 329 Average train loss: -6994.6170 Average bpd: 3.285
====> Epoch: 330 Average train loss: -6995.9128 Average bpd: 3.285
====> [eval] Epoch: 330 Average bpd: 3.399
====> [test] Epoch: 330 Average bpd: 3.410
Best val_bpd: 3.3945828141772707
Best test_bpd: 3.405420033150933
====> Epoch: 331 Average train loss: -6994.5965 Average bpd: 3.285
====> Epoch: 332 Average train loss: -6993.5459 Average bpd: 3.284
====> [eval] Epoch: 332 Average bpd: 3.397
====> [test] Epoch: 332 Average bpd: 3.409
Best val_bpd: 3.3945828141772707
Best test_bpd: 3.405420033150933
====> Epoch: 333 Average train loss: -6992.7735 Average bpd: 3.284
====> Epoch: 334 Average train loss: -6992.3148 Average bpd: 3.284
====> [eval] Epoch: 334 Average bpd: 3.398
====> [test] Epoch: 334 Average bpd: 3.409
Best val_bpd: 3.3945828141772707
Best test_bpd: 3.405420033150933
====> Epoch: 335 Average train loss: -6991.5316 Average bpd: 3.283
====> Epoch: 336 Average train loss: -6990.6438 Average bpd: 3.283
====> [eval] Epoch: 336 Average bpd: 3.394
====> [test] Epoch: 336 Average bpd: 3.406
Best val_bpd: 3.3943041239110316
Best test_bpd: 3.4055954750477744
====> Epoch: 337 Average train loss: -6989.6169 Average bpd: 3.283
====> Epoch: 338 Average train loss: -6988.2409 Average bpd: 3.282
====> [eval] Epoch: 338 Average bpd: 3.398
====> [test] Epoch: 338 Average bpd: 3.410
Best val_bpd: 3.3943041239110316
Best test_bpd: 3.4055954750477744
====> Epoch: 339 Average train loss: -6989.9755 Average bpd: 3.283
====> Epoch: 340 Average train loss: -6985.4990 Average bpd: 3.281
====> [eval] Epoch: 340 Average bpd: 3.393
====> [test] Epoch: 340 Average bpd: 3.404
Best val_bpd: 3.3928118536707816
Best test_bpd: 3.4040217155089527
====> Epoch: 341 Average train loss: -6987.2066 Average bpd: 3.281
====> Epoch: 342 Average train loss: -6986.3457 Average bpd: 3.281
====> [eval] Epoch: 342 Average bpd: 3.391
====> [test] Epoch: 342 Average bpd: 3.402
Best val_bpd: 3.3907254566639846
Best test_bpd: 3.4019058869732386
====> Epoch: 343 Average train loss: -6986.7552 Average bpd: 3.281
====> Epoch: 344 Average train loss: -6983.7810 Average bpd: 3.280
====> [eval] Epoch: 344 Average bpd: 3.392
====> [test] Epoch: 344 Average bpd: 3.403
Best val_bpd: 3.3907254566639846
Best test_bpd: 3.4019058869732386
====> Epoch: 345 Average train loss: -6984.7461 Average bpd: 3.280
====> Epoch: 346 Average train loss: -6983.9269 Average bpd: 3.280
====> [eval] Epoch: 346 Average bpd: 3.392
====> [test] Epoch: 346 Average bpd: 3.403
Best val_bpd: 3.3907254566639846
Best test_bpd: 3.4019058869732386
====> Epoch: 347 Average train loss: -6983.5980 Average bpd: 3.280
====> Epoch: 348 Average train loss: -6982.9572 Average bpd: 3.279
====> [eval] Epoch: 348 Average bpd: 3.392
====> [test] Epoch: 348 Average bpd: 3.404
Best val_bpd: 3.3907254566639846
Best test_bpd: 3.4019058869732386
====> Epoch: 349 Average train loss: -6982.1398 Average bpd: 3.279
====> Epoch: 350 Average train loss: -6982.1130 Average bpd: 3.279
====> [eval] Epoch: 350 Average bpd: 3.389
====> [test] Epoch: 350 Average bpd: 3.400
Best val_bpd: 3.3889326906440957
Best test_bpd: 3.4004136268922998
====> Epoch: 351 Average train loss: -6980.9000 Average bpd: 3.278
====> Epoch: 352 Average train loss: -6980.7196 Average bpd: 3.278
====> [eval] Epoch: 352 Average bpd: 3.396
====> [test] Epoch: 352 Average bpd: 3.407
Best val_bpd: 3.3889326906440957
Best test_bpd: 3.4004136268922998
====> Epoch: 353 Average train loss: -6978.2035 Average bpd: 3.277
====> Epoch: 354 Average train loss: -6978.7147 Average bpd: 3.277
====> [eval] Epoch: 354 Average bpd: 3.386
====> [test] Epoch: 354 Average bpd: 3.398
Best val_bpd: 3.3863004373937593
Best test_bpd: 3.39781308375959
====> Epoch: 355 Average train loss: -6977.4921 Average bpd: 3.277
====> Epoch: 356 Average train loss: -6979.1058 Average bpd: 3.278
====> [eval] Epoch: 356 Average bpd: 3.388
====> [test] Epoch: 356 Average bpd: 3.399
Best val_bpd: 3.3863004373937593
Best test_bpd: 3.39781308375959
====> Epoch: 357 Average train loss: -6976.4888 Average bpd: 3.276
====> Epoch: 358 Average train loss: -6975.3499 Average bpd: 3.276
====> [eval] Epoch: 358 Average bpd: 3.388
====> [test] Epoch: 358 Average bpd: 3.400
Best val_bpd: 3.3863004373937593
Best test_bpd: 3.39781308375959
====> Epoch: 359 Average train loss: -6976.3936 Average bpd: 3.276
====> Epoch: 360 Average train loss: -6974.2235 Average bpd: 3.275
====> [eval] Epoch: 360 Average bpd: 3.388
====> [test] Epoch: 360 Average bpd: 3.399
Best val_bpd: 3.3863004373937593
Best test_bpd: 3.39781308375959
====> Epoch: 361 Average train loss: -6973.8949 Average bpd: 3.275
====> Epoch: 362 Average train loss: -6975.0953 Average bpd: 3.276
====> [eval] Epoch: 362 Average bpd: 3.392
====> [test] Epoch: 362 Average bpd: 3.404
Best val_bpd: 3.3863004373937593
Best test_bpd: 3.39781308375959
====> Epoch: 363 Average train loss: -6973.4940 Average bpd: 3.275
====> Epoch: 364 Average train loss: -6972.8902 Average bpd: 3.275
====> [eval] Epoch: 364 Average bpd: 3.384
====> [test] Epoch: 364 Average bpd: 3.395
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 365 Average train loss: -6973.6638 Average bpd: 3.275
====> Epoch: 366 Average train loss: -6970.9408 Average bpd: 3.274
====> [eval] Epoch: 366 Average bpd: 3.387
====> [test] Epoch: 366 Average bpd: 3.398
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 367 Average train loss: -6970.0972 Average bpd: 3.273
====> Epoch: 368 Average train loss: -6970.5453 Average bpd: 3.274
====> [eval] Epoch: 368 Average bpd: 3.385
====> [test] Epoch: 368 Average bpd: 3.396
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 369 Average train loss: -6969.2313 Average bpd: 3.273
====> Epoch: 370 Average train loss: -6970.1322 Average bpd: 3.273
====> [eval] Epoch: 370 Average bpd: 3.384
====> [test] Epoch: 370 Average bpd: 3.396
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 371 Average train loss: -6969.0312 Average bpd: 3.273
====> Epoch: 372 Average train loss: -6968.4371 Average bpd: 3.273
====> [eval] Epoch: 372 Average bpd: 3.386
====> [test] Epoch: 372 Average bpd: 3.398
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 373 Average train loss: -6963.0693 Average bpd: 3.270
====> Epoch: 374 Average train loss: -6966.7953 Average bpd: 3.272
====> [eval] Epoch: 374 Average bpd: 3.386
====> [test] Epoch: 374 Average bpd: 3.397
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 375 Average train loss: -6964.4924 Average bpd: 3.271
====> Epoch: 376 Average train loss: -6964.8496 Average bpd: 3.271
====> [eval] Epoch: 376 Average bpd: 3.389
====> [test] Epoch: 376 Average bpd: 3.400
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 377 Average train loss: -6964.4258 Average bpd: 3.271
====> Epoch: 378 Average train loss: -6964.6576 Average bpd: 3.271
====> [eval] Epoch: 378 Average bpd: 3.400
====> [test] Epoch: 378 Average bpd: 3.411
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 379 Average train loss: -6963.2976 Average bpd: 3.270
====> Epoch: 380 Average train loss: -6961.5775 Average bpd: 3.269
====> [eval] Epoch: 380 Average bpd: 3.387
====> [test] Epoch: 380 Average bpd: 3.398
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 381 Average train loss: -6960.3546 Average bpd: 3.269
====> Epoch: 382 Average train loss: -6961.9053 Average bpd: 3.270
====> [eval] Epoch: 382 Average bpd: 3.391
====> [test] Epoch: 382 Average bpd: 3.402
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 383 Average train loss: -6962.2928 Average bpd: 3.270
====> Epoch: 384 Average train loss: -6959.0222 Average bpd: 3.268
====> [eval] Epoch: 384 Average bpd: 3.387
====> [test] Epoch: 384 Average bpd: 3.398
Best val_bpd: 3.3835021969383896
Best test_bpd: 3.395017689363109
====> Epoch: 385 Average train loss: -6960.2444 Average bpd: 3.269
====> Epoch: 386 Average train loss: -6960.2618 Average bpd: 3.269
====> [eval] Epoch: 386 Average bpd: 3.381
====> [test] Epoch: 386 Average bpd: 3.392
Best val_bpd: 3.3806640601363456
Best test_bpd: 3.3919478505674374
====> Epoch: 387 Average train loss: -6959.9643 Average bpd: 3.269
====> Epoch: 388 Average train loss: -6956.9199 Average bpd: 3.267
====> [eval] Epoch: 388 Average bpd: 3.385
====> [test] Epoch: 388 Average bpd: 3.396
Best val_bpd: 3.3806640601363456
Best test_bpd: 3.3919478505674374
====> Epoch: 389 Average train loss: -6959.3501 Average bpd: 3.268
====> Epoch: 390 Average train loss: -6955.4084 Average bpd: 3.266
====> [eval] Epoch: 390 Average bpd: 3.378
====> [test] Epoch: 390 Average bpd: 3.390
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 391 Average train loss: -6956.2146 Average bpd: 3.267
====> Epoch: 392 Average train loss: -6956.0626 Average bpd: 3.267
====> [eval] Epoch: 392 Average bpd: 3.380
====> [test] Epoch: 392 Average bpd: 3.392
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 393 Average train loss: -6958.1260 Average bpd: 3.268
====> Epoch: 394 Average train loss: -6953.4304 Average bpd: 3.266
====> [eval] Epoch: 394 Average bpd: 3.383
====> [test] Epoch: 394 Average bpd: 3.395
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 395 Average train loss: -6955.3545 Average bpd: 3.266
====> Epoch: 396 Average train loss: -6953.4385 Average bpd: 3.266
====> [eval] Epoch: 396 Average bpd: 3.381
====> [test] Epoch: 396 Average bpd: 3.393
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 397 Average train loss: -6951.2626 Average bpd: 3.265
====> Epoch: 398 Average train loss: -6953.8743 Average bpd: 3.266
====> [eval] Epoch: 398 Average bpd: 3.380
====> [test] Epoch: 398 Average bpd: 3.391
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 399 Average train loss: -6951.3925 Average bpd: 3.265
====> Epoch: 400 Average train loss: -6952.5402 Average bpd: 3.265
====> [eval] Epoch: 400 Average bpd: 3.389
====> [test] Epoch: 400 Average bpd: 3.401
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 401 Average train loss: -6950.2207 Average bpd: 3.264
====> Epoch: 402 Average train loss: -6951.1311 Average bpd: 3.264
====> [eval] Epoch: 402 Average bpd: 3.379
====> [test] Epoch: 402 Average bpd: 3.390
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 403 Average train loss: -6950.2338 Average bpd: 3.264
====> Epoch: 404 Average train loss: -6948.2153 Average bpd: 3.263
====> [eval] Epoch: 404 Average bpd: 3.381
====> [test] Epoch: 404 Average bpd: 3.392
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 405 Average train loss: -6949.9938 Average bpd: 3.264
====> Epoch: 406 Average train loss: -6950.4112 Average bpd: 3.264
====> [eval] Epoch: 406 Average bpd: 3.381
====> [test] Epoch: 406 Average bpd: 3.393
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 407 Average train loss: -6948.3642 Average bpd: 3.263
====> Epoch: 408 Average train loss: -6946.2537 Average bpd: 3.262
====> [eval] Epoch: 408 Average bpd: 3.383
====> [test] Epoch: 408 Average bpd: 3.394
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 409 Average train loss: -6944.9385 Average bpd: 3.262
====> Epoch: 410 Average train loss: -6947.1232 Average bpd: 3.263
====> [eval] Epoch: 410 Average bpd: 3.385
====> [test] Epoch: 410 Average bpd: 3.396
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 411 Average train loss: -6944.7555 Average bpd: 3.261
====> Epoch: 412 Average train loss: -6944.7772 Average bpd: 3.261
====> [eval] Epoch: 412 Average bpd: 3.383
====> [test] Epoch: 412 Average bpd: 3.394
Best val_bpd: 3.378168301606248
Best test_bpd: 3.389670909119793
====> Epoch: 413 Average train loss: -6944.8641 Average bpd: 3.261
====> Epoch: 414 Average train loss: -6942.6835 Average bpd: 3.260
====> [eval] Epoch: 414 Average bpd: 3.376
====> [test] Epoch: 414 Average bpd: 3.387
Best val_bpd: 3.3755779889014375
Best test_bpd: 3.386947960056575
====> Epoch: 415 Average train loss: -6944.9454 Average bpd: 3.262
====> Epoch: 416 Average train loss: -6942.1344 Average bpd: 3.260
====> [eval] Epoch: 416 Average bpd: 3.378
====> [test] Epoch: 416 Average bpd: 3.389
Best val_bpd: 3.3755779889014375
Best test_bpd: 3.386947960056575
====> Epoch: 417 Average train loss: -6943.7355 Average bpd: 3.261
====> Epoch: 418 Average train loss: -6940.6626 Average bpd: 3.260
====> [eval] Epoch: 418 Average bpd: 3.382
====> [test] Epoch: 418 Average bpd: 3.394
Best val_bpd: 3.3755779889014375
Best test_bpd: 3.386947960056575
====> Epoch: 419 Average train loss: -6941.6847 Average bpd: 3.260
====> Epoch: 420 Average train loss: -6939.9474 Average bpd: 3.259
====> [eval] Epoch: 420 Average bpd: 3.376
====> [test] Epoch: 420 Average bpd: 3.388
Best val_bpd: 3.3755779889014375
Best test_bpd: 3.386947960056575
====> Epoch: 421 Average train loss: -6941.4224 Average bpd: 3.260
====> Epoch: 422 Average train loss: -6939.1651 Average bpd: 3.259
====> [eval] Epoch: 422 Average bpd: 3.373
====> [test] Epoch: 422 Average bpd: 3.385
Best val_bpd: 3.3732706892504223
Best test_bpd: 3.3847977349906864
====> Epoch: 423 Average train loss: -6937.4835 Average bpd: 3.258
====> Epoch: 424 Average train loss: -6939.0669 Average bpd: 3.259
====> [eval] Epoch: 424 Average bpd: 3.376
====> [test] Epoch: 424 Average bpd: 3.388
Best val_bpd: 3.3732706892504223
Best test_bpd: 3.3847977349906864
====> Epoch: 425 Average train loss: -6939.3828 Average bpd: 3.259
====> Epoch: 426 Average train loss: -6938.5399 Average bpd: 3.259
====> [eval] Epoch: 426 Average bpd: 3.378
====> [test] Epoch: 426 Average bpd: 3.389
Best val_bpd: 3.3732706892504223
Best test_bpd: 3.3847977349906864
====> Epoch: 427 Average train loss: -6936.6651 Average bpd: 3.258
====> Epoch: 428 Average train loss: -6937.7415 Average bpd: 3.258
====> [eval] Epoch: 428 Average bpd: 3.372
====> [test] Epoch: 428 Average bpd: 3.383
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 429 Average train loss: -6936.6613 Average bpd: 3.258
====> Epoch: 430 Average train loss: -6935.3337 Average bpd: 3.257
====> [eval] Epoch: 430 Average bpd: 3.380
====> [test] Epoch: 430 Average bpd: 3.391
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 431 Average train loss: -6935.0043 Average bpd: 3.257
====> Epoch: 432 Average train loss: -6934.1883 Average bpd: 3.256
====> [eval] Epoch: 432 Average bpd: 3.377
====> [test] Epoch: 432 Average bpd: 3.388
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 433 Average train loss: -6934.8002 Average bpd: 3.257
====> Epoch: 434 Average train loss: -6933.7395 Average bpd: 3.256
====> [eval] Epoch: 434 Average bpd: 3.374
====> [test] Epoch: 434 Average bpd: 3.385
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 435 Average train loss: -6933.6231 Average bpd: 3.256
====> Epoch: 436 Average train loss: -6931.1084 Average bpd: 3.255
====> [eval] Epoch: 436 Average bpd: 3.375
====> [test] Epoch: 436 Average bpd: 3.386
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 437 Average train loss: -6933.3826 Average bpd: 3.256
====> Epoch: 438 Average train loss: -6930.3524 Average bpd: 3.255
====> [eval] Epoch: 438 Average bpd: 3.373
====> [test] Epoch: 438 Average bpd: 3.384
Best val_bpd: 3.371926412621067
Best test_bpd: 3.383273023162295
====> Epoch: 439 Average train loss: -6929.6443 Average bpd: 3.254
====> Epoch: 440 Average train loss: -6930.7232 Average bpd: 3.255
====> [eval] Epoch: 440 Average bpd: 3.369
====> [test] Epoch: 440 Average bpd: 3.380
Best val_bpd: 3.3685616257561612
Best test_bpd: 3.3798495844081238
====> Epoch: 441 Average train loss: -6927.8531 Average bpd: 3.254
====> Epoch: 442 Average train loss: -6929.5068 Average bpd: 3.254
====> [eval] Epoch: 442 Average bpd: 3.373
====> [test] Epoch: 442 Average bpd: 3.385
Best val_bpd: 3.3685616257561612
Best test_bpd: 3.3798495844081238
====> Epoch: 443 Average train loss: -6928.3032 Average bpd: 3.254
====> Epoch: 444 Average train loss: -6927.9334 Average bpd: 3.254
====> [eval] Epoch: 444 Average bpd: 3.371
====> [test] Epoch: 444 Average bpd: 3.383
Best val_bpd: 3.3685616257561612
Best test_bpd: 3.3798495844081238
====> Epoch: 445 Average train loss: -6929.6102 Average bpd: 3.254
====> Epoch: 446 Average train loss: -6928.4473 Average bpd: 3.254
====> [eval] Epoch: 446 Average bpd: 3.372
====> [test] Epoch: 446 Average bpd: 3.384
Best val_bpd: 3.3685616257561612
Best test_bpd: 3.3798495844081238
====> Epoch: 447 Average train loss: -6927.2627 Average bpd: 3.253
====> Epoch: 448 Average train loss: -6925.3236 Average bpd: 3.252
====> [eval] Epoch: 448 Average bpd: 3.373
====> [test] Epoch: 448 Average bpd: 3.384
Best val_bpd: 3.3685616257561612
Best test_bpd: 3.3798495844081238
====> Epoch: 449 Average train loss: -6924.2048 Average bpd: 3.252
====> Epoch: 450 Average train loss: -6927.0829 Average bpd: 3.253
====> [eval] Epoch: 450 Average bpd: 3.365
====> [test] Epoch: 450 Average bpd: 3.376
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 451 Average train loss: -6925.4009 Average bpd: 3.252
====> Epoch: 452 Average train loss: -6926.1588 Average bpd: 3.253
====> [eval] Epoch: 452 Average bpd: 3.369
====> [test] Epoch: 452 Average bpd: 3.380
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 453 Average train loss: -6923.5717 Average bpd: 3.251
====> Epoch: 454 Average train loss: -6924.9283 Average bpd: 3.252
====> [eval] Epoch: 454 Average bpd: 3.374
====> [test] Epoch: 454 Average bpd: 3.385
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 455 Average train loss: -6925.1661 Average bpd: 3.252
====> Epoch: 456 Average train loss: -6921.4362 Average bpd: 3.250
====> [eval] Epoch: 456 Average bpd: 3.371
====> [test] Epoch: 456 Average bpd: 3.383
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 457 Average train loss: -6922.5156 Average bpd: 3.251
====> Epoch: 458 Average train loss: -6922.0212 Average bpd: 3.251
====> [eval] Epoch: 458 Average bpd: 3.371
====> [test] Epoch: 458 Average bpd: 3.383
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 459 Average train loss: -6921.9275 Average bpd: 3.251
====> Epoch: 460 Average train loss: -6920.6706 Average bpd: 3.250
====> [eval] Epoch: 460 Average bpd: 3.371
====> [test] Epoch: 460 Average bpd: 3.383
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 461 Average train loss: -6920.1952 Average bpd: 3.250
====> Epoch: 462 Average train loss: -6919.9075 Average bpd: 3.250
====> [eval] Epoch: 462 Average bpd: 3.365
====> [test] Epoch: 462 Average bpd: 3.377
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 463 Average train loss: -6919.3139 Average bpd: 3.249
====> Epoch: 464 Average train loss: -6918.6423 Average bpd: 3.249
====> [eval] Epoch: 464 Average bpd: 3.369
====> [test] Epoch: 464 Average bpd: 3.380
Best val_bpd: 3.3649873168542666
Best test_bpd: 3.376357194398594
====> Epoch: 465 Average train loss: -6918.8220 Average bpd: 3.249
====> Epoch: 466 Average train loss: -6917.5718 Average bpd: 3.249
====> [eval] Epoch: 466 Average bpd: 3.363
====> [test] Epoch: 466 Average bpd: 3.374
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 467 Average train loss: -6917.5534 Average bpd: 3.249
====> Epoch: 468 Average train loss: -6916.8026 Average bpd: 3.248
====> [eval] Epoch: 468 Average bpd: 3.369
====> [test] Epoch: 468 Average bpd: 3.380
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 469 Average train loss: -6917.1943 Average bpd: 3.249
====> Epoch: 470 Average train loss: -6916.2114 Average bpd: 3.248
====> [eval] Epoch: 470 Average bpd: 3.366
====> [test] Epoch: 470 Average bpd: 3.378
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 471 Average train loss: -6916.6451 Average bpd: 3.248
====> Epoch: 472 Average train loss: -6915.2434 Average bpd: 3.248
====> [eval] Epoch: 472 Average bpd: 3.366
====> [test] Epoch: 472 Average bpd: 3.377
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 473 Average train loss: -6914.2298 Average bpd: 3.247
====> Epoch: 474 Average train loss: -6914.8518 Average bpd: 3.247
====> [eval] Epoch: 474 Average bpd: 3.365
====> [test] Epoch: 474 Average bpd: 3.377
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 475 Average train loss: -6912.2853 Average bpd: 3.246
====> Epoch: 476 Average train loss: -6914.7870 Average bpd: 3.247
====> [eval] Epoch: 476 Average bpd: 3.364
====> [test] Epoch: 476 Average bpd: 3.375
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 477 Average train loss: -6913.5074 Average bpd: 3.247
====> Epoch: 478 Average train loss: -6912.4216 Average bpd: 3.246
====> [eval] Epoch: 478 Average bpd: 3.372
====> [test] Epoch: 478 Average bpd: 3.384
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 479 Average train loss: -6911.9653 Average bpd: 3.246
====> Epoch: 480 Average train loss: -6910.7791 Average bpd: 3.245
====> [eval] Epoch: 480 Average bpd: 3.364
====> [test] Epoch: 480 Average bpd: 3.375
Best val_bpd: 3.362668267958102
Best test_bpd: 3.374087429779713
====> Epoch: 481 Average train loss: -6912.4737 Average bpd: 3.246
====> Epoch: 482 Average train loss: -6909.8985 Average bpd: 3.245
====> [eval] Epoch: 482 Average bpd: 3.359
====> [test] Epoch: 482 Average bpd: 3.370
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 483 Average train loss: -6910.4712 Average bpd: 3.245
====> Epoch: 484 Average train loss: -6909.0774 Average bpd: 3.245
====> [eval] Epoch: 484 Average bpd: 3.368
====> [test] Epoch: 484 Average bpd: 3.380
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 485 Average train loss: -6909.7156 Average bpd: 3.245
====> Epoch: 486 Average train loss: -6910.0840 Average bpd: 3.245
====> [eval] Epoch: 486 Average bpd: 3.371
====> [test] Epoch: 486 Average bpd: 3.383
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 487 Average train loss: -6908.5519 Average bpd: 3.244
====> Epoch: 488 Average train loss: -6908.0128 Average bpd: 3.244
====> [eval] Epoch: 488 Average bpd: 3.361
====> [test] Epoch: 488 Average bpd: 3.372
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 489 Average train loss: -6905.5405 Average bpd: 3.243
====> Epoch: 490 Average train loss: -6907.3755 Average bpd: 3.244
====> [eval] Epoch: 490 Average bpd: 3.359
====> [test] Epoch: 490 Average bpd: 3.371
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 491 Average train loss: -6907.2760 Average bpd: 3.244
====> Epoch: 492 Average train loss: -6905.8545 Average bpd: 3.243
====> [eval] Epoch: 492 Average bpd: 3.362
====> [test] Epoch: 492 Average bpd: 3.374
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 493 Average train loss: -6906.6222 Average bpd: 3.244
====> Epoch: 494 Average train loss: -6905.0751 Average bpd: 3.243
====> [eval] Epoch: 494 Average bpd: 3.361
====> [test] Epoch: 494 Average bpd: 3.372
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 495 Average train loss: -6903.0415 Average bpd: 3.242
====> Epoch: 496 Average train loss: -6905.4595 Average bpd: 3.243
====> [eval] Epoch: 496 Average bpd: 3.361
====> [test] Epoch: 496 Average bpd: 3.372
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 497 Average train loss: -6903.8896 Average bpd: 3.242
====> Epoch: 498 Average train loss: -6903.9899 Average bpd: 3.242
====> [eval] Epoch: 498 Average bpd: 3.360
====> [test] Epoch: 498 Average bpd: 3.371
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 499 Average train loss: -6903.3592 Average bpd: 3.242
====> Epoch: 500 Average train loss: -6902.0253 Average bpd: 3.241
====> [eval] Epoch: 500 Average bpd: 3.363
====> [test] Epoch: 500 Average bpd: 3.374
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 501 Average train loss: -6902.9317 Average bpd: 3.242
====> Epoch: 502 Average train loss: -6902.2510 Average bpd: 3.241
====> [eval] Epoch: 502 Average bpd: 3.363
====> [test] Epoch: 502 Average bpd: 3.375
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 503 Average train loss: -6901.1840 Average bpd: 3.241
====> Epoch: 504 Average train loss: -6901.7312 Average bpd: 3.241
====> [eval] Epoch: 504 Average bpd: 3.359
====> [test] Epoch: 504 Average bpd: 3.371
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 505 Average train loss: -6902.7580 Average bpd: 3.242
====> Epoch: 506 Average train loss: -6901.3834 Average bpd: 3.241
====> [eval] Epoch: 506 Average bpd: 3.359
====> [test] Epoch: 506 Average bpd: 3.370
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 507 Average train loss: -6900.0734 Average bpd: 3.240
====> Epoch: 508 Average train loss: -6899.7335 Average bpd: 3.240
====> [eval] Epoch: 508 Average bpd: 3.360
====> [test] Epoch: 508 Average bpd: 3.372
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 509 Average train loss: -6897.9293 Average bpd: 3.239
====> Epoch: 510 Average train loss: -6898.6152 Average bpd: 3.240
====> [eval] Epoch: 510 Average bpd: 3.361
====> [test] Epoch: 510 Average bpd: 3.372
Best val_bpd: 3.3585908894838084
Best test_bpd: 3.3700747320211133
====> Epoch: 511 Average train loss: -6897.3442 Average bpd: 3.239
====> Epoch: 512 Average train loss: -6897.8782 Average bpd: 3.239
====> [eval] Epoch: 512 Average bpd: 3.358
====> [test] Epoch: 512 Average bpd: 3.370
Best val_bpd: 3.358032342807365
Best test_bpd: 3.369736109792207
====> Epoch: 513 Average train loss: -6896.6955 Average bpd: 3.239
====> Epoch: 514 Average train loss: -6896.9545 Average bpd: 3.239
====> [eval] Epoch: 514 Average bpd: 3.360
====> [test] Epoch: 514 Average bpd: 3.372
Best val_bpd: 3.358032342807365
Best test_bpd: 3.369736109792207
====> Epoch: 515 Average train loss: -6896.7332 Average bpd: 3.239
====> Epoch: 516 Average train loss: -6895.5653 Average bpd: 3.238
====> [eval] Epoch: 516 Average bpd: 3.357
====> [test] Epoch: 516 Average bpd: 3.368
Best val_bpd: 3.356775733064373
Best test_bpd: 3.3681538643246123
====> Epoch: 517 Average train loss: -6895.1304 Average bpd: 3.238
====> Epoch: 518 Average train loss: -6896.2693 Average bpd: 3.239
====> [eval] Epoch: 518 Average bpd: 3.362
====> [test] Epoch: 518 Average bpd: 3.374
Best val_bpd: 3.356775733064373
Best test_bpd: 3.3681538643246123
====> Epoch: 519 Average train loss: -6895.5736 Average bpd: 3.238
====> Epoch: 520 Average train loss: -6892.7914 Average bpd: 3.237
====> [eval] Epoch: 520 Average bpd: 3.361
====> [test] Epoch: 520 Average bpd: 3.373
Best val_bpd: 3.356775733064373
Best test_bpd: 3.3681538643246123
====> Epoch: 521 Average train loss: -6893.0589 Average bpd: 3.237
====> Epoch: 522 Average train loss: -6892.9354 Average bpd: 3.237
====> [eval] Epoch: 522 Average bpd: 3.356
====> [test] Epoch: 522 Average bpd: 3.367
Best val_bpd: 3.3555401313287336
Best test_bpd: 3.3670253903644958
====> Epoch: 523 Average train loss: -6892.4160 Average bpd: 3.237
====> Epoch: 524 Average train loss: -6891.8393 Average bpd: 3.237
====> [eval] Epoch: 524 Average bpd: 3.360
====> [test] Epoch: 524 Average bpd: 3.371
Best val_bpd: 3.3555401313287336
Best test_bpd: 3.3670253903644958
====> Epoch: 525 Average train loss: -6892.1765 Average bpd: 3.237
====> Epoch: 526 Average train loss: -6890.7829 Average bpd: 3.236
====> [eval] Epoch: 526 Average bpd: 3.355
====> [test] Epoch: 526 Average bpd: 3.367
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 527 Average train loss: -6891.2500 Average bpd: 3.236
====> Epoch: 528 Average train loss: -6891.0247 Average bpd: 3.236
====> [eval] Epoch: 528 Average bpd: 3.359
====> [test] Epoch: 528 Average bpd: 3.371
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 529 Average train loss: -6888.9774 Average bpd: 3.235
====> Epoch: 530 Average train loss: -6891.9400 Average bpd: 3.237
====> [eval] Epoch: 530 Average bpd: 3.364
====> [test] Epoch: 530 Average bpd: 3.376
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 531 Average train loss: -6889.7713 Average bpd: 3.236
====> Epoch: 532 Average train loss: -6889.3395 Average bpd: 3.235
====> [eval] Epoch: 532 Average bpd: 3.356
====> [test] Epoch: 532 Average bpd: 3.368
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 533 Average train loss: -6888.6030 Average bpd: 3.235
====> Epoch: 534 Average train loss: -6888.0100 Average bpd: 3.235
====> [eval] Epoch: 534 Average bpd: 3.356
====> [test] Epoch: 534 Average bpd: 3.368
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 535 Average train loss: -6887.7035 Average bpd: 3.235
====> Epoch: 536 Average train loss: -6888.0499 Average bpd: 3.235
====> [eval] Epoch: 536 Average bpd: 3.357
====> [test] Epoch: 536 Average bpd: 3.369
Best val_bpd: 3.35534438967377
Best test_bpd: 3.3673064259806402
====> Epoch: 537 Average train loss: -6889.0316 Average bpd: 3.235
====> Epoch: 538 Average train loss: -6886.5684 Average bpd: 3.234
====> [eval] Epoch: 538 Average bpd: 3.353
====> [test] Epoch: 538 Average bpd: 3.365
Best val_bpd: 3.353353148418134
Best test_bpd: 3.3648048410847053
====> Epoch: 539 Average train loss: -6886.9791 Average bpd: 3.234
====> Epoch: 540 Average train loss: -6885.0909 Average bpd: 3.233
====> [eval] Epoch: 540 Average bpd: 3.353
====> [test] Epoch: 540 Average bpd: 3.365
Best val_bpd: 3.353244935773147
Best test_bpd: 3.3651517329941556
====> Epoch: 541 Average train loss: -6886.7783 Average bpd: 3.234
====> Epoch: 542 Average train loss: -6886.2476 Average bpd: 3.234
====> [eval] Epoch: 542 Average bpd: 3.353
====> [test] Epoch: 542 Average bpd: 3.365
Best val_bpd: 3.353244935773147
Best test_bpd: 3.3651517329941556
====> Epoch: 543 Average train loss: -6886.0316 Average bpd: 3.234
====> Epoch: 544 Average train loss: -6883.7225 Average bpd: 3.233
====> [eval] Epoch: 544 Average bpd: 3.358
====> [test] Epoch: 544 Average bpd: 3.369
Best val_bpd: 3.353244935773147
Best test_bpd: 3.3651517329941556
====> Epoch: 545 Average train loss: -6884.2292 Average bpd: 3.233
====> Epoch: 546 Average train loss: -6882.6031 Average bpd: 3.232
====> [eval] Epoch: 546 Average bpd: 3.355
====> [test] Epoch: 546 Average bpd: 3.367
Best val_bpd: 3.353244935773147
Best test_bpd: 3.3651517329941556
====> Epoch: 547 Average train loss: -6885.0429 Average bpd: 3.233
====> Epoch: 548 Average train loss: -6882.8748 Average bpd: 3.232
====> [eval] Epoch: 548 Average bpd: 3.353
====> [test] Epoch: 548 Average bpd: 3.365
Best val_bpd: 3.353201943738758
Best test_bpd: 3.365000768509957
====> Epoch: 549 Average train loss: -6881.7735 Average bpd: 3.232
====> Epoch: 550 Average train loss: -6882.1002 Average bpd: 3.232
====> [eval] Epoch: 550 Average bpd: 3.354
====> [test] Epoch: 550 Average bpd: 3.365
Best val_bpd: 3.353201943738758
Best test_bpd: 3.365000768509957
====> Epoch: 551 Average train loss: -6881.7995 Average bpd: 3.232
====> Epoch: 552 Average train loss: -6882.3069 Average bpd: 3.232
====> [eval] Epoch: 552 Average bpd: 3.352
====> [test] Epoch: 552 Average bpd: 3.363
Best val_bpd: 3.3516442054360778
Best test_bpd: 3.3631551682586593
====> Epoch: 553 Average train loss: -6881.9471 Average bpd: 3.232
====> Epoch: 554 Average train loss: -6881.1722 Average bpd: 3.232
====> [eval] Epoch: 554 Average bpd: 3.351
====> [test] Epoch: 554 Average bpd: 3.363
Best val_bpd: 3.351238340893345
Best test_bpd: 3.3629735124907674
====> Epoch: 555 Average train loss: -6879.2384 Average bpd: 3.231
====> Epoch: 556 Average train loss: -6880.3892 Average bpd: 3.231
====> [eval] Epoch: 556 Average bpd: 3.350
====> [test] Epoch: 556 Average bpd: 3.362
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 557 Average train loss: -6880.6374 Average bpd: 3.231
====> Epoch: 558 Average train loss: -6879.5485 Average bpd: 3.231
====> [eval] Epoch: 558 Average bpd: 3.354
====> [test] Epoch: 558 Average bpd: 3.366
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 559 Average train loss: -6878.7986 Average bpd: 3.230
====> Epoch: 560 Average train loss: -6877.0132 Average bpd: 3.230
====> [eval] Epoch: 560 Average bpd: 3.353
====> [test] Epoch: 560 Average bpd: 3.365
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 561 Average train loss: -6878.4504 Average bpd: 3.230
====> Epoch: 562 Average train loss: -6878.0684 Average bpd: 3.230
====> [eval] Epoch: 562 Average bpd: 3.354
====> [test] Epoch: 562 Average bpd: 3.365
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 563 Average train loss: -6876.9977 Average bpd: 3.230
====> Epoch: 564 Average train loss: -6878.0229 Average bpd: 3.230
====> [eval] Epoch: 564 Average bpd: 3.357
====> [test] Epoch: 564 Average bpd: 3.368
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 565 Average train loss: -6874.7007 Average bpd: 3.229
====> Epoch: 566 Average train loss: -6876.0702 Average bpd: 3.229
====> [eval] Epoch: 566 Average bpd: 3.351
====> [test] Epoch: 566 Average bpd: 3.363
Best val_bpd: 3.3502398705783123
Best test_bpd: 3.3616990883929407
====> Epoch: 567 Average train loss: -6876.2867 Average bpd: 3.229
====> Epoch: 568 Average train loss: -6873.4102 Average bpd: 3.228
====> [eval] Epoch: 568 Average bpd: 3.350
====> [test] Epoch: 568 Average bpd: 3.361
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 569 Average train loss: -6874.1211 Average bpd: 3.228
====> Epoch: 570 Average train loss: -6874.0919 Average bpd: 3.228
====> [eval] Epoch: 570 Average bpd: 3.354
====> [test] Epoch: 570 Average bpd: 3.365
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 571 Average train loss: -6875.9718 Average bpd: 3.229
====> Epoch: 572 Average train loss: -6874.2962 Average bpd: 3.228
====> [eval] Epoch: 572 Average bpd: 3.351
====> [test] Epoch: 572 Average bpd: 3.363
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 573 Average train loss: -6874.1816 Average bpd: 3.228
====> Epoch: 574 Average train loss: -6871.8302 Average bpd: 3.227
====> [eval] Epoch: 574 Average bpd: 3.350
====> [test] Epoch: 574 Average bpd: 3.361
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 575 Average train loss: -6872.7678 Average bpd: 3.228
====> Epoch: 576 Average train loss: -6872.3789 Average bpd: 3.227
====> [eval] Epoch: 576 Average bpd: 3.352
====> [test] Epoch: 576 Average bpd: 3.364
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 577 Average train loss: -6873.1448 Average bpd: 3.228
====> Epoch: 578 Average train loss: -6872.4716 Average bpd: 3.228
====> [eval] Epoch: 578 Average bpd: 3.350
====> [test] Epoch: 578 Average bpd: 3.361
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 579 Average train loss: -6873.6246 Average bpd: 3.228
====> Epoch: 580 Average train loss: -6872.6766 Average bpd: 3.228
====> [eval] Epoch: 580 Average bpd: 3.352
====> [test] Epoch: 580 Average bpd: 3.363
Best val_bpd: 3.3497089942413636
Best test_bpd: 3.361220696516776
====> Epoch: 581 Average train loss: -6870.0798 Average bpd: 3.226
====> Epoch: 582 Average train loss: -6870.8040 Average bpd: 3.227
====> [eval] Epoch: 582 Average bpd: 3.349
====> [test] Epoch: 582 Average bpd: 3.360
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 583 Average train loss: -6870.4710 Average bpd: 3.227
====> Epoch: 584 Average train loss: -6868.3520 Average bpd: 3.226
====> [eval] Epoch: 584 Average bpd: 3.349
====> [test] Epoch: 584 Average bpd: 3.360
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 585 Average train loss: -6869.0985 Average bpd: 3.226
====> Epoch: 586 Average train loss: -6870.1210 Average bpd: 3.226
====> [eval] Epoch: 586 Average bpd: 3.350
====> [test] Epoch: 586 Average bpd: 3.362
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 587 Average train loss: -6869.5344 Average bpd: 3.226
====> Epoch: 588 Average train loss: -6867.9473 Average bpd: 3.225
====> [eval] Epoch: 588 Average bpd: 3.350
====> [test] Epoch: 588 Average bpd: 3.362
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 589 Average train loss: -6868.7075 Average bpd: 3.226
====> Epoch: 590 Average train loss: -6866.7813 Average bpd: 3.225
====> [eval] Epoch: 590 Average bpd: 3.349
====> [test] Epoch: 590 Average bpd: 3.361
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 591 Average train loss: -6866.9984 Average bpd: 3.225
====> Epoch: 592 Average train loss: -6867.6424 Average bpd: 3.225
====> [eval] Epoch: 592 Average bpd: 3.353
====> [test] Epoch: 592 Average bpd: 3.365
Best val_bpd: 3.3485161131413306
Best test_bpd: 3.3600308196706457
====> Epoch: 593 Average train loss: -6865.9021 Average bpd: 3.224
====> Epoch: 594 Average train loss: -6865.9970 Average bpd: 3.224
====> [eval] Epoch: 594 Average bpd: 3.348
====> [test] Epoch: 594 Average bpd: 3.360
Best val_bpd: 3.348348452552361
Best test_bpd: 3.360113253785299
====> Epoch: 595 Average train loss: -6866.9183 Average bpd: 3.225
====> Epoch: 596 Average train loss: -6866.1368 Average bpd: 3.225
====> [eval] Epoch: 596 Average bpd: 3.348
====> [test] Epoch: 596 Average bpd: 3.359
Best val_bpd: 3.3475952309487877
Best test_bpd: 3.3592578265932116
====> Epoch: 597 Average train loss: -6866.3685 Average bpd: 3.225
====> Epoch: 598 Average train loss: -6863.5625 Average bpd: 3.223
====> [eval] Epoch: 598 Average bpd: 3.352
====> [test] Epoch: 598 Average bpd: 3.363
Best val_bpd: 3.3475952309487877
Best test_bpd: 3.3592578265932116
====> Epoch: 599 Average train loss: -6864.1144 Average bpd: 3.224
====> Epoch: 600 Average train loss: -6865.4886 Average bpd: 3.224
====> [eval] Epoch: 600 Average bpd: 3.346
====> [test] Epoch: 600 Average bpd: 3.358
Best val_bpd: 3.345912020780159
Best test_bpd: 3.357716232889758
====> Epoch: 601 Average train loss: -6864.2421 Average bpd: 3.224
====> Epoch: 602 Average train loss: -6862.6531 Average bpd: 3.223
====> [eval] Epoch: 602 Average bpd: 3.350
====> [test] Epoch: 602 Average bpd: 3.361
Best val_bpd: 3.345912020780159
Best test_bpd: 3.357716232889758
====> Epoch: 603 Average train loss: -6863.8468 Average bpd: 3.223
====> Epoch: 604 Average train loss: -6862.7218 Average bpd: 3.223
====> [eval] Epoch: 604 Average bpd: 3.348
====> [test] Epoch: 604 Average bpd: 3.359
Best val_bpd: 3.345912020780159
Best test_bpd: 3.357716232889758
====> Epoch: 605 Average train loss: -6862.4581 Average bpd: 3.223
====> Epoch: 606 Average train loss: -6861.2360 Average bpd: 3.222
====> [eval] Epoch: 606 Average bpd: 3.345
====> [test] Epoch: 606 Average bpd: 3.357
Best val_bpd: 3.3452404894844165
Best test_bpd: 3.3567870396537547
====> Epoch: 607 Average train loss: -6861.0376 Average bpd: 3.222
====> Epoch: 608 Average train loss: -6861.7902 Average bpd: 3.222
====> [eval] Epoch: 608 Average bpd: 3.344
====> [test] Epoch: 608 Average bpd: 3.356
Best val_bpd: 3.343886750906604
Best test_bpd: 3.355689024764469
====> Epoch: 609 Average train loss: -6861.9499 Average bpd: 3.223
====> Epoch: 610 Average train loss: -6860.8292 Average bpd: 3.222
====> [eval] Epoch: 610 Average bpd: 3.342
====> [test] Epoch: 610 Average bpd: 3.353
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 611 Average train loss: -6860.8333 Average bpd: 3.222
====> Epoch: 612 Average train loss: -6858.4212 Average bpd: 3.221
====> [eval] Epoch: 612 Average bpd: 3.348
====> [test] Epoch: 612 Average bpd: 3.360
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 613 Average train loss: -6860.2587 Average bpd: 3.222
====> Epoch: 614 Average train loss: -6860.7046 Average bpd: 3.222
====> [eval] Epoch: 614 Average bpd: 3.345
====> [test] Epoch: 614 Average bpd: 3.356
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 615 Average train loss: -6859.6744 Average bpd: 3.221
====> Epoch: 616 Average train loss: -6859.1896 Average bpd: 3.221
====> [eval] Epoch: 616 Average bpd: 3.342
====> [test] Epoch: 616 Average bpd: 3.354
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 617 Average train loss: -6859.7190 Average bpd: 3.222
====> Epoch: 618 Average train loss: -6858.2406 Average bpd: 3.221
====> [eval] Epoch: 618 Average bpd: 3.343
====> [test] Epoch: 618 Average bpd: 3.355
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 619 Average train loss: -6856.9123 Average bpd: 3.220
====> Epoch: 620 Average train loss: -6857.5973 Average bpd: 3.221
====> [eval] Epoch: 620 Average bpd: 3.345
====> [test] Epoch: 620 Average bpd: 3.357
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 621 Average train loss: -6856.9058 Average bpd: 3.220
====> Epoch: 622 Average train loss: -6855.3448 Average bpd: 3.219
====> [eval] Epoch: 622 Average bpd: 3.348
====> [test] Epoch: 622 Average bpd: 3.360
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 623 Average train loss: -6857.0733 Average bpd: 3.220
====> Epoch: 624 Average train loss: -6858.6771 Average bpd: 3.221
====> [eval] Epoch: 624 Average bpd: 3.343
====> [test] Epoch: 624 Average bpd: 3.354
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 625 Average train loss: -6855.0567 Average bpd: 3.219
====> Epoch: 626 Average train loss: -6854.3357 Average bpd: 3.219
====> [eval] Epoch: 626 Average bpd: 3.345
====> [test] Epoch: 626 Average bpd: 3.357
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 627 Average train loss: -6854.4210 Average bpd: 3.219
====> Epoch: 628 Average train loss: -6855.2871 Average bpd: 3.219
====> [eval] Epoch: 628 Average bpd: 3.344
====> [test] Epoch: 628 Average bpd: 3.356
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 629 Average train loss: -6855.0007 Average bpd: 3.219
====> Epoch: 630 Average train loss: -6854.0298 Average bpd: 3.219
====> [eval] Epoch: 630 Average bpd: 3.344
====> [test] Epoch: 630 Average bpd: 3.356
Best val_bpd: 3.34180424419091
Best test_bpd: 3.3534654951777285
====> Epoch: 631 Average train loss: -6854.8817 Average bpd: 3.219
====> Epoch: 632 Average train loss: -6852.5826 Average bpd: 3.218
====> [eval] Epoch: 632 Average bpd: 3.341
====> [test] Epoch: 632 Average bpd: 3.352
Best val_bpd: 3.3408137866709433
Best test_bpd: 3.3524025212093096
====> Epoch: 633 Average train loss: -6853.3945 Average bpd: 3.219
====> Epoch: 634 Average train loss: -6852.1037 Average bpd: 3.218
====> [eval] Epoch: 634 Average bpd: 3.342
====> [test] Epoch: 634 Average bpd: 3.354
Best val_bpd: 3.3408137866709433
Best test_bpd: 3.3524025212093096
====> Epoch: 635 Average train loss: -6852.3599 Average bpd: 3.218
====> Epoch: 636 Average train loss: -6851.9086 Average bpd: 3.218
====> [eval] Epoch: 636 Average bpd: 3.343
====> [test] Epoch: 636 Average bpd: 3.354
Best val_bpd: 3.3408137866709433
Best test_bpd: 3.3524025212093096
====> Epoch: 637 Average train loss: -6852.2206 Average bpd: 3.218
====> Epoch: 638 Average train loss: -6852.6479 Average bpd: 3.218
====> [eval] Epoch: 638 Average bpd: 3.340
====> [test] Epoch: 638 Average bpd: 3.352
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 639 Average train loss: -6849.7600 Average bpd: 3.217
====> Epoch: 640 Average train loss: -6852.1865 Average bpd: 3.218
====> [eval] Epoch: 640 Average bpd: 3.342
====> [test] Epoch: 640 Average bpd: 3.353
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 641 Average train loss: -6850.6729 Average bpd: 3.217
====> Epoch: 642 Average train loss: -6848.8869 Average bpd: 3.216
====> [eval] Epoch: 642 Average bpd: 3.343
====> [test] Epoch: 642 Average bpd: 3.354
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 643 Average train loss: -6850.8261 Average bpd: 3.217
====> Epoch: 644 Average train loss: -6850.2111 Average bpd: 3.217
====> [eval] Epoch: 644 Average bpd: 3.347
====> [test] Epoch: 644 Average bpd: 3.358
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 645 Average train loss: -6849.0264 Average bpd: 3.216
====> Epoch: 646 Average train loss: -6849.7594 Average bpd: 3.217
====> [eval] Epoch: 646 Average bpd: 3.348
====> [test] Epoch: 646 Average bpd: 3.360
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 647 Average train loss: -6848.2771 Average bpd: 3.216
====> Epoch: 648 Average train loss: -6847.0484 Average bpd: 3.216
====> [eval] Epoch: 648 Average bpd: 3.341
====> [test] Epoch: 648 Average bpd: 3.352
Best val_bpd: 3.340163614604506
Best test_bpd: 3.351533075617013
====> Epoch: 649 Average train loss: -6847.8856 Average bpd: 3.216
====> Epoch: 650 Average train loss: -6848.7603 Average bpd: 3.216
====> [eval] Epoch: 650 Average bpd: 3.340
====> [test] Epoch: 650 Average bpd: 3.352
Best val_bpd: 3.340010737992523
Best test_bpd: 3.3518428722474374
====> Epoch: 651 Average train loss: -6847.4120 Average bpd: 3.216
====> Epoch: 652 Average train loss: -6847.5444 Average bpd: 3.216
====> [eval] Epoch: 652 Average bpd: 3.340
====> [test] Epoch: 652 Average bpd: 3.352
Best val_bpd: 3.3397380446234433
Best test_bpd: 3.3516018918986443
====> Epoch: 653 Average train loss: -6847.8989 Average bpd: 3.216
====> Epoch: 654 Average train loss: -6846.5206 Average bpd: 3.215
====> [eval] Epoch: 654 Average bpd: 3.343
====> [test] Epoch: 654 Average bpd: 3.355
Best val_bpd: 3.3397380446234433
Best test_bpd: 3.3516018918986443
====> Epoch: 655 Average train loss: -6845.4409 Average bpd: 3.215
====> Epoch: 656 Average train loss: -6846.8736 Average bpd: 3.215
====> [eval] Epoch: 656 Average bpd: 3.339
====> [test] Epoch: 656 Average bpd: 3.351
Best val_bpd: 3.339466429592842
Best test_bpd: 3.351251524778582
====> Epoch: 657 Average train loss: -6845.4977 Average bpd: 3.215
====> Epoch: 658 Average train loss: -6846.9398 Average bpd: 3.216
====> [eval] Epoch: 658 Average bpd: 3.342
====> [test] Epoch: 658 Average bpd: 3.354
Best val_bpd: 3.339466429592842
Best test_bpd: 3.351251524778582
====> Epoch: 659 Average train loss: -6845.3045 Average bpd: 3.215
====> Epoch: 660 Average train loss: -6844.3713 Average bpd: 3.214
====> [eval] Epoch: 660 Average bpd: 3.340
====> [test] Epoch: 660 Average bpd: 3.351
Best val_bpd: 3.339466429592842
Best test_bpd: 3.351251524778582
====> Epoch: 661 Average train loss: -6845.4516 Average bpd: 3.215
====> Epoch: 662 Average train loss: -6844.8001 Average bpd: 3.215
====> [eval] Epoch: 662 Average bpd: 3.340
====> [test] Epoch: 662 Average bpd: 3.351
Best val_bpd: 3.339466429592842
Best test_bpd: 3.351251524778582
====> Epoch: 663 Average train loss: -6844.8861 Average bpd: 3.215
====> Epoch: 664 Average train loss: -6844.3836 Average bpd: 3.214
====> [eval] Epoch: 664 Average bpd: 3.337
====> [test] Epoch: 664 Average bpd: 3.349
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 665 Average train loss: -6843.3056 Average bpd: 3.214
====> Epoch: 666 Average train loss: -6843.4592 Average bpd: 3.214
====> [eval] Epoch: 666 Average bpd: 3.341
====> [test] Epoch: 666 Average bpd: 3.352
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 667 Average train loss: -6843.4123 Average bpd: 3.214
====> Epoch: 668 Average train loss: -6843.9160 Average bpd: 3.214
====> [eval] Epoch: 668 Average bpd: 3.340
====> [test] Epoch: 668 Average bpd: 3.352
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 669 Average train loss: -6843.0143 Average bpd: 3.214
====> Epoch: 670 Average train loss: -6842.0057 Average bpd: 3.213
====> [eval] Epoch: 670 Average bpd: 3.339
====> [test] Epoch: 670 Average bpd: 3.351
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 671 Average train loss: -6840.8564 Average bpd: 3.213
====> Epoch: 672 Average train loss: -6842.3933 Average bpd: 3.213
====> [eval] Epoch: 672 Average bpd: 3.338
====> [test] Epoch: 672 Average bpd: 3.350
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 673 Average train loss: -6840.8952 Average bpd: 3.213
====> Epoch: 674 Average train loss: -6840.6496 Average bpd: 3.213
====> [eval] Epoch: 674 Average bpd: 3.339
====> [test] Epoch: 674 Average bpd: 3.350
Best val_bpd: 3.3374428657581556
Best test_bpd: 3.349029754204269
====> Epoch: 675 Average train loss: -6840.1544 Average bpd: 3.212
====> Epoch: 676 Average train loss: -6840.0363 Average bpd: 3.212
====> [eval] Epoch: 676 Average bpd: 3.337
====> [test] Epoch: 676 Average bpd: 3.349
Best val_bpd: 3.337128937915785
Best test_bpd: 3.3489890175373
====> Epoch: 677 Average train loss: -6840.2719 Average bpd: 3.212
====> Epoch: 678 Average train loss: -6839.5684 Average bpd: 3.212
====> [eval] Epoch: 678 Average bpd: 3.336
====> [test] Epoch: 678 Average bpd: 3.348
Best val_bpd: 3.336403787944559
Best test_bpd: 3.348162585023686
====> Epoch: 679 Average train loss: -6840.5017 Average bpd: 3.212
====> Epoch: 680 Average train loss: -6837.5036 Average bpd: 3.211
====> [eval] Epoch: 680 Average bpd: 3.336
====> [test] Epoch: 680 Average bpd: 3.348
Best val_bpd: 3.3358978425609975
Best test_bpd: 3.3477865380637555
====> Epoch: 681 Average train loss: -6838.4195 Average bpd: 3.212
====> Epoch: 682 Average train loss: -6837.7694 Average bpd: 3.211
====> [eval] Epoch: 682 Average bpd: 3.340
====> [test] Epoch: 682 Average bpd: 3.352
Best val_bpd: 3.3358978425609975
Best test_bpd: 3.3477865380637555
====> Epoch: 683 Average train loss: -6838.9992 Average bpd: 3.212
====> Epoch: 684 Average train loss: -6837.3712 Average bpd: 3.211
====> [eval] Epoch: 684 Average bpd: 3.340
====> [test] Epoch: 684 Average bpd: 3.351
Best val_bpd: 3.3358978425609975
Best test_bpd: 3.3477865380637555
====> Epoch: 685 Average train loss: -6836.8295 Average bpd: 3.211
====> Epoch: 686 Average train loss: -6836.5703 Average bpd: 3.211
====> [eval] Epoch: 686 Average bpd: 3.338
====> [test] Epoch: 686 Average bpd: 3.349
Best val_bpd: 3.3358978425609975
Best test_bpd: 3.3477865380637555
====> Epoch: 687 Average train loss: -6837.0235 Average bpd: 3.211
====> Epoch: 688 Average train loss: -6836.7765 Average bpd: 3.211
====> [eval] Epoch: 688 Average bpd: 3.336
====> [test] Epoch: 688 Average bpd: 3.347
Best val_bpd: 3.3355210300242875
Best test_bpd: 3.347287862394102
====> Epoch: 689 Average train loss: -6834.1260 Average bpd: 3.209
====> Epoch: 690 Average train loss: -6837.4569 Average bpd: 3.211
====> [eval] Epoch: 690 Average bpd: 3.339
====> [test] Epoch: 690 Average bpd: 3.351
Best val_bpd: 3.3355210300242875
Best test_bpd: 3.347287862394102
====> Epoch: 691 Average train loss: -6837.1952 Average bpd: 3.211
====> Epoch: 692 Average train loss: -6835.7285 Average bpd: 3.210
====> [eval] Epoch: 692 Average bpd: 3.337
====> [test] Epoch: 692 Average bpd: 3.349
Best val_bpd: 3.3355210300242875
Best test_bpd: 3.347287862394102
====> Epoch: 693 Average train loss: -6835.6616 Average bpd: 3.210
====> Epoch: 694 Average train loss: -6834.4526 Average bpd: 3.210
====> [eval] Epoch: 694 Average bpd: 3.337
====> [test] Epoch: 694 Average bpd: 3.349
Best val_bpd: 3.3355210300242875
Best test_bpd: 3.347287862394102
====> Epoch: 695 Average train loss: -6835.2775 Average bpd: 3.210
====> Epoch: 696 Average train loss: -6833.4849 Average bpd: 3.209
====> [eval] Epoch: 696 Average bpd: 3.336
====> [test] Epoch: 696 Average bpd: 3.348
Best val_bpd: 3.3355210300242875
Best test_bpd: 3.347287862394102
====> Epoch: 697 Average train loss: -6835.9019 Average bpd: 3.210
====> Epoch: 698 Average train loss: -6833.8721 Average bpd: 3.209
====> [eval] Epoch: 698 Average bpd: 3.335
====> [test] Epoch: 698 Average bpd: 3.347
Best val_bpd: 3.3350562204235508
Best test_bpd: 3.346935221764988
====> Epoch: 699 Average train loss: -6833.9658 Average bpd: 3.209
====> Epoch: 700 Average train loss: -6833.3822 Average bpd: 3.209
====> [eval] Epoch: 700 Average bpd: 3.337
====> [test] Epoch: 700 Average bpd: 3.348
Best val_bpd: 3.3350562204235508
Best test_bpd: 3.346935221764988
====> Epoch: 701 Average train loss: -6833.2343 Average bpd: 3.209
====> Epoch: 702 Average train loss: -6835.3173 Average bpd: 3.210
====> [eval] Epoch: 702 Average bpd: 3.336
====> [test] Epoch: 702 Average bpd: 3.348
Best val_bpd: 3.3350562204235508
Best test_bpd: 3.346935221764988
====> Epoch: 703 Average train loss: -6832.4370 Average bpd: 3.209
====> Epoch: 704 Average train loss: -6832.9006 Average bpd: 3.209
====> [eval] Epoch: 704 Average bpd: 3.338
====> [test] Epoch: 704 Average bpd: 3.350
Best val_bpd: 3.3350562204235508
Best test_bpd: 3.346935221764988
====> Epoch: 705 Average train loss: -6831.9489 Average bpd: 3.208
====> Epoch: 706 Average train loss: -6831.6856 Average bpd: 3.208
====> [eval] Epoch: 706 Average bpd: 3.336
====> [test] Epoch: 706 Average bpd: 3.347
Best val_bpd: 3.3350562204235508
Best test_bpd: 3.346935221764988
====> Epoch: 707 Average train loss: -6831.2535 Average bpd: 3.208
====> Epoch: 708 Average train loss: -6832.2872 Average bpd: 3.209
====> [eval] Epoch: 708 Average bpd: 3.333
====> [test] Epoch: 708 Average bpd: 3.345
Best val_bpd: 3.333411827537454
Best test_bpd: 3.3447477649950166
====> Epoch: 709 Average train loss: -6830.2868 Average bpd: 3.208
====> Epoch: 710 Average train loss: -6830.5055 Average bpd: 3.208
====> [eval] Epoch: 710 Average bpd: 3.336
====> [test] Epoch: 710 Average bpd: 3.348
Best val_bpd: 3.333411827537454
Best test_bpd: 3.3447477649950166
====> Epoch: 711 Average train loss: -6828.2586 Average bpd: 3.207
====> Epoch: 712 Average train loss: -6830.5091 Average bpd: 3.208
====> [eval] Epoch: 712 Average bpd: 3.332
====> [test] Epoch: 712 Average bpd: 3.344
Best val_bpd: 3.3320306225320673
Best test_bpd: 3.343794034116159
====> Epoch: 713 Average train loss: -6829.4906 Average bpd: 3.207
====> Epoch: 714 Average train loss: -6828.0660 Average bpd: 3.207
====> [eval] Epoch: 714 Average bpd: 3.336
====> [test] Epoch: 714 Average bpd: 3.348
Best val_bpd: 3.3320306225320673
Best test_bpd: 3.343794034116159
====> Epoch: 715 Average train loss: -6827.9087 Average bpd: 3.207
====> Epoch: 716 Average train loss: -6829.3457 Average bpd: 3.207
====> [eval] Epoch: 716 Average bpd: 3.337
====> [test] Epoch: 716 Average bpd: 3.348
Best val_bpd: 3.3320306225320673
Best test_bpd: 3.343794034116159
====> Epoch: 717 Average train loss: -6827.7788 Average bpd: 3.207
====> Epoch: 718 Average train loss: -6827.8328 Average bpd: 3.207
====> [eval] Epoch: 718 Average bpd: 3.332
====> [test] Epoch: 718 Average bpd: 3.343
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 719 Average train loss: -6828.4636 Average bpd: 3.207
====> Epoch: 720 Average train loss: -6827.7272 Average bpd: 3.206
====> [eval] Epoch: 720 Average bpd: 3.333
====> [test] Epoch: 720 Average bpd: 3.345
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 721 Average train loss: -6828.2412 Average bpd: 3.207
====> Epoch: 722 Average train loss: -6827.6784 Average bpd: 3.206
====> [eval] Epoch: 722 Average bpd: 3.335
====> [test] Epoch: 722 Average bpd: 3.346
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 723 Average train loss: -6826.6453 Average bpd: 3.206
====> Epoch: 724 Average train loss: -6827.6186 Average bpd: 3.206
====> [eval] Epoch: 724 Average bpd: 3.336
====> [test] Epoch: 724 Average bpd: 3.348
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 725 Average train loss: -6827.1079 Average bpd: 3.206
====> Epoch: 726 Average train loss: -6825.6546 Average bpd: 3.206
====> [eval] Epoch: 726 Average bpd: 3.332
====> [test] Epoch: 726 Average bpd: 3.344
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 727 Average train loss: -6825.2976 Average bpd: 3.205
====> Epoch: 728 Average train loss: -6826.2245 Average bpd: 3.206
====> [eval] Epoch: 728 Average bpd: 3.335
====> [test] Epoch: 728 Average bpd: 3.346
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 729 Average train loss: -6826.0337 Average bpd: 3.206
====> Epoch: 730 Average train loss: -6825.5003 Average bpd: 3.205
====> [eval] Epoch: 730 Average bpd: 3.334
====> [test] Epoch: 730 Average bpd: 3.345
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 731 Average train loss: -6825.8913 Average bpd: 3.206
====> Epoch: 732 Average train loss: -6823.8558 Average bpd: 3.205
====> [eval] Epoch: 732 Average bpd: 3.334
====> [test] Epoch: 732 Average bpd: 3.346
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 733 Average train loss: -6823.9184 Average bpd: 3.205
====> Epoch: 734 Average train loss: -6824.6329 Average bpd: 3.205
====> [eval] Epoch: 734 Average bpd: 3.335
====> [test] Epoch: 734 Average bpd: 3.346
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 735 Average train loss: -6823.9086 Average bpd: 3.205
====> Epoch: 736 Average train loss: -6824.0438 Average bpd: 3.205
====> [eval] Epoch: 736 Average bpd: 3.335
====> [test] Epoch: 736 Average bpd: 3.346
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 737 Average train loss: -6822.3012 Average bpd: 3.204
====> Epoch: 738 Average train loss: -6822.2310 Average bpd: 3.204
====> [eval] Epoch: 738 Average bpd: 3.333
====> [test] Epoch: 738 Average bpd: 3.344
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 739 Average train loss: -6822.0579 Average bpd: 3.204
====> Epoch: 740 Average train loss: -6823.1924 Average bpd: 3.204
====> [eval] Epoch: 740 Average bpd: 3.333
====> [test] Epoch: 740 Average bpd: 3.345
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 741 Average train loss: -6822.6964 Average bpd: 3.204
====> Epoch: 742 Average train loss: -6821.7546 Average bpd: 3.204
====> [eval] Epoch: 742 Average bpd: 3.335
====> [test] Epoch: 742 Average bpd: 3.347
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 743 Average train loss: -6821.6794 Average bpd: 3.204
====> Epoch: 744 Average train loss: -6819.6262 Average bpd: 3.203
====> [eval] Epoch: 744 Average bpd: 3.333
====> [test] Epoch: 744 Average bpd: 3.345
Best val_bpd: 3.331783320188108
Best test_bpd: 3.343337643682998
====> Epoch: 745 Average train loss: -6820.9495 Average bpd: 3.203
====> Epoch: 746 Average train loss: -6821.3370 Average bpd: 3.203
====> [eval] Epoch: 746 Average bpd: 3.332
====> [test] Epoch: 746 Average bpd: 3.343
Best val_bpd: 3.331571962943845
Best test_bpd: 3.3429629428320013
====> Epoch: 747 Average train loss: -6820.0586 Average bpd: 3.203
====> Epoch: 748 Average train loss: -6821.3111 Average bpd: 3.203
====> [eval] Epoch: 748 Average bpd: 3.334
====> [test] Epoch: 748 Average bpd: 3.346
Best val_bpd: 3.331571962943845
Best test_bpd: 3.3429629428320013
====> Epoch: 749 Average train loss: -6820.3649 Average bpd: 3.203
====> Epoch: 750 Average train loss: -6819.4470 Average bpd: 3.203
====> [eval] Epoch: 750 Average bpd: 3.330
====> [test] Epoch: 750 Average bpd: 3.342
Best val_bpd: 3.3304945859726645
Best test_bpd: 3.34205961738315
====> Epoch: 751 Average train loss: -6818.4979 Average bpd: 3.202
====> Epoch: 752 Average train loss: -6818.8413 Average bpd: 3.202
====> [eval] Epoch: 752 Average bpd: 3.330
====> [test] Epoch: 752 Average bpd: 3.341
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 753 Average train loss: -6819.6015 Average bpd: 3.203
====> Epoch: 754 Average train loss: -6819.9362 Average bpd: 3.203
====> [eval] Epoch: 754 Average bpd: 3.330
====> [test] Epoch: 754 Average bpd: 3.342
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 755 Average train loss: -6819.0536 Average bpd: 3.202
====> Epoch: 756 Average train loss: -6817.4399 Average bpd: 3.202
====> [eval] Epoch: 756 Average bpd: 3.331
====> [test] Epoch: 756 Average bpd: 3.342
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 757 Average train loss: -6818.0722 Average bpd: 3.202
====> Epoch: 758 Average train loss: -6817.7551 Average bpd: 3.202
====> [eval] Epoch: 758 Average bpd: 3.335
====> [test] Epoch: 758 Average bpd: 3.347
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 759 Average train loss: -6816.6234 Average bpd: 3.201
====> Epoch: 760 Average train loss: -6816.6725 Average bpd: 3.201
====> [eval] Epoch: 760 Average bpd: 3.336
====> [test] Epoch: 760 Average bpd: 3.348
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 761 Average train loss: -6815.5698 Average bpd: 3.201
====> Epoch: 762 Average train loss: -6817.1047 Average bpd: 3.201
====> [eval] Epoch: 762 Average bpd: 3.333
====> [test] Epoch: 762 Average bpd: 3.344
Best val_bpd: 3.329583491552253
Best test_bpd: 3.3411800581950524
====> Epoch: 763 Average train loss: -6815.4074 Average bpd: 3.201
====> Epoch: 764 Average train loss: -6815.9818 Average bpd: 3.201
====> [eval] Epoch: 764 Average bpd: 3.329
====> [test] Epoch: 764 Average bpd: 3.341
Best val_bpd: 3.3294438322061035
Best test_bpd: 3.3411940468429875
====> Epoch: 765 Average train loss: -6815.8019 Average bpd: 3.201
====> Epoch: 766 Average train loss: -6815.5655 Average bpd: 3.201
====> [eval] Epoch: 766 Average bpd: 3.329
====> [test] Epoch: 766 Average bpd: 3.340
Best val_bpd: 3.3288132763211777
Best test_bpd: 3.3402356129280104
====> Epoch: 767 Average train loss: -6814.9028 Average bpd: 3.200
====> Epoch: 768 Average train loss: -6814.9727 Average bpd: 3.200
====> [eval] Epoch: 768 Average bpd: 3.329
====> [test] Epoch: 768 Average bpd: 3.341
Best val_bpd: 3.3288132763211777
Best test_bpd: 3.3402356129280104
====> Epoch: 769 Average train loss: -6816.2935 Average bpd: 3.201
====> Epoch: 770 Average train loss: -6813.9778 Average bpd: 3.200
====> [eval] Epoch: 770 Average bpd: 3.329
====> [test] Epoch: 770 Average bpd: 3.341
Best val_bpd: 3.3288132763211777
Best test_bpd: 3.3402356129280104
====> Epoch: 771 Average train loss: -6815.0020 Average bpd: 3.201
====> Epoch: 772 Average train loss: -6813.5352 Average bpd: 3.200
====> [eval] Epoch: 772 Average bpd: 3.327
====> [test] Epoch: 772 Average bpd: 3.338
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 773 Average train loss: -6812.8344 Average bpd: 3.199
====> Epoch: 774 Average train loss: -6814.8044 Average bpd: 3.200
====> [eval] Epoch: 774 Average bpd: 3.328
====> [test] Epoch: 774 Average bpd: 3.339
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 775 Average train loss: -6812.9770 Average bpd: 3.200
====> Epoch: 776 Average train loss: -6813.0091 Average bpd: 3.200
====> [eval] Epoch: 776 Average bpd: 3.329
====> [test] Epoch: 776 Average bpd: 3.340
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 777 Average train loss: -6812.7418 Average bpd: 3.199
====> Epoch: 778 Average train loss: -6812.2833 Average bpd: 3.199
====> [eval] Epoch: 778 Average bpd: 3.328
====> [test] Epoch: 778 Average bpd: 3.339
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 779 Average train loss: -6814.9193 Average bpd: 3.200
====> Epoch: 780 Average train loss: -6812.0248 Average bpd: 3.199
====> [eval] Epoch: 780 Average bpd: 3.328
====> [test] Epoch: 780 Average bpd: 3.340
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 781 Average train loss: -6812.9042 Average bpd: 3.200
====> Epoch: 782 Average train loss: -6810.1943 Average bpd: 3.198
====> [eval] Epoch: 782 Average bpd: 3.328
====> [test] Epoch: 782 Average bpd: 3.340
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 783 Average train loss: -6809.7568 Average bpd: 3.198
====> Epoch: 784 Average train loss: -6811.8785 Average bpd: 3.199
====> [eval] Epoch: 784 Average bpd: 3.328
====> [test] Epoch: 784 Average bpd: 3.340
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 785 Average train loss: -6811.3543 Average bpd: 3.199
====> Epoch: 786 Average train loss: -6811.6041 Average bpd: 3.199
====> [eval] Epoch: 786 Average bpd: 3.327
====> [test] Epoch: 786 Average bpd: 3.339
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 787 Average train loss: -6811.9298 Average bpd: 3.199
====> Epoch: 788 Average train loss: -6811.4028 Average bpd: 3.199
====> [eval] Epoch: 788 Average bpd: 3.328
====> [test] Epoch: 788 Average bpd: 3.339
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 789 Average train loss: -6810.1602 Average bpd: 3.198
====> Epoch: 790 Average train loss: -6810.4276 Average bpd: 3.198
====> [eval] Epoch: 790 Average bpd: 3.330
====> [test] Epoch: 790 Average bpd: 3.341
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 791 Average train loss: -6811.6488 Average bpd: 3.199
====> Epoch: 792 Average train loss: -6810.4090 Average bpd: 3.198
====> [eval] Epoch: 792 Average bpd: 3.330
====> [test] Epoch: 792 Average bpd: 3.342
Best val_bpd: 3.3265048003668847
Best test_bpd: 3.3379301985551684
====> Epoch: 793 Average train loss: -6809.4284 Average bpd: 3.198
====> Epoch: 794 Average train loss: -6807.2155 Average bpd: 3.197
====> [eval] Epoch: 794 Average bpd: 3.326
====> [test] Epoch: 794 Average bpd: 3.338
Best val_bpd: 3.325923792901372
Best test_bpd: 3.337517340776966
====> Epoch: 795 Average train loss: -6808.7413 Average bpd: 3.198
====> Epoch: 796 Average train loss: -6809.0242 Average bpd: 3.198
====> [eval] Epoch: 796 Average bpd: 3.326
====> [test] Epoch: 796 Average bpd: 3.338
Best val_bpd: 3.325923792901372
Best test_bpd: 3.337517340776966
====> Epoch: 797 Average train loss: -6807.6006 Average bpd: 3.197
====> Epoch: 798 Average train loss: -6807.2593 Average bpd: 3.197
====> [eval] Epoch: 798 Average bpd: 3.327
====> [test] Epoch: 798 Average bpd: 3.338
Best val_bpd: 3.325923792901372
Best test_bpd: 3.337517340776966
====> Epoch: 799 Average train loss: -6809.0008 Average bpd: 3.198
====> Epoch: 800 Average train loss: -6807.5693 Average bpd: 3.197
====> [eval] Epoch: 800 Average bpd: 3.325
====> [test] Epoch: 800 Average bpd: 3.337
Best val_bpd: 3.325098480089149
Best test_bpd: 3.3368048543890314
====> Epoch: 801 Average train loss: -6807.1341 Average bpd: 3.197
====> Epoch: 802 Average train loss: -6806.3050 Average bpd: 3.196
====> [eval] Epoch: 802 Average bpd: 3.326
====> [test] Epoch: 802 Average bpd: 3.337
Best val_bpd: 3.325098480089149
Best test_bpd: 3.3368048543890314
====> Epoch: 803 Average train loss: -6807.5042 Average bpd: 3.197
====> Epoch: 804 Average train loss: -6807.6685 Average bpd: 3.197
====> [eval] Epoch: 804 Average bpd: 3.325
====> [test] Epoch: 804 Average bpd: 3.337
Best val_bpd: 3.325098480089149
Best test_bpd: 3.3368048543890314
====> Epoch: 805 Average train loss: -6805.8128 Average bpd: 3.196
====> Epoch: 806 Average train loss: -6807.5591 Average bpd: 3.197
====> [eval] Epoch: 806 Average bpd: 3.324
====> [test] Epoch: 806 Average bpd: 3.336
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 807 Average train loss: -6805.4873 Average bpd: 3.196
====> Epoch: 808 Average train loss: -6805.4822 Average bpd: 3.196
====> [eval] Epoch: 808 Average bpd: 3.329
====> [test] Epoch: 808 Average bpd: 3.341
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 809 Average train loss: -6805.4987 Average bpd: 3.196
====> Epoch: 810 Average train loss: -6806.0886 Average bpd: 3.196
====> [eval] Epoch: 810 Average bpd: 3.325
====> [test] Epoch: 810 Average bpd: 3.337
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 811 Average train loss: -6804.3116 Average bpd: 3.195
====> Epoch: 812 Average train loss: -6805.5654 Average bpd: 3.196
====> [eval] Epoch: 812 Average bpd: 3.328
====> [test] Epoch: 812 Average bpd: 3.340
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 813 Average train loss: -6804.5659 Average bpd: 3.196
====> Epoch: 814 Average train loss: -6804.0502 Average bpd: 3.195
====> [eval] Epoch: 814 Average bpd: 3.327
====> [test] Epoch: 814 Average bpd: 3.338
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 815 Average train loss: -6804.2042 Average bpd: 3.195
====> Epoch: 816 Average train loss: -6802.6334 Average bpd: 3.195
====> [eval] Epoch: 816 Average bpd: 3.326
====> [test] Epoch: 816 Average bpd: 3.337
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 817 Average train loss: -6803.6666 Average bpd: 3.195
====> Epoch: 818 Average train loss: -6801.8211 Average bpd: 3.194
====> [eval] Epoch: 818 Average bpd: 3.326
====> [test] Epoch: 818 Average bpd: 3.338
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 819 Average train loss: -6801.8042 Average bpd: 3.194
====> Epoch: 820 Average train loss: -6802.8542 Average bpd: 3.195
====> [eval] Epoch: 820 Average bpd: 3.325
====> [test] Epoch: 820 Average bpd: 3.337
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 821 Average train loss: -6802.7163 Average bpd: 3.195
====> Epoch: 822 Average train loss: -6802.8519 Average bpd: 3.195
====> [eval] Epoch: 822 Average bpd: 3.326
====> [test] Epoch: 822 Average bpd: 3.338
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 823 Average train loss: -6802.2464 Average bpd: 3.195
====> Epoch: 824 Average train loss: -6801.9139 Average bpd: 3.194
====> [eval] Epoch: 824 Average bpd: 3.325
====> [test] Epoch: 824 Average bpd: 3.337
Best val_bpd: 3.323983613802742
Best test_bpd: 3.3355452062858744
====> Epoch: 825 Average train loss: -6802.1127 Average bpd: 3.194
====> Epoch: 826 Average train loss: -6803.3051 Average bpd: 3.195
====> [eval] Epoch: 826 Average bpd: 3.323
====> [test] Epoch: 826 Average bpd: 3.335
Best val_bpd: 3.323070142103155
Best test_bpd: 3.334905529515131
====> Epoch: 827 Average train loss: -6801.8727 Average bpd: 3.194
====> Epoch: 828 Average train loss: -6801.9339 Average bpd: 3.194
====> [eval] Epoch: 828 Average bpd: 3.326
====> [test] Epoch: 828 Average bpd: 3.338
Best val_bpd: 3.323070142103155
Best test_bpd: 3.334905529515131
====> Epoch: 829 Average train loss: -6799.8425 Average bpd: 3.193
====> Epoch: 830 Average train loss: -6802.7065 Average bpd: 3.195
====> [eval] Epoch: 830 Average bpd: 3.325
====> [test] Epoch: 830 Average bpd: 3.337
Best val_bpd: 3.323070142103155
Best test_bpd: 3.334905529515131
====> Epoch: 831 Average train loss: -6801.1062 Average bpd: 3.194
====> Epoch: 832 Average train loss: -6800.7125 Average bpd: 3.194
====> [eval] Epoch: 832 Average bpd: 3.323
====> [test] Epoch: 832 Average bpd: 3.334
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 833 Average train loss: -6800.3132 Average bpd: 3.194
====> Epoch: 834 Average train loss: -6800.7427 Average bpd: 3.194
====> [eval] Epoch: 834 Average bpd: 3.325
====> [test] Epoch: 834 Average bpd: 3.337
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 835 Average train loss: -6800.1042 Average bpd: 3.194
====> Epoch: 836 Average train loss: -6799.2644 Average bpd: 3.193
====> [eval] Epoch: 836 Average bpd: 3.324
====> [test] Epoch: 836 Average bpd: 3.336
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 837 Average train loss: -6799.6115 Average bpd: 3.193
====> Epoch: 838 Average train loss: -6797.8765 Average bpd: 3.192
====> [eval] Epoch: 838 Average bpd: 3.326
====> [test] Epoch: 838 Average bpd: 3.338
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 839 Average train loss: -6799.2302 Average bpd: 3.193
====> Epoch: 840 Average train loss: -6798.7399 Average bpd: 3.193
====> [eval] Epoch: 840 Average bpd: 3.324
====> [test] Epoch: 840 Average bpd: 3.336
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 841 Average train loss: -6797.7562 Average bpd: 3.192
====> Epoch: 842 Average train loss: -6798.4353 Average bpd: 3.193
====> [eval] Epoch: 842 Average bpd: 3.324
====> [test] Epoch: 842 Average bpd: 3.335
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 843 Average train loss: -6798.4522 Average bpd: 3.193
====> Epoch: 844 Average train loss: -6799.0338 Average bpd: 3.193
====> [eval] Epoch: 844 Average bpd: 3.323
====> [test] Epoch: 844 Average bpd: 3.335
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 845 Average train loss: -6797.7449 Average bpd: 3.192
====> Epoch: 846 Average train loss: -6796.4030 Average bpd: 3.192
====> [eval] Epoch: 846 Average bpd: 3.324
====> [test] Epoch: 846 Average bpd: 3.336
Best val_bpd: 3.3227245026918277
Best test_bpd: 3.334411589536532
====> Epoch: 847 Average train loss: -6795.5027 Average bpd: 3.191
====> Epoch: 848 Average train loss: -6795.8749 Average bpd: 3.192
====> [eval] Epoch: 848 Average bpd: 3.322
====> [test] Epoch: 848 Average bpd: 3.334
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 849 Average train loss: -6795.6632 Average bpd: 3.191
====> Epoch: 850 Average train loss: -6794.7124 Average bpd: 3.191
====> [eval] Epoch: 850 Average bpd: 3.327
====> [test] Epoch: 850 Average bpd: 3.339
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 851 Average train loss: -6796.1985 Average bpd: 3.192
====> Epoch: 852 Average train loss: -6798.1680 Average bpd: 3.193
====> [eval] Epoch: 852 Average bpd: 3.324
====> [test] Epoch: 852 Average bpd: 3.336
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 853 Average train loss: -6796.4370 Average bpd: 3.192
====> Epoch: 854 Average train loss: -6795.6700 Average bpd: 3.191
====> [eval] Epoch: 854 Average bpd: 3.322
====> [test] Epoch: 854 Average bpd: 3.334
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 855 Average train loss: -6796.1453 Average bpd: 3.192
====> Epoch: 856 Average train loss: -6796.1053 Average bpd: 3.192
====> [eval] Epoch: 856 Average bpd: 3.323
====> [test] Epoch: 856 Average bpd: 3.334
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 857 Average train loss: -6795.0032 Average bpd: 3.191
====> Epoch: 858 Average train loss: -6795.3985 Average bpd: 3.191
====> [eval] Epoch: 858 Average bpd: 3.326
====> [test] Epoch: 858 Average bpd: 3.337
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 859 Average train loss: -6794.2243 Average bpd: 3.191
====> Epoch: 860 Average train loss: -6794.6886 Average bpd: 3.191
====> [eval] Epoch: 860 Average bpd: 3.324
====> [test] Epoch: 860 Average bpd: 3.336
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 861 Average train loss: -6794.4201 Average bpd: 3.191
====> Epoch: 862 Average train loss: -6794.4734 Average bpd: 3.191
====> [eval] Epoch: 862 Average bpd: 3.324
====> [test] Epoch: 862 Average bpd: 3.336
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 863 Average train loss: -6793.1680 Average bpd: 3.190
====> Epoch: 864 Average train loss: -6792.8384 Average bpd: 3.190
====> [eval] Epoch: 864 Average bpd: 3.322
====> [test] Epoch: 864 Average bpd: 3.334
Best val_bpd: 3.321711933427751
Best test_bpd: 3.3336791183291674
====> Epoch: 865 Average train loss: -6793.0263 Average bpd: 3.190
====> Epoch: 866 Average train loss: -6793.8737 Average bpd: 3.191
====> [eval] Epoch: 866 Average bpd: 3.321
====> [test] Epoch: 866 Average bpd: 3.333
Best val_bpd: 3.3211921539629405
Best test_bpd: 3.3328389388141253
====> Epoch: 867 Average train loss: -6793.0822 Average bpd: 3.190
====> Epoch: 868 Average train loss: -6792.5600 Average bpd: 3.190
====> [eval] Epoch: 868 Average bpd: 3.321
====> [test] Epoch: 868 Average bpd: 3.333
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 869 Average train loss: -6791.5116 Average bpd: 3.189
====> Epoch: 870 Average train loss: -6792.2872 Average bpd: 3.190
====> [eval] Epoch: 870 Average bpd: 3.325
====> [test] Epoch: 870 Average bpd: 3.337
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 871 Average train loss: -6793.5706 Average bpd: 3.190
====> Epoch: 872 Average train loss: -6790.9628 Average bpd: 3.189
====> [eval] Epoch: 872 Average bpd: 3.326
====> [test] Epoch: 872 Average bpd: 3.338
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 873 Average train loss: -6791.9850 Average bpd: 3.190
====> Epoch: 874 Average train loss: -6791.4450 Average bpd: 3.189
====> [eval] Epoch: 874 Average bpd: 3.323
====> [test] Epoch: 874 Average bpd: 3.335
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 875 Average train loss: -6791.6541 Average bpd: 3.190
====> Epoch: 876 Average train loss: -6792.4040 Average bpd: 3.190
====> [eval] Epoch: 876 Average bpd: 3.323
====> [test] Epoch: 876 Average bpd: 3.335
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 877 Average train loss: -6790.2412 Average bpd: 3.189
====> Epoch: 878 Average train loss: -6789.5437 Average bpd: 3.189
====> [eval] Epoch: 878 Average bpd: 3.323
====> [test] Epoch: 878 Average bpd: 3.335
Best val_bpd: 3.320813864697561
Best test_bpd: 3.33274465966652
====> Epoch: 879 Average train loss: -6791.4717 Average bpd: 3.189
====> Epoch: 880 Average train loss: -6789.7752 Average bpd: 3.189
====> [eval] Epoch: 880 Average bpd: 3.319
====> [test] Epoch: 880 Average bpd: 3.331
Best val_bpd: 3.3194231817790816
Best test_bpd: 3.3313090609752223
====> Epoch: 881 Average train loss: -6790.3363 Average bpd: 3.189
====> Epoch: 882 Average train loss: -6790.5675 Average bpd: 3.189
====> [eval] Epoch: 882 Average bpd: 3.324
====> [test] Epoch: 882 Average bpd: 3.336
Best val_bpd: 3.3194231817790816
Best test_bpd: 3.3313090609752223
====> Epoch: 883 Average train loss: -6789.7179 Average bpd: 3.189
====> Epoch: 884 Average train loss: -6788.4430 Average bpd: 3.188
====> [eval] Epoch: 884 Average bpd: 3.321
====> [test] Epoch: 884 Average bpd: 3.332
Best val_bpd: 3.3194231817790816
Best test_bpd: 3.3313090609752223
====> Epoch: 885 Average train loss: -6789.6917 Average bpd: 3.189
====> Epoch: 886 Average train loss: -6789.5610 Average bpd: 3.189
====> [eval] Epoch: 886 Average bpd: 3.320
====> [test] Epoch: 886 Average bpd: 3.332
Best val_bpd: 3.3194231817790816
Best test_bpd: 3.3313090609752223
====> Epoch: 887 Average train loss: -6788.8100 Average bpd: 3.188
====> Epoch: 888 Average train loss: -6787.5949 Average bpd: 3.188
====> [eval] Epoch: 888 Average bpd: 3.319
====> [test] Epoch: 888 Average bpd: 3.331
Best val_bpd: 3.3194231817790816
Best test_bpd: 3.3313090609752223
====> Epoch: 889 Average train loss: -6789.7300 Average bpd: 3.189
====> Epoch: 890 Average train loss: -6787.7174 Average bpd: 3.188
====> [eval] Epoch: 890 Average bpd: 3.318
====> [test] Epoch: 890 Average bpd: 3.330
Best val_bpd: 3.318470532867027
Best test_bpd: 3.3303537706418638
====> Epoch: 891 Average train loss: -6786.9262 Average bpd: 3.187
====> Epoch: 892 Average train loss: -6787.2695 Average bpd: 3.187
====> [eval] Epoch: 892 Average bpd: 3.321
====> [test] Epoch: 892 Average bpd: 3.332
Best val_bpd: 3.318470532867027
Best test_bpd: 3.3303537706418638
====> Epoch: 893 Average train loss: -6788.4293 Average bpd: 3.188
====> Epoch: 894 Average train loss: -6786.4955 Average bpd: 3.187
====> [eval] Epoch: 894 Average bpd: 3.318
====> [test] Epoch: 894 Average bpd: 3.329
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 895 Average train loss: -6787.3864 Average bpd: 3.188
====> Epoch: 896 Average train loss: -6787.3524 Average bpd: 3.188
====> [eval] Epoch: 896 Average bpd: 3.320
====> [test] Epoch: 896 Average bpd: 3.331
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 897 Average train loss: -6787.8098 Average bpd: 3.188
====> Epoch: 898 Average train loss: -6785.8430 Average bpd: 3.187
====> [eval] Epoch: 898 Average bpd: 3.321
====> [test] Epoch: 898 Average bpd: 3.333
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 899 Average train loss: -6786.2953 Average bpd: 3.187
====> Epoch: 900 Average train loss: -6786.1539 Average bpd: 3.187
====> [eval] Epoch: 900 Average bpd: 3.320
====> [test] Epoch: 900 Average bpd: 3.332
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 901 Average train loss: -6786.2019 Average bpd: 3.187
====> Epoch: 902 Average train loss: -6784.6728 Average bpd: 3.186
====> [eval] Epoch: 902 Average bpd: 3.321
====> [test] Epoch: 902 Average bpd: 3.333
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 903 Average train loss: -6786.1821 Average bpd: 3.187
====> Epoch: 904 Average train loss: -6785.8336 Average bpd: 3.187
====> [eval] Epoch: 904 Average bpd: 3.318
====> [test] Epoch: 904 Average bpd: 3.330
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 905 Average train loss: -6784.5902 Average bpd: 3.186
====> Epoch: 906 Average train loss: -6784.8931 Average bpd: 3.186
====> [eval] Epoch: 906 Average bpd: 3.319
====> [test] Epoch: 906 Average bpd: 3.330
Best val_bpd: 3.317702266772662
Best test_bpd: 3.329344365278029
====> Epoch: 907 Average train loss: -6784.7603 Average bpd: 3.186
====> Epoch: 908 Average train loss: -6785.2638 Average bpd: 3.187
====> [eval] Epoch: 908 Average bpd: 3.317
====> [test] Epoch: 908 Average bpd: 3.329
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 909 Average train loss: -6785.8573 Average bpd: 3.187
====> Epoch: 910 Average train loss: -6784.0307 Average bpd: 3.186
====> [eval] Epoch: 910 Average bpd: 3.321
====> [test] Epoch: 910 Average bpd: 3.333
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 911 Average train loss: -6784.3314 Average bpd: 3.186
====> Epoch: 912 Average train loss: -6783.4336 Average bpd: 3.186
====> [eval] Epoch: 912 Average bpd: 3.318
====> [test] Epoch: 912 Average bpd: 3.330
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 913 Average train loss: -6783.7423 Average bpd: 3.186
====> Epoch: 914 Average train loss: -6783.0940 Average bpd: 3.186
====> [eval] Epoch: 914 Average bpd: 3.318
====> [test] Epoch: 914 Average bpd: 3.330
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 915 Average train loss: -6781.9660 Average bpd: 3.185
====> Epoch: 916 Average train loss: -6781.8977 Average bpd: 3.185
====> [eval] Epoch: 916 Average bpd: 3.318
====> [test] Epoch: 916 Average bpd: 3.329
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 917 Average train loss: -6782.2339 Average bpd: 3.185
====> Epoch: 918 Average train loss: -6781.9495 Average bpd: 3.185
====> [eval] Epoch: 918 Average bpd: 3.318
====> [test] Epoch: 918 Average bpd: 3.329
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 919 Average train loss: -6782.5594 Average bpd: 3.185
====> Epoch: 920 Average train loss: -6781.1860 Average bpd: 3.185
====> [eval] Epoch: 920 Average bpd: 3.320
====> [test] Epoch: 920 Average bpd: 3.332
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 921 Average train loss: -6782.9602 Average bpd: 3.185
====> Epoch: 922 Average train loss: -6780.4483 Average bpd: 3.184
====> [eval] Epoch: 922 Average bpd: 3.321
====> [test] Epoch: 922 Average bpd: 3.333
Best val_bpd: 3.316919862543333
Best test_bpd: 3.3287857569200443
====> Epoch: 923 Average train loss: -6782.4232 Average bpd: 3.185
====> Epoch: 924 Average train loss: -6781.1480 Average bpd: 3.185
====> [eval] Epoch: 924 Average bpd: 3.317
====> [test] Epoch: 924 Average bpd: 3.328
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 925 Average train loss: -6782.2381 Average bpd: 3.185
====> Epoch: 926 Average train loss: -6781.0124 Average bpd: 3.185
====> [eval] Epoch: 926 Average bpd: 3.318
====> [test] Epoch: 926 Average bpd: 3.330
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 927 Average train loss: -6781.4370 Average bpd: 3.185
====> Epoch: 928 Average train loss: -6781.9121 Average bpd: 3.185
====> [eval] Epoch: 928 Average bpd: 3.318
====> [test] Epoch: 928 Average bpd: 3.330
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 929 Average train loss: -6780.0581 Average bpd: 3.184
====> Epoch: 930 Average train loss: -6781.8626 Average bpd: 3.185
====> [eval] Epoch: 930 Average bpd: 3.320
====> [test] Epoch: 930 Average bpd: 3.332
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 931 Average train loss: -6780.0585 Average bpd: 3.184
====> Epoch: 932 Average train loss: -6780.5974 Average bpd: 3.184
====> [eval] Epoch: 932 Average bpd: 3.317
====> [test] Epoch: 932 Average bpd: 3.328
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 933 Average train loss: -6779.3081 Average bpd: 3.184
====> Epoch: 934 Average train loss: -6778.9238 Average bpd: 3.184
====> [eval] Epoch: 934 Average bpd: 3.322
====> [test] Epoch: 934 Average bpd: 3.334
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 935 Average train loss: -6778.4094 Average bpd: 3.183
====> Epoch: 936 Average train loss: -6778.0340 Average bpd: 3.183
====> [eval] Epoch: 936 Average bpd: 3.317
====> [test] Epoch: 936 Average bpd: 3.329
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 937 Average train loss: -6777.2441 Average bpd: 3.183
====> Epoch: 938 Average train loss: -6776.8011 Average bpd: 3.183
====> [eval] Epoch: 938 Average bpd: 3.318
====> [test] Epoch: 938 Average bpd: 3.330
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 939 Average train loss: -6778.4445 Average bpd: 3.183
====> Epoch: 940 Average train loss: -6778.2519 Average bpd: 3.183
====> [eval] Epoch: 940 Average bpd: 3.317
====> [test] Epoch: 940 Average bpd: 3.329
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 941 Average train loss: -6778.3058 Average bpd: 3.183
====> Epoch: 942 Average train loss: -6779.2651 Average bpd: 3.184
====> [eval] Epoch: 942 Average bpd: 3.318
====> [test] Epoch: 942 Average bpd: 3.330
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 943 Average train loss: -6778.0225 Average bpd: 3.183
====> Epoch: 944 Average train loss: -6777.5681 Average bpd: 3.183
====> [eval] Epoch: 944 Average bpd: 3.318
====> [test] Epoch: 944 Average bpd: 3.330
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 945 Average train loss: -6777.2271 Average bpd: 3.183
====> Epoch: 946 Average train loss: -6777.9645 Average bpd: 3.183
====> [eval] Epoch: 946 Average bpd: 3.319
====> [test] Epoch: 946 Average bpd: 3.331
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 947 Average train loss: -6779.1356 Average bpd: 3.184
====> Epoch: 948 Average train loss: -6776.7289 Average bpd: 3.183
====> [eval] Epoch: 948 Average bpd: 3.317
====> [test] Epoch: 948 Average bpd: 3.329
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 949 Average train loss: -6778.1518 Average bpd: 3.183
====> Epoch: 950 Average train loss: -6776.5827 Average bpd: 3.182
====> [eval] Epoch: 950 Average bpd: 3.318
====> [test] Epoch: 950 Average bpd: 3.329
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 951 Average train loss: -6778.2381 Average bpd: 3.183
====> Epoch: 952 Average train loss: -6776.9022 Average bpd: 3.183
====> [eval] Epoch: 952 Average bpd: 3.317
====> [test] Epoch: 952 Average bpd: 3.329
Best val_bpd: 3.316530554777656
Best test_bpd: 3.328356433072937
====> Epoch: 953 Average train loss: -6776.0563 Average bpd: 3.182
====> Epoch: 954 Average train loss: -6775.1242 Average bpd: 3.182
====> [eval] Epoch: 954 Average bpd: 3.316
====> [test] Epoch: 954 Average bpd: 3.327
Best val_bpd: 3.3155446246828326
Best test_bpd: 3.3273168226210874
====> Epoch: 955 Average train loss: -6774.3791 Average bpd: 3.181
====> Epoch: 956 Average train loss: -6775.2893 Average bpd: 3.182
====> [eval] Epoch: 956 Average bpd: 3.316
====> [test] Epoch: 956 Average bpd: 3.327
Best val_bpd: 3.3155302492553487
Best test_bpd: 3.3271298136210623
====> Epoch: 957 Average train loss: -6774.4335 Average bpd: 3.181
====> Epoch: 958 Average train loss: -6775.1486 Average bpd: 3.182
====> [eval] Epoch: 958 Average bpd: 3.318
====> [test] Epoch: 958 Average bpd: 3.329
Best val_bpd: 3.3155302492553487
Best test_bpd: 3.3271298136210623
====> Epoch: 959 Average train loss: -6774.1006 Average bpd: 3.181
====> Epoch: 960 Average train loss: -6775.9488 Average bpd: 3.182
====> [eval] Epoch: 960 Average bpd: 3.314
====> [test] Epoch: 960 Average bpd: 3.326
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 961 Average train loss: -6775.0317 Average bpd: 3.182
====> Epoch: 962 Average train loss: -6774.8105 Average bpd: 3.182
====> [eval] Epoch: 962 Average bpd: 3.316
====> [test] Epoch: 962 Average bpd: 3.328
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 963 Average train loss: -6774.4624 Average bpd: 3.181
====> Epoch: 964 Average train loss: -6773.3307 Average bpd: 3.181
====> [eval] Epoch: 964 Average bpd: 3.315
====> [test] Epoch: 964 Average bpd: 3.327
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 965 Average train loss: -6773.4648 Average bpd: 3.181
====> Epoch: 966 Average train loss: -6772.6446 Average bpd: 3.181
====> [eval] Epoch: 966 Average bpd: 3.318
====> [test] Epoch: 966 Average bpd: 3.329
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 967 Average train loss: -6773.8934 Average bpd: 3.181
====> Epoch: 968 Average train loss: -6774.4350 Average bpd: 3.181
====> [eval] Epoch: 968 Average bpd: 3.316
====> [test] Epoch: 968 Average bpd: 3.328
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 969 Average train loss: -6772.6391 Average bpd: 3.181
====> Epoch: 970 Average train loss: -6773.8027 Average bpd: 3.181
====> [eval] Epoch: 970 Average bpd: 3.317
====> [test] Epoch: 970 Average bpd: 3.328
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 971 Average train loss: -6772.1852 Average bpd: 3.180
====> Epoch: 972 Average train loss: -6773.1164 Average bpd: 3.181
====> [eval] Epoch: 972 Average bpd: 3.314
====> [test] Epoch: 972 Average bpd: 3.326
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 973 Average train loss: -6773.0912 Average bpd: 3.181
====> Epoch: 974 Average train loss: -6771.4375 Average bpd: 3.180
====> [eval] Epoch: 974 Average bpd: 3.319
====> [test] Epoch: 974 Average bpd: 3.330
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 975 Average train loss: -6772.7574 Average bpd: 3.181
====> Epoch: 976 Average train loss: -6772.0066 Average bpd: 3.180
====> [eval] Epoch: 976 Average bpd: 3.316
====> [test] Epoch: 976 Average bpd: 3.327
Best val_bpd: 3.3141552283687403
Best test_bpd: 3.3257700476681515
====> Epoch: 977 Average train loss: -6770.5606 Average bpd: 3.180
====> Epoch: 978 Average train loss: -6773.8597 Average bpd: 3.181
====> [eval] Epoch: 978 Average bpd: 3.314
====> [test] Epoch: 978 Average bpd: 3.326
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 979 Average train loss: -6771.1934 Average bpd: 3.180
====> Epoch: 980 Average train loss: -6772.2833 Average bpd: 3.180
====> [eval] Epoch: 980 Average bpd: 3.314
====> [test] Epoch: 980 Average bpd: 3.326
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 981 Average train loss: -6771.1695 Average bpd: 3.180
====> Epoch: 982 Average train loss: -6770.9425 Average bpd: 3.180
====> [eval] Epoch: 982 Average bpd: 3.319
====> [test] Epoch: 982 Average bpd: 3.330
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 983 Average train loss: -6772.1387 Average bpd: 3.180
====> Epoch: 984 Average train loss: -6770.6518 Average bpd: 3.180
====> [eval] Epoch: 984 Average bpd: 3.316
====> [test] Epoch: 984 Average bpd: 3.327
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 985 Average train loss: -6770.8778 Average bpd: 3.180
====> Epoch: 986 Average train loss: -6770.0772 Average bpd: 3.179
====> [eval] Epoch: 986 Average bpd: 3.316
====> [test] Epoch: 986 Average bpd: 3.328
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 987 Average train loss: -6771.8567 Average bpd: 3.180
====> Epoch: 988 Average train loss: -6771.5249 Average bpd: 3.180
====> [eval] Epoch: 988 Average bpd: 3.315
====> [test] Epoch: 988 Average bpd: 3.326
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 989 Average train loss: -6770.8823 Average bpd: 3.180
====> Epoch: 990 Average train loss: -6770.0459 Average bpd: 3.179
====> [eval] Epoch: 990 Average bpd: 3.315
====> [test] Epoch: 990 Average bpd: 3.327
Best val_bpd: 3.3139415874561133
Best test_bpd: 3.3255842178740664
====> Epoch: 991 Average train loss: -6770.9543 Average bpd: 3.180
====> Epoch: 992 Average train loss: -6769.9601 Average bpd: 3.179
====> [eval] Epoch: 992 Average bpd: 3.313
====> [test] Epoch: 992 Average bpd: 3.325
Best val_bpd: 3.3132557554764013
Best test_bpd: 3.3251119629484447
====> Epoch: 993 Average train loss: -6769.2260 Average bpd: 3.179
====> Epoch: 994 Average train loss: -6768.4456 Average bpd: 3.179
====> [eval] Epoch: 994 Average bpd: 3.316
====> [test] Epoch: 994 Average bpd: 3.328
Best val_bpd: 3.3132557554764013
Best test_bpd: 3.3251119629484447
====> Epoch: 995 Average train loss: -6768.7102 Average bpd: 3.179
====> Epoch: 996 Average train loss: -6768.3807 Average bpd: 3.179
====> [eval] Epoch: 996 Average bpd: 3.313
====> [test] Epoch: 996 Average bpd: 3.325
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 997 Average train loss: -6769.1120 Average bpd: 3.179
====> Epoch: 998 Average train loss: -6767.6331 Average bpd: 3.178
====> [eval] Epoch: 998 Average bpd: 3.314
====> [test] Epoch: 998 Average bpd: 3.326
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 999 Average train loss: -6767.4129 Average bpd: 3.178
====> Epoch: 1000 Average train loss: -6767.3723 Average bpd: 3.178
====> [eval] Epoch: 1000 Average bpd: 3.314
====> [test] Epoch: 1000 Average bpd: 3.326
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 1001 Average train loss: -6768.1741 Average bpd: 3.179
====> Epoch: 1002 Average train loss: -6766.2236 Average bpd: 3.178
====> [eval] Epoch: 1002 Average bpd: 3.315
====> [test] Epoch: 1002 Average bpd: 3.327
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 1003 Average train loss: -6766.7154 Average bpd: 3.178
====> Epoch: 1004 Average train loss: -6767.1945 Average bpd: 3.178
====> [eval] Epoch: 1004 Average bpd: 3.315
====> [test] Epoch: 1004 Average bpd: 3.327
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 1005 Average train loss: -6766.5483 Average bpd: 3.178
====> Epoch: 1006 Average train loss: -6766.5329 Average bpd: 3.178
====> [eval] Epoch: 1006 Average bpd: 3.313
====> [test] Epoch: 1006 Average bpd: 3.325
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 1007 Average train loss: -6765.8953 Average bpd: 3.177
====> Epoch: 1008 Average train loss: -6766.9332 Average bpd: 3.178
====> [eval] Epoch: 1008 Average bpd: 3.314
====> [test] Epoch: 1008 Average bpd: 3.326
Best val_bpd: 3.3131551877142282
Best test_bpd: 3.3248510855857107
====> Epoch: 1009 Average train loss: -6766.4815 Average bpd: 3.178
====> Epoch: 1010 Average train loss: -6766.9502 Average bpd: 3.178
====> [eval] Epoch: 1010 Average bpd: 3.313
====> [test] Epoch: 1010 Average bpd: 3.325
Best val_bpd: 3.3127340638838065
Best test_bpd: 3.324518079279739
====> Epoch: 1011 Average train loss: -6767.2927 Average bpd: 3.178
====> Epoch: 1012 Average train loss: -6765.4969 Average bpd: 3.177
====> [eval] Epoch: 1012 Average bpd: 3.315
====> [test] Epoch: 1012 Average bpd: 3.327
Best val_bpd: 3.3127340638838065
Best test_bpd: 3.324518079279739
====> Epoch: 1013 Average train loss: -6765.8932 Average bpd: 3.177
====> Epoch: 1014 Average train loss: -6765.0797 Average bpd: 3.177
====> [eval] Epoch: 1014 Average bpd: 3.314
====> [test] Epoch: 1014 Average bpd: 3.326
Best val_bpd: 3.3127340638838065
Best test_bpd: 3.324518079279739
====> Epoch: 1015 Average train loss: -6764.4078 Average bpd: 3.177
====> Epoch: 1016 Average train loss: -6764.0869 Average bpd: 3.177
====> [eval] Epoch: 1016 Average bpd: 3.312
====> [test] Epoch: 1016 Average bpd: 3.323
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1017 Average train loss: -6765.5303 Average bpd: 3.177
====> Epoch: 1018 Average train loss: -6765.2738 Average bpd: 3.177
====> [eval] Epoch: 1018 Average bpd: 3.315
====> [test] Epoch: 1018 Average bpd: 3.327
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1019 Average train loss: -6765.9749 Average bpd: 3.177
====> Epoch: 1020 Average train loss: -6763.7161 Average bpd: 3.176
====> [eval] Epoch: 1020 Average bpd: 3.314
====> [test] Epoch: 1020 Average bpd: 3.326
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1021 Average train loss: -6763.6782 Average bpd: 3.176
====> Epoch: 1022 Average train loss: -6763.9405 Average bpd: 3.177
====> [eval] Epoch: 1022 Average bpd: 3.314
====> [test] Epoch: 1022 Average bpd: 3.326
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1023 Average train loss: -6764.0209 Average bpd: 3.177
====> Epoch: 1024 Average train loss: -6764.0139 Average bpd: 3.177
====> [eval] Epoch: 1024 Average bpd: 3.313
====> [test] Epoch: 1024 Average bpd: 3.324
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1025 Average train loss: -6763.8911 Average bpd: 3.177
====> Epoch: 1026 Average train loss: -6765.0107 Average bpd: 3.177
====> [eval] Epoch: 1026 Average bpd: 3.313
====> [test] Epoch: 1026 Average bpd: 3.324
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1027 Average train loss: -6763.7961 Average bpd: 3.176
====> Epoch: 1028 Average train loss: -6763.5442 Average bpd: 3.176
====> [eval] Epoch: 1028 Average bpd: 3.313
====> [test] Epoch: 1028 Average bpd: 3.325
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1029 Average train loss: -6763.6526 Average bpd: 3.176
====> Epoch: 1030 Average train loss: -6763.5962 Average bpd: 3.176
====> [eval] Epoch: 1030 Average bpd: 3.315
====> [test] Epoch: 1030 Average bpd: 3.326
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1031 Average train loss: -6762.8024 Average bpd: 3.176
====> Epoch: 1032 Average train loss: -6763.7396 Average bpd: 3.176
====> [eval] Epoch: 1032 Average bpd: 3.313
====> [test] Epoch: 1032 Average bpd: 3.324
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1033 Average train loss: -6764.8109 Average bpd: 3.177
====> Epoch: 1034 Average train loss: -6762.4755 Average bpd: 3.176
====> [eval] Epoch: 1034 Average bpd: 3.312
====> [test] Epoch: 1034 Average bpd: 3.324
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1035 Average train loss: -6761.2537 Average bpd: 3.175
====> Epoch: 1036 Average train loss: -6761.9535 Average bpd: 3.176
====> [eval] Epoch: 1036 Average bpd: 3.313
====> [test] Epoch: 1036 Average bpd: 3.324
Best val_bpd: 3.311910470898093
Best test_bpd: 3.323465850235839
====> Epoch: 1037 Average train loss: -6760.6810 Average bpd: 3.175
====> Epoch: 1038 Average train loss: -6761.9637 Average bpd: 3.176
====> [eval] Epoch: 1038 Average bpd: 3.310
====> [test] Epoch: 1038 Average bpd: 3.322
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1039 Average train loss: -6761.2663 Average bpd: 3.175
====> Epoch: 1040 Average train loss: -6760.4754 Average bpd: 3.175
====> [eval] Epoch: 1040 Average bpd: 3.314
====> [test] Epoch: 1040 Average bpd: 3.326
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1041 Average train loss: -6761.6061 Average bpd: 3.175
====> Epoch: 1042 Average train loss: -6760.8003 Average bpd: 3.175
====> [eval] Epoch: 1042 Average bpd: 3.315
====> [test] Epoch: 1042 Average bpd: 3.327
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1043 Average train loss: -6760.2734 Average bpd: 3.175
====> Epoch: 1044 Average train loss: -6760.4707 Average bpd: 3.175
====> [eval] Epoch: 1044 Average bpd: 3.314
====> [test] Epoch: 1044 Average bpd: 3.326
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1045 Average train loss: -6761.5861 Average bpd: 3.175
====> Epoch: 1046 Average train loss: -6762.8061 Average bpd: 3.176
====> [eval] Epoch: 1046 Average bpd: 3.314
====> [test] Epoch: 1046 Average bpd: 3.326
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1047 Average train loss: -6759.5551 Average bpd: 3.174
====> Epoch: 1048 Average train loss: -6759.8390 Average bpd: 3.175
====> [eval] Epoch: 1048 Average bpd: 3.311
====> [test] Epoch: 1048 Average bpd: 3.323
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1049 Average train loss: -6759.9634 Average bpd: 3.175
====> Epoch: 1050 Average train loss: -6759.9318 Average bpd: 3.175
====> [eval] Epoch: 1050 Average bpd: 3.313
====> [test] Epoch: 1050 Average bpd: 3.325
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1051 Average train loss: -6759.9652 Average bpd: 3.175
====> Epoch: 1052 Average train loss: -6759.5386 Average bpd: 3.174
====> [eval] Epoch: 1052 Average bpd: 3.312
====> [test] Epoch: 1052 Average bpd: 3.324
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1053 Average train loss: -6758.1475 Average bpd: 3.174
====> Epoch: 1054 Average train loss: -6759.5879 Average bpd: 3.174
====> [eval] Epoch: 1054 Average bpd: 3.311
====> [test] Epoch: 1054 Average bpd: 3.323
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1055 Average train loss: -6759.2125 Average bpd: 3.174
====> Epoch: 1056 Average train loss: -6758.3414 Average bpd: 3.174
====> [eval] Epoch: 1056 Average bpd: 3.311
====> [test] Epoch: 1056 Average bpd: 3.323
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1057 Average train loss: -6756.8023 Average bpd: 3.173
====> Epoch: 1058 Average train loss: -6757.3731 Average bpd: 3.173
====> [eval] Epoch: 1058 Average bpd: 3.311
====> [test] Epoch: 1058 Average bpd: 3.323
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1059 Average train loss: -6758.1303 Average bpd: 3.174
====> Epoch: 1060 Average train loss: -6759.1810 Average bpd: 3.174
====> [eval] Epoch: 1060 Average bpd: 3.314
====> [test] Epoch: 1060 Average bpd: 3.325
Best val_bpd: 3.310444053931692
Best test_bpd: 3.3222416080631394
====> Epoch: 1061 Average train loss: -6759.0632 Average bpd: 3.174
====> Epoch: 1062 Average train loss: -6758.3995 Average bpd: 3.174
====> [eval] Epoch: 1062 Average bpd: 3.310
====> [test] Epoch: 1062 Average bpd: 3.322
Best val_bpd: 3.310380851395997
Best test_bpd: 3.3220747580421905
====> Epoch: 1063 Average train loss: -6758.5648 Average bpd: 3.174
====> Epoch: 1064 Average train loss: -6757.0701 Average bpd: 3.173
====> [eval] Epoch: 1064 Average bpd: 3.309
====> [test] Epoch: 1064 Average bpd: 3.321
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1065 Average train loss: -6758.3108 Average bpd: 3.174
====> Epoch: 1066 Average train loss: -6756.2553 Average bpd: 3.173
====> [eval] Epoch: 1066 Average bpd: 3.312
====> [test] Epoch: 1066 Average bpd: 3.323
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1067 Average train loss: -6758.2659 Average bpd: 3.174
====> Epoch: 1068 Average train loss: -6757.1202 Average bpd: 3.173
====> [eval] Epoch: 1068 Average bpd: 3.312
====> [test] Epoch: 1068 Average bpd: 3.324
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1069 Average train loss: -6757.7984 Average bpd: 3.174
====> Epoch: 1070 Average train loss: -6756.6797 Average bpd: 3.173
====> [eval] Epoch: 1070 Average bpd: 3.311
====> [test] Epoch: 1070 Average bpd: 3.322
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1071 Average train loss: -6757.2635 Average bpd: 3.173
====> Epoch: 1072 Average train loss: -6755.6231 Average bpd: 3.173
====> [eval] Epoch: 1072 Average bpd: 3.313
====> [test] Epoch: 1072 Average bpd: 3.325
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1073 Average train loss: -6756.1167 Average bpd: 3.173
====> Epoch: 1074 Average train loss: -6754.9130 Average bpd: 3.172
====> [eval] Epoch: 1074 Average bpd: 3.310
====> [test] Epoch: 1074 Average bpd: 3.322
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1075 Average train loss: -6756.5584 Average bpd: 3.173
====> Epoch: 1076 Average train loss: -6756.7132 Average bpd: 3.173
====> [eval] Epoch: 1076 Average bpd: 3.312
====> [test] Epoch: 1076 Average bpd: 3.324
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1077 Average train loss: -6756.6485 Average bpd: 3.173
====> Epoch: 1078 Average train loss: -6754.2185 Average bpd: 3.172
====> [eval] Epoch: 1078 Average bpd: 3.312
====> [test] Epoch: 1078 Average bpd: 3.324
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1079 Average train loss: -6755.2966 Average bpd: 3.172
====> Epoch: 1080 Average train loss: -6755.6272 Average bpd: 3.173
====> [eval] Epoch: 1080 Average bpd: 3.314
====> [test] Epoch: 1080 Average bpd: 3.326
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1081 Average train loss: -6754.2151 Average bpd: 3.172
====> Epoch: 1082 Average train loss: -6754.6008 Average bpd: 3.172
====> [eval] Epoch: 1082 Average bpd: 3.309
====> [test] Epoch: 1082 Average bpd: 3.321
Best val_bpd: 3.308971798263144
Best test_bpd: 3.320874576024647
====> Epoch: 1083 Average train loss: -6755.5325 Average bpd: 3.173
====> Epoch: 1084 Average train loss: -6754.2285 Average bpd: 3.172
====> [eval] Epoch: 1084 Average bpd: 3.309
====> [test] Epoch: 1084 Average bpd: 3.320
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1085 Average train loss: -6754.8380 Average bpd: 3.172
====> Epoch: 1086 Average train loss: -6755.5487 Average bpd: 3.173
====> [eval] Epoch: 1086 Average bpd: 3.310
====> [test] Epoch: 1086 Average bpd: 3.322
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1087 Average train loss: -6753.3916 Average bpd: 3.172
====> Epoch: 1088 Average train loss: -6753.5783 Average bpd: 3.172
====> [eval] Epoch: 1088 Average bpd: 3.311
====> [test] Epoch: 1088 Average bpd: 3.323
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1089 Average train loss: -6753.9437 Average bpd: 3.172
====> Epoch: 1090 Average train loss: -6753.3428 Average bpd: 3.172
====> [eval] Epoch: 1090 Average bpd: 3.312
====> [test] Epoch: 1090 Average bpd: 3.324
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1091 Average train loss: -6754.2637 Average bpd: 3.172
====> Epoch: 1092 Average train loss: -6754.2471 Average bpd: 3.172
====> [eval] Epoch: 1092 Average bpd: 3.312
====> [test] Epoch: 1092 Average bpd: 3.323
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1093 Average train loss: -6753.0944 Average bpd: 3.171
====> Epoch: 1094 Average train loss: -6753.0989 Average bpd: 3.171
====> [eval] Epoch: 1094 Average bpd: 3.309
====> [test] Epoch: 1094 Average bpd: 3.321
Best val_bpd: 3.3086313364729723
Best test_bpd: 3.320397147106194
====> Epoch: 1095 Average train loss: -6751.9811 Average bpd: 3.171
====> Epoch: 1096 Average train loss: -6753.4423 Average bpd: 3.172
====> [eval] Epoch: 1096 Average bpd: 3.309
====> [test] Epoch: 1096 Average bpd: 3.321
Best val_bpd: 3.308602159652537
Best test_bpd: 3.320500286625814
====> Epoch: 1097 Average train loss: -6753.7227 Average bpd: 3.172
====> Epoch: 1098 Average train loss: -6752.7021 Average bpd: 3.171
====> [eval] Epoch: 1098 Average bpd: 3.308
====> [test] Epoch: 1098 Average bpd: 3.319
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1099 Average train loss: -6751.3425 Average bpd: 3.171
====> Epoch: 1100 Average train loss: -6753.5987 Average bpd: 3.172
====> [eval] Epoch: 1100 Average bpd: 3.310
====> [test] Epoch: 1100 Average bpd: 3.322
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1101 Average train loss: -6752.1799 Average bpd: 3.171
====> Epoch: 1102 Average train loss: -6752.1992 Average bpd: 3.171
====> [eval] Epoch: 1102 Average bpd: 3.310
====> [test] Epoch: 1102 Average bpd: 3.322
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1103 Average train loss: -6752.0494 Average bpd: 3.171
====> Epoch: 1104 Average train loss: -6752.7120 Average bpd: 3.171
====> [eval] Epoch: 1104 Average bpd: 3.308
====> [test] Epoch: 1104 Average bpd: 3.320
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1105 Average train loss: -6752.4032 Average bpd: 3.171
====> Epoch: 1106 Average train loss: -6752.8698 Average bpd: 3.171
====> [eval] Epoch: 1106 Average bpd: 3.310
====> [test] Epoch: 1106 Average bpd: 3.322
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1107 Average train loss: -6750.9360 Average bpd: 3.170
====> Epoch: 1108 Average train loss: -6751.3492 Average bpd: 3.171
====> [eval] Epoch: 1108 Average bpd: 3.309
====> [test] Epoch: 1108 Average bpd: 3.321
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1109 Average train loss: -6750.3150 Average bpd: 3.170
====> Epoch: 1110 Average train loss: -6750.9325 Average bpd: 3.170
====> [eval] Epoch: 1110 Average bpd: 3.309
====> [test] Epoch: 1110 Average bpd: 3.321
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1111 Average train loss: -6750.7892 Average bpd: 3.170
====> Epoch: 1112 Average train loss: -6750.9479 Average bpd: 3.170
====> [eval] Epoch: 1112 Average bpd: 3.308
====> [test] Epoch: 1112 Average bpd: 3.320
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1113 Average train loss: -6751.0230 Average bpd: 3.170
====> Epoch: 1114 Average train loss: -6750.8937 Average bpd: 3.170
====> [eval] Epoch: 1114 Average bpd: 3.309
====> [test] Epoch: 1114 Average bpd: 3.321
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1115 Average train loss: -6749.9280 Average bpd: 3.170
====> Epoch: 1116 Average train loss: -6750.2052 Average bpd: 3.170
====> [eval] Epoch: 1116 Average bpd: 3.309
====> [test] Epoch: 1116 Average bpd: 3.321
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1117 Average train loss: -6749.5894 Average bpd: 3.170
====> Epoch: 1118 Average train loss: -6750.1907 Average bpd: 3.170
====> [eval] Epoch: 1118 Average bpd: 3.308
====> [test] Epoch: 1118 Average bpd: 3.320
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1119 Average train loss: -6748.5643 Average bpd: 3.169
====> Epoch: 1120 Average train loss: -6748.2662 Average bpd: 3.169
====> [eval] Epoch: 1120 Average bpd: 3.310
====> [test] Epoch: 1120 Average bpd: 3.322
Best val_bpd: 3.307509627889455
Best test_bpd: 3.319338365305598
====> Epoch: 1121 Average train loss: -6749.0183 Average bpd: 3.170
====> Epoch: 1122 Average train loss: -6748.4195 Average bpd: 3.169
====> [eval] Epoch: 1122 Average bpd: 3.307
====> [test] Epoch: 1122 Average bpd: 3.319
Best val_bpd: 3.3073141968191946
Best test_bpd: 3.3192634512593457
====> Epoch: 1123 Average train loss: -6748.7277 Average bpd: 3.169
====> Epoch: 1124 Average train loss: -6748.3396 Average bpd: 3.169
====> [eval] Epoch: 1124 Average bpd: 3.309
====> [test] Epoch: 1124 Average bpd: 3.321
Best val_bpd: 3.3073141968191946
Best test_bpd: 3.3192634512593457
====> Epoch: 1125 Average train loss: -6747.1751 Average bpd: 3.169
====> Epoch: 1126 Average train loss: -6747.1341 Average bpd: 3.169
====> [eval] Epoch: 1126 Average bpd: 3.307
====> [test] Epoch: 1126 Average bpd: 3.319
Best val_bpd: 3.3070782700028105
Best test_bpd: 3.31883915772336
====> Epoch: 1127 Average train loss: -6747.1974 Average bpd: 3.169
====> Epoch: 1128 Average train loss: -6746.7169 Average bpd: 3.168
====> [eval] Epoch: 1128 Average bpd: 3.309
====> [test] Epoch: 1128 Average bpd: 3.320
Best val_bpd: 3.3070782700028105
Best test_bpd: 3.31883915772336
====> Epoch: 1129 Average train loss: -6748.4769 Average bpd: 3.169
====> Epoch: 1130 Average train loss: -6747.7696 Average bpd: 3.169
====> [eval] Epoch: 1130 Average bpd: 3.307
====> [test] Epoch: 1130 Average bpd: 3.319
Best val_bpd: 3.3070782700028105
Best test_bpd: 3.31883915772336
====> Epoch: 1131 Average train loss: -6747.4146 Average bpd: 3.169
====> Epoch: 1132 Average train loss: -6746.5691 Average bpd: 3.168
====> [eval] Epoch: 1132 Average bpd: 3.307
====> [test] Epoch: 1132 Average bpd: 3.319
Best val_bpd: 3.3070782700028105
Best test_bpd: 3.31883915772336
====> Epoch: 1133 Average train loss: -6747.5874 Average bpd: 3.169
====> Epoch: 1134 Average train loss: -6748.1073 Average bpd: 3.169
====> [eval] Epoch: 1134 Average bpd: 3.307
====> [test] Epoch: 1134 Average bpd: 3.319
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1135 Average train loss: -6746.9580 Average bpd: 3.169
====> Epoch: 1136 Average train loss: -6747.1509 Average bpd: 3.169
====> [eval] Epoch: 1136 Average bpd: 3.309
====> [test] Epoch: 1136 Average bpd: 3.321
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1137 Average train loss: -6746.4849 Average bpd: 3.168
====> Epoch: 1138 Average train loss: -6747.6804 Average bpd: 3.169
====> [eval] Epoch: 1138 Average bpd: 3.307
====> [test] Epoch: 1138 Average bpd: 3.318
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1139 Average train loss: -6748.0226 Average bpd: 3.169
====> Epoch: 1140 Average train loss: -6746.7054 Average bpd: 3.168
====> [eval] Epoch: 1140 Average bpd: 3.311
====> [test] Epoch: 1140 Average bpd: 3.323
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1141 Average train loss: -6746.5192 Average bpd: 3.168
====> Epoch: 1142 Average train loss: -6747.3224 Average bpd: 3.169
====> [eval] Epoch: 1142 Average bpd: 3.310
====> [test] Epoch: 1142 Average bpd: 3.321
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1143 Average train loss: -6745.3520 Average bpd: 3.168
====> Epoch: 1144 Average train loss: -6746.9042 Average bpd: 3.169
====> [eval] Epoch: 1144 Average bpd: 3.307
====> [test] Epoch: 1144 Average bpd: 3.319
Best val_bpd: 3.306914388678167
Best test_bpd: 3.318797908011099
====> Epoch: 1145 Average train loss: -6747.2517 Average bpd: 3.169
====> Epoch: 1146 Average train loss: -6745.7299 Average bpd: 3.168
====> [eval] Epoch: 1146 Average bpd: 3.307
====> [test] Epoch: 1146 Average bpd: 3.318
Best val_bpd: 3.306707662629169
Best test_bpd: 3.3184701482644745
====> Epoch: 1147 Average train loss: -6747.0049 Average bpd: 3.169
====> Epoch: 1148 Average train loss: -6744.4963 Average bpd: 3.167
====> [eval] Epoch: 1148 Average bpd: 3.307
====> [test] Epoch: 1148 Average bpd: 3.319
Best val_bpd: 3.306707662629169
Best test_bpd: 3.3184701482644745
====> Epoch: 1149 Average train loss: -6745.2477 Average bpd: 3.168
====> Epoch: 1150 Average train loss: -6745.1744 Average bpd: 3.168
====> [eval] Epoch: 1150 Average bpd: 3.307
====> [test] Epoch: 1150 Average bpd: 3.318
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1151 Average train loss: -6746.0344 Average bpd: 3.168
====> Epoch: 1152 Average train loss: -6744.8117 Average bpd: 3.168
====> [eval] Epoch: 1152 Average bpd: 3.308
====> [test] Epoch: 1152 Average bpd: 3.320
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1153 Average train loss: -6744.9612 Average bpd: 3.168
====> Epoch: 1154 Average train loss: -6744.7371 Average bpd: 3.168
====> [eval] Epoch: 1154 Average bpd: 3.308
====> [test] Epoch: 1154 Average bpd: 3.320
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1155 Average train loss: -6745.0170 Average bpd: 3.168
====> Epoch: 1156 Average train loss: -6745.0564 Average bpd: 3.168
====> [eval] Epoch: 1156 Average bpd: 3.308
====> [test] Epoch: 1156 Average bpd: 3.320
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1157 Average train loss: -6743.6504 Average bpd: 3.167
====> Epoch: 1158 Average train loss: -6744.7276 Average bpd: 3.168
====> [eval] Epoch: 1158 Average bpd: 3.309
====> [test] Epoch: 1158 Average bpd: 3.320
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1159 Average train loss: -6744.9734 Average bpd: 3.168
====> Epoch: 1160 Average train loss: -6744.1288 Average bpd: 3.167
====> [eval] Epoch: 1160 Average bpd: 3.310
====> [test] Epoch: 1160 Average bpd: 3.322
Best val_bpd: 3.3066246697523174
Best test_bpd: 3.3183136636451493
====> Epoch: 1161 Average train loss: -6743.6145 Average bpd: 3.167
====> Epoch: 1162 Average train loss: -6742.5703 Average bpd: 3.166
====> [eval] Epoch: 1162 Average bpd: 3.306
====> [test] Epoch: 1162 Average bpd: 3.317
Best val_bpd: 3.306075510280818
Best test_bpd: 3.3174288129064604
====> Epoch: 1163 Average train loss: -6743.3321 Average bpd: 3.167
====> Epoch: 1164 Average train loss: -6744.4474 Average bpd: 3.167
====> [eval] Epoch: 1164 Average bpd: 3.305
====> [test] Epoch: 1164 Average bpd: 3.317
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1165 Average train loss: -6743.2736 Average bpd: 3.167
====> Epoch: 1166 Average train loss: -6743.3259 Average bpd: 3.167
====> [eval] Epoch: 1166 Average bpd: 3.307
====> [test] Epoch: 1166 Average bpd: 3.319
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1167 Average train loss: -6742.6567 Average bpd: 3.167
====> Epoch: 1168 Average train loss: -6741.8731 Average bpd: 3.166
====> [eval] Epoch: 1168 Average bpd: 3.307
====> [test] Epoch: 1168 Average bpd: 3.319
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1169 Average train loss: -6743.0693 Average bpd: 3.167
====> Epoch: 1170 Average train loss: -6742.2560 Average bpd: 3.166
====> [eval] Epoch: 1170 Average bpd: 3.309
====> [test] Epoch: 1170 Average bpd: 3.320
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1171 Average train loss: -6742.5640 Average bpd: 3.166
====> Epoch: 1172 Average train loss: -6741.9978 Average bpd: 3.166
====> [eval] Epoch: 1172 Average bpd: 3.307
====> [test] Epoch: 1172 Average bpd: 3.318
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1173 Average train loss: -6741.8410 Average bpd: 3.166
====> Epoch: 1174 Average train loss: -6742.2005 Average bpd: 3.166
====> [eval] Epoch: 1174 Average bpd: 3.307
====> [test] Epoch: 1174 Average bpd: 3.319
Best val_bpd: 3.3047408008213734
Best test_bpd: 3.3166225480799025
====> Epoch: 1175 Average train loss: -6743.1784 Average bpd: 3.167
====> Epoch: 1176 Average train loss: -6740.9738 Average bpd: 3.166
====> [eval] Epoch: 1176 Average bpd: 3.305
====> [test] Epoch: 1176 Average bpd: 3.316
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1177 Average train loss: -6740.9121 Average bpd: 3.166
====> Epoch: 1178 Average train loss: -6741.4221 Average bpd: 3.166
====> [eval] Epoch: 1178 Average bpd: 3.306
====> [test] Epoch: 1178 Average bpd: 3.318
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1179 Average train loss: -6741.0612 Average bpd: 3.166
====> Epoch: 1180 Average train loss: -6741.6237 Average bpd: 3.166
====> [eval] Epoch: 1180 Average bpd: 3.307
====> [test] Epoch: 1180 Average bpd: 3.318
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1181 Average train loss: -6742.6831 Average bpd: 3.167
====> Epoch: 1182 Average train loss: -6741.4460 Average bpd: 3.166
====> [eval] Epoch: 1182 Average bpd: 3.307
====> [test] Epoch: 1182 Average bpd: 3.318
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1183 Average train loss: -6739.5734 Average bpd: 3.165
====> Epoch: 1184 Average train loss: -6737.6570 Average bpd: 3.164
====> [eval] Epoch: 1184 Average bpd: 3.308
====> [test] Epoch: 1184 Average bpd: 3.320
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1185 Average train loss: -6739.1351 Average bpd: 3.165
====> Epoch: 1186 Average train loss: -6739.4077 Average bpd: 3.165
====> [eval] Epoch: 1186 Average bpd: 3.309
====> [test] Epoch: 1186 Average bpd: 3.321
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1187 Average train loss: -6739.4296 Average bpd: 3.165
====> Epoch: 1188 Average train loss: -6742.5348 Average bpd: 3.166
====> [eval] Epoch: 1188 Average bpd: 3.310
====> [test] Epoch: 1188 Average bpd: 3.321
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1189 Average train loss: -6739.8500 Average bpd: 3.165
====> Epoch: 1190 Average train loss: -6740.3722 Average bpd: 3.165
====> [eval] Epoch: 1190 Average bpd: 3.305
====> [test] Epoch: 1190 Average bpd: 3.317
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1191 Average train loss: -6740.1427 Average bpd: 3.165
====> Epoch: 1192 Average train loss: -6738.9546 Average bpd: 3.165
====> [eval] Epoch: 1192 Average bpd: 3.305
====> [test] Epoch: 1192 Average bpd: 3.317
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1193 Average train loss: -6739.8465 Average bpd: 3.165
====> Epoch: 1194 Average train loss: -6739.3196 Average bpd: 3.165
====> [eval] Epoch: 1194 Average bpd: 3.305
====> [test] Epoch: 1194 Average bpd: 3.317
Best val_bpd: 3.304548292730512
Best test_bpd: 3.316194510837183
====> Epoch: 1195 Average train loss: -6739.7650 Average bpd: 3.165
====> Epoch: 1196 Average train loss: -6739.2207 Average bpd: 3.165
====> [eval] Epoch: 1196 Average bpd: 3.304
====> [test] Epoch: 1196 Average bpd: 3.316
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1197 Average train loss: -6740.2650 Average bpd: 3.165
====> Epoch: 1198 Average train loss: -6738.7640 Average bpd: 3.165
====> [eval] Epoch: 1198 Average bpd: 3.305
====> [test] Epoch: 1198 Average bpd: 3.316
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1199 Average train loss: -6737.4231 Average bpd: 3.164
====> Epoch: 1200 Average train loss: -6739.3786 Average bpd: 3.165
====> [eval] Epoch: 1200 Average bpd: 3.308
====> [test] Epoch: 1200 Average bpd: 3.319
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1201 Average train loss: -6739.1495 Average bpd: 3.165
====> Epoch: 1202 Average train loss: -6738.3365 Average bpd: 3.165
====> [eval] Epoch: 1202 Average bpd: 3.307
====> [test] Epoch: 1202 Average bpd: 3.319
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1203 Average train loss: -6739.7485 Average bpd: 3.165
====> Epoch: 1204 Average train loss: -6738.0373 Average bpd: 3.164
====> [eval] Epoch: 1204 Average bpd: 3.305
====> [test] Epoch: 1204 Average bpd: 3.317
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1205 Average train loss: -6737.0918 Average bpd: 3.164
====> Epoch: 1206 Average train loss: -6737.6169 Average bpd: 3.164
====> [eval] Epoch: 1206 Average bpd: 3.307
====> [test] Epoch: 1206 Average bpd: 3.318
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1207 Average train loss: -6735.7123 Average bpd: 3.163
====> Epoch: 1208 Average train loss: -6735.8721 Average bpd: 3.163
====> [eval] Epoch: 1208 Average bpd: 3.306
====> [test] Epoch: 1208 Average bpd: 3.317
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1209 Average train loss: -6736.4516 Average bpd: 3.164
====> Epoch: 1210 Average train loss: -6737.3628 Average bpd: 3.164
====> [eval] Epoch: 1210 Average bpd: 3.306
====> [test] Epoch: 1210 Average bpd: 3.318
Best val_bpd: 3.304350916151868
Best test_bpd: 3.3159772060409933
====> Epoch: 1211 Average train loss: -6735.9993 Average bpd: 3.163
====> Epoch: 1212 Average train loss: -6737.5309 Average bpd: 3.164
====> [eval] Epoch: 1212 Average bpd: 3.304
====> [test] Epoch: 1212 Average bpd: 3.316
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1213 Average train loss: -6736.3329 Average bpd: 3.164
====> Epoch: 1214 Average train loss: -6736.8491 Average bpd: 3.164
====> [eval] Epoch: 1214 Average bpd: 3.304
====> [test] Epoch: 1214 Average bpd: 3.316
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1215 Average train loss: -6735.3764 Average bpd: 3.163
====> Epoch: 1216 Average train loss: -6736.1400 Average bpd: 3.163
====> [eval] Epoch: 1216 Average bpd: 3.306
====> [test] Epoch: 1216 Average bpd: 3.318
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1217 Average train loss: -6735.3985 Average bpd: 3.163
====> Epoch: 1218 Average train loss: -6735.8935 Average bpd: 3.163
====> [eval] Epoch: 1218 Average bpd: 3.306
====> [test] Epoch: 1218 Average bpd: 3.318
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1219 Average train loss: -6737.7700 Average bpd: 3.164
====> Epoch: 1220 Average train loss: -6736.2866 Average bpd: 3.164
====> [eval] Epoch: 1220 Average bpd: 3.306
====> [test] Epoch: 1220 Average bpd: 3.318
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1221 Average train loss: -6737.6172 Average bpd: 3.164
====> Epoch: 1222 Average train loss: -6735.4632 Average bpd: 3.163
====> [eval] Epoch: 1222 Average bpd: 3.307
====> [test] Epoch: 1222 Average bpd: 3.319
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1223 Average train loss: -6735.3206 Average bpd: 3.163
====> Epoch: 1224 Average train loss: -6736.6137 Average bpd: 3.164
====> [eval] Epoch: 1224 Average bpd: 3.305
====> [test] Epoch: 1224 Average bpd: 3.316
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1225 Average train loss: -6735.3649 Average bpd: 3.163
====> Epoch: 1226 Average train loss: -6733.8187 Average bpd: 3.162
====> [eval] Epoch: 1226 Average bpd: 3.305
====> [test] Epoch: 1226 Average bpd: 3.317
Best val_bpd: 3.3043241688585008
Best test_bpd: 3.3160148455689167
====> Epoch: 1227 Average train loss: -6736.0277 Average bpd: 3.163
====> Epoch: 1228 Average train loss: -6734.3855 Average bpd: 3.163
====> [eval] Epoch: 1228 Average bpd: 3.303
====> [test] Epoch: 1228 Average bpd: 3.315
Best val_bpd: 3.3033867030392905
Best test_bpd: 3.315081480483715
====> Epoch: 1229 Average train loss: -6735.8286 Average bpd: 3.163
====> Epoch: 1230 Average train loss: -6735.1591 Average bpd: 3.163
====> [eval] Epoch: 1230 Average bpd: 3.304
====> [test] Epoch: 1230 Average bpd: 3.316
Best val_bpd: 3.3033867030392905
Best test_bpd: 3.315081480483715
====> Epoch: 1231 Average train loss: -6734.7024 Average bpd: 3.163
====> Epoch: 1232 Average train loss: -6734.7045 Average bpd: 3.163
====> [eval] Epoch: 1232 Average bpd: 3.306
====> [test] Epoch: 1232 Average bpd: 3.317
Best val_bpd: 3.3033867030392905
Best test_bpd: 3.315081480483715
====> Epoch: 1233 Average train loss: -6734.5845 Average bpd: 3.163
====> Epoch: 1234 Average train loss: -6734.2563 Average bpd: 3.163
====> [eval] Epoch: 1234 Average bpd: 3.306
====> [test] Epoch: 1234 Average bpd: 3.318
Best val_bpd: 3.3033867030392905
Best test_bpd: 3.315081480483715
====> Epoch: 1235 Average train loss: -6734.2966 Average bpd: 3.163
====> Epoch: 1236 Average train loss: -6733.4015 Average bpd: 3.162
====> [eval] Epoch: 1236 Average bpd: 3.303
====> [test] Epoch: 1236 Average bpd: 3.315
Best val_bpd: 3.3032431940392906
Best test_bpd: 3.314856477105488
====> Epoch: 1237 Average train loss: -6733.9013 Average bpd: 3.162
====> Epoch: 1238 Average train loss: -6734.2275 Average bpd: 3.163
====> [eval] Epoch: 1238 Average bpd: 3.305
====> [test] Epoch: 1238 Average bpd: 3.316
Best val_bpd: 3.3032431940392906
Best test_bpd: 3.314856477105488
====> Epoch: 1239 Average train loss: -6733.2894 Average bpd: 3.162
====> Epoch: 1240 Average train loss: -6732.3983 Average bpd: 3.162
====> [eval] Epoch: 1240 Average bpd: 3.303
====> [test] Epoch: 1240 Average bpd: 3.315
Best val_bpd: 3.303193774062617
Best test_bpd: 3.3149145680555594
====> Epoch: 1241 Average train loss: -6732.9289 Average bpd: 3.162
====> Epoch: 1242 Average train loss: -6733.3880 Average bpd: 3.162
====> [eval] Epoch: 1242 Average bpd: 3.303
====> [test] Epoch: 1242 Average bpd: 3.314
Best val_bpd: 3.302932932257478
Best test_bpd: 3.3144949753786954
====> Epoch: 1243 Average train loss: -6732.2546 Average bpd: 3.162
====> Epoch: 1244 Average train loss: -6732.7208 Average bpd: 3.162
====> [eval] Epoch: 1244 Average bpd: 3.304
====> [test] Epoch: 1244 Average bpd: 3.316
Best val_bpd: 3.302932932257478
Best test_bpd: 3.3144949753786954
====> Epoch: 1245 Average train loss: -6731.4347 Average bpd: 3.161
====> Epoch: 1246 Average train loss: -6732.5908 Average bpd: 3.162
====> [eval] Epoch: 1246 Average bpd: 3.305
====> [test] Epoch: 1246 Average bpd: 3.316
Best val_bpd: 3.302932932257478
Best test_bpd: 3.3144949753786954
====> Epoch: 1247 Average train loss: -6731.7778 Average bpd: 3.161
====> Epoch: 1248 Average train loss: -6732.6493 Average bpd: 3.162
====> [eval] Epoch: 1248 Average bpd: 3.306
====> [test] Epoch: 1248 Average bpd: 3.317
Best val_bpd: 3.302932932257478
Best test_bpd: 3.3144949753786954
====> Epoch: 1249 Average train loss: -6732.6899 Average bpd: 3.162
====> Epoch: 1250 Average train loss: -6732.9212 Average bpd: 3.162
====> [eval] Epoch: 1250 Average bpd: 3.304
====> [test] Epoch: 1250 Average bpd: 3.316
Best val_bpd: 3.302932932257478
Best test_bpd: 3.3144949753786954
====> Epoch: 1251 Average train loss: -6734.0912 Average bpd: 3.163
====> Epoch: 1252 Average train loss: -6731.2470 Average bpd: 3.161
====> [eval] Epoch: 1252 Average bpd: 3.302
====> [test] Epoch: 1252 Average bpd: 3.314
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1253 Average train loss: -6732.5678 Average bpd: 3.162
====> Epoch: 1254 Average train loss: -6732.6052 Average bpd: 3.162
====> [eval] Epoch: 1254 Average bpd: 3.304
====> [test] Epoch: 1254 Average bpd: 3.316
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1255 Average train loss: -6731.5909 Average bpd: 3.161
====> Epoch: 1256 Average train loss: -6732.2682 Average bpd: 3.162
====> [eval] Epoch: 1256 Average bpd: 3.304
====> [test] Epoch: 1256 Average bpd: 3.315
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1257 Average train loss: -6731.6838 Average bpd: 3.161
====> Epoch: 1258 Average train loss: -6732.5700 Average bpd: 3.162
====> [eval] Epoch: 1258 Average bpd: 3.305
====> [test] Epoch: 1258 Average bpd: 3.316
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1259 Average train loss: -6731.4555 Average bpd: 3.161
====> Epoch: 1260 Average train loss: -6730.7852 Average bpd: 3.161
====> [eval] Epoch: 1260 Average bpd: 3.303
====> [test] Epoch: 1260 Average bpd: 3.314
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1261 Average train loss: -6730.6823 Average bpd: 3.161
====> Epoch: 1262 Average train loss: -6731.3145 Average bpd: 3.161
====> [eval] Epoch: 1262 Average bpd: 3.306
====> [test] Epoch: 1262 Average bpd: 3.317
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1263 Average train loss: -6731.1975 Average bpd: 3.161
====> Epoch: 1264 Average train loss: -6729.6341 Average bpd: 3.160
====> [eval] Epoch: 1264 Average bpd: 3.306
====> [test] Epoch: 1264 Average bpd: 3.317
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1265 Average train loss: -6730.3467 Average bpd: 3.161
====> Epoch: 1266 Average train loss: -6731.3275 Average bpd: 3.161
====> [eval] Epoch: 1266 Average bpd: 3.303
====> [test] Epoch: 1266 Average bpd: 3.314
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1267 Average train loss: -6731.3560 Average bpd: 3.161
====> Epoch: 1268 Average train loss: -6731.2165 Average bpd: 3.161
====> [eval] Epoch: 1268 Average bpd: 3.305
====> [test] Epoch: 1268 Average bpd: 3.316
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1269 Average train loss: -6732.5467 Average bpd: 3.162
====> Epoch: 1270 Average train loss: -6730.5098 Average bpd: 3.161
====> [eval] Epoch: 1270 Average bpd: 3.304
====> [test] Epoch: 1270 Average bpd: 3.315
Best val_bpd: 3.302342992579097
Best test_bpd: 3.3139264595138243
====> Epoch: 1271 Average train loss: -6729.6679 Average bpd: 3.160
====> Epoch: 1272 Average train loss: -6730.6333 Average bpd: 3.161
====> [eval] Epoch: 1272 Average bpd: 3.302
====> [test] Epoch: 1272 Average bpd: 3.314
Best val_bpd: 3.3017333597986966
Best test_bpd: 3.313516296130485
====> Epoch: 1273 Average train loss: -6730.3019 Average bpd: 3.161
====> Epoch: 1274 Average train loss: -6730.4500 Average bpd: 3.161
====> [eval] Epoch: 1274 Average bpd: 3.303
====> [test] Epoch: 1274 Average bpd: 3.315
Best val_bpd: 3.3017333597986966
Best test_bpd: 3.313516296130485
====> Epoch: 1275 Average train loss: -6729.7260 Average bpd: 3.160
====> Epoch: 1276 Average train loss: -6730.9246 Average bpd: 3.161
====> [eval] Epoch: 1276 Average bpd: 3.301
====> [test] Epoch: 1276 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1277 Average train loss: -6729.2541 Average bpd: 3.160
====> Epoch: 1278 Average train loss: -6730.4978 Average bpd: 3.161
====> [eval] Epoch: 1278 Average bpd: 3.303
====> [test] Epoch: 1278 Average bpd: 3.315
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1279 Average train loss: -6730.7841 Average bpd: 3.161
====> Epoch: 1280 Average train loss: -6727.9616 Average bpd: 3.160
====> [eval] Epoch: 1280 Average bpd: 3.301
====> [test] Epoch: 1280 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1281 Average train loss: -6729.6131 Average bpd: 3.160
====> Epoch: 1282 Average train loss: -6728.4811 Average bpd: 3.160
====> [eval] Epoch: 1282 Average bpd: 3.304
====> [test] Epoch: 1282 Average bpd: 3.316
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1283 Average train loss: -6729.2826 Average bpd: 3.160
====> Epoch: 1284 Average train loss: -6728.9429 Average bpd: 3.160
====> [eval] Epoch: 1284 Average bpd: 3.303
====> [test] Epoch: 1284 Average bpd: 3.315
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1285 Average train loss: -6729.4041 Average bpd: 3.160
====> Epoch: 1286 Average train loss: -6728.3604 Average bpd: 3.160
====> [eval] Epoch: 1286 Average bpd: 3.304
====> [test] Epoch: 1286 Average bpd: 3.315
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1287 Average train loss: -6728.8962 Average bpd: 3.160
====> Epoch: 1288 Average train loss: -6728.1293 Average bpd: 3.160
====> [eval] Epoch: 1288 Average bpd: 3.302
====> [test] Epoch: 1288 Average bpd: 3.314
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1289 Average train loss: -6727.4467 Average bpd: 3.159
====> Epoch: 1290 Average train loss: -6728.1414 Average bpd: 3.160
====> [eval] Epoch: 1290 Average bpd: 3.303
====> [test] Epoch: 1290 Average bpd: 3.314
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1291 Average train loss: -6727.9906 Average bpd: 3.160
====> Epoch: 1292 Average train loss: -6727.1398 Average bpd: 3.159
====> [eval] Epoch: 1292 Average bpd: 3.301
====> [test] Epoch: 1292 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1293 Average train loss: -6727.2801 Average bpd: 3.159
====> Epoch: 1294 Average train loss: -6727.3890 Average bpd: 3.159
====> [eval] Epoch: 1294 Average bpd: 3.303
====> [test] Epoch: 1294 Average bpd: 3.315
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1295 Average train loss: -6726.7123 Average bpd: 3.159
====> Epoch: 1296 Average train loss: -6727.7435 Average bpd: 3.160
====> [eval] Epoch: 1296 Average bpd: 3.302
====> [test] Epoch: 1296 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1297 Average train loss: -6727.3652 Average bpd: 3.159
====> Epoch: 1298 Average train loss: -6727.4463 Average bpd: 3.159
====> [eval] Epoch: 1298 Average bpd: 3.303
====> [test] Epoch: 1298 Average bpd: 3.314
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1299 Average train loss: -6726.7132 Average bpd: 3.159
====> Epoch: 1300 Average train loss: -6726.6580 Average bpd: 3.159
====> [eval] Epoch: 1300 Average bpd: 3.302
====> [test] Epoch: 1300 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1301 Average train loss: -6725.8098 Average bpd: 3.159
====> Epoch: 1302 Average train loss: -6727.1628 Average bpd: 3.159
====> [eval] Epoch: 1302 Average bpd: 3.302
====> [test] Epoch: 1302 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1303 Average train loss: -6726.1431 Average bpd: 3.159
====> Epoch: 1304 Average train loss: -6726.6783 Average bpd: 3.159
====> [eval] Epoch: 1304 Average bpd: 3.302
====> [test] Epoch: 1304 Average bpd: 3.314
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1305 Average train loss: -6725.5437 Average bpd: 3.158
====> Epoch: 1306 Average train loss: -6726.0564 Average bpd: 3.159
====> [eval] Epoch: 1306 Average bpd: 3.301
====> [test] Epoch: 1306 Average bpd: 3.313
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1307 Average train loss: -6725.8851 Average bpd: 3.159
====> Epoch: 1308 Average train loss: -6726.4638 Average bpd: 3.159
====> [eval] Epoch: 1308 Average bpd: 3.303
====> [test] Epoch: 1308 Average bpd: 3.315
Best val_bpd: 3.3011772012864045
Best test_bpd: 3.3129371848281224
====> Epoch: 1309 Average train loss: -6725.5516 Average bpd: 3.159
====> Epoch: 1310 Average train loss: -6726.1183 Average bpd: 3.159
====> [eval] Epoch: 1310 Average bpd: 3.301
====> [test] Epoch: 1310 Average bpd: 3.312
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1311 Average train loss: -6725.7339 Average bpd: 3.159
====> Epoch: 1312 Average train loss: -6724.6910 Average bpd: 3.158
====> [eval] Epoch: 1312 Average bpd: 3.302
====> [test] Epoch: 1312 Average bpd: 3.314
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1313 Average train loss: -6724.5713 Average bpd: 3.158
====> Epoch: 1314 Average train loss: -6726.2427 Average bpd: 3.159
====> [eval] Epoch: 1314 Average bpd: 3.304
====> [test] Epoch: 1314 Average bpd: 3.315
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1315 Average train loss: -6724.2366 Average bpd: 3.158
====> Epoch: 1316 Average train loss: -6725.1542 Average bpd: 3.158
====> [eval] Epoch: 1316 Average bpd: 3.302
====> [test] Epoch: 1316 Average bpd: 3.314
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1317 Average train loss: -6724.1250 Average bpd: 3.158
====> Epoch: 1318 Average train loss: -6724.3778 Average bpd: 3.158
====> [eval] Epoch: 1318 Average bpd: 3.305
====> [test] Epoch: 1318 Average bpd: 3.316
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1319 Average train loss: -6724.6967 Average bpd: 3.158
====> Epoch: 1320 Average train loss: -6724.4102 Average bpd: 3.158
====> [eval] Epoch: 1320 Average bpd: 3.301
====> [test] Epoch: 1320 Average bpd: 3.313
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1321 Average train loss: -6724.2661 Average bpd: 3.158
====> Epoch: 1322 Average train loss: -6725.4436 Average bpd: 3.158
====> [eval] Epoch: 1322 Average bpd: 3.301
====> [test] Epoch: 1322 Average bpd: 3.313
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1323 Average train loss: -6723.4091 Average bpd: 3.157
====> Epoch: 1324 Average train loss: -6724.5684 Average bpd: 3.158
====> [eval] Epoch: 1324 Average bpd: 3.301
====> [test] Epoch: 1324 Average bpd: 3.313
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1325 Average train loss: -6724.6116 Average bpd: 3.158
====> Epoch: 1326 Average train loss: -6723.2761 Average bpd: 3.157
====> [eval] Epoch: 1326 Average bpd: 3.304
====> [test] Epoch: 1326 Average bpd: 3.315
Best val_bpd: 3.3006423047642834
Best test_bpd: 3.3122994093001865
====> Epoch: 1327 Average train loss: -6723.7082 Average bpd: 3.158
====> Epoch: 1328 Average train loss: -6724.3140 Average bpd: 3.158
====> [eval] Epoch: 1328 Average bpd: 3.300
====> [test] Epoch: 1328 Average bpd: 3.311
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1329 Average train loss: -6724.3136 Average bpd: 3.158
====> Epoch: 1330 Average train loss: -6724.0324 Average bpd: 3.158
====> [eval] Epoch: 1330 Average bpd: 3.303
====> [test] Epoch: 1330 Average bpd: 3.314
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1331 Average train loss: -6724.0633 Average bpd: 3.158
====> Epoch: 1332 Average train loss: -6725.2040 Average bpd: 3.158
====> [eval] Epoch: 1332 Average bpd: 3.302
====> [test] Epoch: 1332 Average bpd: 3.314
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1333 Average train loss: -6723.9322 Average bpd: 3.158
====> Epoch: 1334 Average train loss: -6723.0937 Average bpd: 3.157
====> [eval] Epoch: 1334 Average bpd: 3.302
====> [test] Epoch: 1334 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1335 Average train loss: -6723.8895 Average bpd: 3.158
====> Epoch: 1336 Average train loss: -6722.4342 Average bpd: 3.157
====> [eval] Epoch: 1336 Average bpd: 3.303
====> [test] Epoch: 1336 Average bpd: 3.315
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1337 Average train loss: -6722.5589 Average bpd: 3.157
====> Epoch: 1338 Average train loss: -6722.2781 Average bpd: 3.157
====> [eval] Epoch: 1338 Average bpd: 3.301
====> [test] Epoch: 1338 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1339 Average train loss: -6721.9009 Average bpd: 3.157
====> Epoch: 1340 Average train loss: -6723.1512 Average bpd: 3.157
====> [eval] Epoch: 1340 Average bpd: 3.301
====> [test] Epoch: 1340 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1341 Average train loss: -6722.1672 Average bpd: 3.157
====> Epoch: 1342 Average train loss: -6722.9255 Average bpd: 3.157
====> [eval] Epoch: 1342 Average bpd: 3.303
====> [test] Epoch: 1342 Average bpd: 3.314
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1343 Average train loss: -6721.6189 Average bpd: 3.157
====> Epoch: 1344 Average train loss: -6722.3233 Average bpd: 3.157
====> [eval] Epoch: 1344 Average bpd: 3.300
====> [test] Epoch: 1344 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1345 Average train loss: -6721.6124 Average bpd: 3.157
====> Epoch: 1346 Average train loss: -6722.3586 Average bpd: 3.157
====> [eval] Epoch: 1346 Average bpd: 3.301
====> [test] Epoch: 1346 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1347 Average train loss: -6721.0896 Average bpd: 3.156
====> Epoch: 1348 Average train loss: -6722.1479 Average bpd: 3.157
====> [eval] Epoch: 1348 Average bpd: 3.301
====> [test] Epoch: 1348 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1349 Average train loss: -6721.8268 Average bpd: 3.157
====> Epoch: 1350 Average train loss: -6721.6253 Average bpd: 3.157
====> [eval] Epoch: 1350 Average bpd: 3.302
====> [test] Epoch: 1350 Average bpd: 3.314
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1351 Average train loss: -6721.1770 Average bpd: 3.156
====> Epoch: 1352 Average train loss: -6721.4527 Average bpd: 3.157
====> [eval] Epoch: 1352 Average bpd: 3.301
====> [test] Epoch: 1352 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1353 Average train loss: -6721.8358 Average bpd: 3.157
====> Epoch: 1354 Average train loss: -6722.7031 Average bpd: 3.157
====> [eval] Epoch: 1354 Average bpd: 3.300
====> [test] Epoch: 1354 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1355 Average train loss: -6722.4719 Average bpd: 3.157
====> Epoch: 1356 Average train loss: -6720.3571 Average bpd: 3.156
====> [eval] Epoch: 1356 Average bpd: 3.301
====> [test] Epoch: 1356 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1357 Average train loss: -6720.3777 Average bpd: 3.156
====> Epoch: 1358 Average train loss: -6720.7512 Average bpd: 3.156
====> [eval] Epoch: 1358 Average bpd: 3.301
====> [test] Epoch: 1358 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1359 Average train loss: -6720.8705 Average bpd: 3.156
====> Epoch: 1360 Average train loss: -6721.1086 Average bpd: 3.156
====> [eval] Epoch: 1360 Average bpd: 3.301
====> [test] Epoch: 1360 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1361 Average train loss: -6720.6398 Average bpd: 3.156
====> Epoch: 1362 Average train loss: -6719.4694 Average bpd: 3.156
====> [eval] Epoch: 1362 Average bpd: 3.301
====> [test] Epoch: 1362 Average bpd: 3.313
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1363 Average train loss: -6719.3218 Average bpd: 3.156
====> Epoch: 1364 Average train loss: -6719.7501 Average bpd: 3.156
====> [eval] Epoch: 1364 Average bpd: 3.304
====> [test] Epoch: 1364 Average bpd: 3.316
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1365 Average train loss: -6720.3673 Average bpd: 3.156
====> Epoch: 1366 Average train loss: -6720.2133 Average bpd: 3.156
====> [eval] Epoch: 1366 Average bpd: 3.301
====> [test] Epoch: 1366 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1367 Average train loss: -6719.6339 Average bpd: 3.156
====> Epoch: 1368 Average train loss: -6718.8970 Average bpd: 3.155
====> [eval] Epoch: 1368 Average bpd: 3.300
====> [test] Epoch: 1368 Average bpd: 3.311
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1369 Average train loss: -6719.5975 Average bpd: 3.156
====> Epoch: 1370 Average train loss: -6718.2551 Average bpd: 3.155
====> [eval] Epoch: 1370 Average bpd: 3.300
====> [test] Epoch: 1370 Average bpd: 3.311
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1371 Average train loss: -6719.5694 Average bpd: 3.156
====> Epoch: 1372 Average train loss: -6719.0777 Average bpd: 3.155
====> [eval] Epoch: 1372 Average bpd: 3.300
====> [test] Epoch: 1372 Average bpd: 3.312
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1373 Average train loss: -6719.1839 Average bpd: 3.156
====> Epoch: 1374 Average train loss: -6718.7679 Average bpd: 3.155
====> [eval] Epoch: 1374 Average bpd: 3.300
====> [test] Epoch: 1374 Average bpd: 3.311
Best val_bpd: 3.2995062207532824
Best test_bpd: 3.3111960498867528
====> Epoch: 1375 Average train loss: -6718.3615 Average bpd: 3.155
====> Epoch: 1376 Average train loss: -6717.1889 Average bpd: 3.155
====> [eval] Epoch: 1376 Average bpd: 3.299
====> [test] Epoch: 1376 Average bpd: 3.311
Best val_bpd: 3.2992293620659923
Best test_bpd: 3.3106158704052255
====> Epoch: 1377 Average train loss: -6718.9080 Average bpd: 3.155
====> Epoch: 1378 Average train loss: -6719.8230 Average bpd: 3.156
====> [eval] Epoch: 1378 Average bpd: 3.300
====> [test] Epoch: 1378 Average bpd: 3.311
Best val_bpd: 3.2992293620659923
Best test_bpd: 3.3106158704052255
====> Epoch: 1379 Average train loss: -6718.8109 Average bpd: 3.155
====> Epoch: 1380 Average train loss: -6719.5926 Average bpd: 3.156
====> [eval] Epoch: 1380 Average bpd: 3.300
====> [test] Epoch: 1380 Average bpd: 3.312
Best val_bpd: 3.2992293620659923
Best test_bpd: 3.3106158704052255
====> Epoch: 1381 Average train loss: -6719.7199 Average bpd: 3.156
====> Epoch: 1382 Average train loss: -6717.8587 Average bpd: 3.155
====> [eval] Epoch: 1382 Average bpd: 3.299
====> [test] Epoch: 1382 Average bpd: 3.311
Best val_bpd: 3.2990354846449113
Best test_bpd: 3.310701912527221
====> Epoch: 1383 Average train loss: -6717.8792 Average bpd: 3.155
====> Epoch: 1384 Average train loss: -6717.5098 Average bpd: 3.155
====> [eval] Epoch: 1384 Average bpd: 3.299
====> [test] Epoch: 1384 Average bpd: 3.310
Best val_bpd: 3.298837762649634
Best test_bpd: 3.310182943630431
====> Epoch: 1385 Average train loss: -6717.9247 Average bpd: 3.155
====> Epoch: 1386 Average train loss: -6718.1343 Average bpd: 3.155
====> [eval] Epoch: 1386 Average bpd: 3.299
====> [test] Epoch: 1386 Average bpd: 3.311
Best val_bpd: 3.298837762649634
Best test_bpd: 3.310182943630431
====> Epoch: 1387 Average train loss: -6716.2740 Average bpd: 3.154
====> Epoch: 1388 Average train loss: -6717.0523 Average bpd: 3.155
====> [eval] Epoch: 1388 Average bpd: 3.299
====> [test] Epoch: 1388 Average bpd: 3.311
Best val_bpd: 3.298837762649634
Best test_bpd: 3.310182943630431
====> Epoch: 1389 Average train loss: -6715.6765 Average bpd: 3.154
====> Epoch: 1390 Average train loss: -6716.3340 Average bpd: 3.154
====> [eval] Epoch: 1390 Average bpd: 3.299
====> [test] Epoch: 1390 Average bpd: 3.311
Best val_bpd: 3.298837762649634
Best test_bpd: 3.310182943630431
====> Epoch: 1391 Average train loss: -6716.6507 Average bpd: 3.154
====> Epoch: 1392 Average train loss: -6715.8561 Average bpd: 3.154
====> [eval] Epoch: 1392 Average bpd: 3.298
====> [test] Epoch: 1392 Average bpd: 3.310
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1393 Average train loss: -6716.6488 Average bpd: 3.154
====> Epoch: 1394 Average train loss: -6715.2517 Average bpd: 3.154
====> [eval] Epoch: 1394 Average bpd: 3.300
====> [test] Epoch: 1394 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1395 Average train loss: -6715.6424 Average bpd: 3.154
====> Epoch: 1396 Average train loss: -6716.5930 Average bpd: 3.154
====> [eval] Epoch: 1396 Average bpd: 3.299
====> [test] Epoch: 1396 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1397 Average train loss: -6716.8021 Average bpd: 3.154
====> Epoch: 1398 Average train loss: -6716.7049 Average bpd: 3.154
====> [eval] Epoch: 1398 Average bpd: 3.299
====> [test] Epoch: 1398 Average bpd: 3.310
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1399 Average train loss: -6716.5906 Average bpd: 3.154
====> Epoch: 1400 Average train loss: -6717.0405 Average bpd: 3.155
====> [eval] Epoch: 1400 Average bpd: 3.302
====> [test] Epoch: 1400 Average bpd: 3.313
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1401 Average train loss: -6716.5279 Average bpd: 3.154
====> Epoch: 1402 Average train loss: -6715.7422 Average bpd: 3.154
====> [eval] Epoch: 1402 Average bpd: 3.299
====> [test] Epoch: 1402 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1403 Average train loss: -6716.2802 Average bpd: 3.154
====> Epoch: 1404 Average train loss: -6714.9354 Average bpd: 3.154
====> [eval] Epoch: 1404 Average bpd: 3.300
====> [test] Epoch: 1404 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1405 Average train loss: -6715.5760 Average bpd: 3.154
====> Epoch: 1406 Average train loss: -6716.5204 Average bpd: 3.154
====> [eval] Epoch: 1406 Average bpd: 3.300
====> [test] Epoch: 1406 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1407 Average train loss: -6716.7053 Average bpd: 3.154
====> Epoch: 1408 Average train loss: -6715.1584 Average bpd: 3.154
====> [eval] Epoch: 1408 Average bpd: 3.299
====> [test] Epoch: 1408 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1409 Average train loss: -6715.9874 Average bpd: 3.154
====> Epoch: 1410 Average train loss: -6715.1903 Average bpd: 3.154
====> [eval] Epoch: 1410 Average bpd: 3.300
====> [test] Epoch: 1410 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1411 Average train loss: -6715.8834 Average bpd: 3.154
====> Epoch: 1412 Average train loss: -6715.6567 Average bpd: 3.154
====> [eval] Epoch: 1412 Average bpd: 3.299
====> [test] Epoch: 1412 Average bpd: 3.310
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1413 Average train loss: -6716.5459 Average bpd: 3.154
====> Epoch: 1414 Average train loss: -6715.5913 Average bpd: 3.154
====> [eval] Epoch: 1414 Average bpd: 3.300
====> [test] Epoch: 1414 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1415 Average train loss: -6715.3853 Average bpd: 3.154
====> Epoch: 1416 Average train loss: -6713.9834 Average bpd: 3.153
====> [eval] Epoch: 1416 Average bpd: 3.299
====> [test] Epoch: 1416 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1417 Average train loss: -6716.0953 Average bpd: 3.154
====> Epoch: 1418 Average train loss: -6714.7369 Average bpd: 3.153
====> [eval] Epoch: 1418 Average bpd: 3.299
====> [test] Epoch: 1418 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1419 Average train loss: -6714.5499 Average bpd: 3.153
====> Epoch: 1420 Average train loss: -6714.0442 Average bpd: 3.153
====> [eval] Epoch: 1420 Average bpd: 3.300
====> [test] Epoch: 1420 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1421 Average train loss: -6715.4002 Average bpd: 3.154
====> Epoch: 1422 Average train loss: -6714.8330 Average bpd: 3.153
====> [eval] Epoch: 1422 Average bpd: 3.299
====> [test] Epoch: 1422 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1423 Average train loss: -6713.5796 Average bpd: 3.153
====> Epoch: 1424 Average train loss: -6712.3720 Average bpd: 3.152
====> [eval] Epoch: 1424 Average bpd: 3.299
====> [test] Epoch: 1424 Average bpd: 3.311
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1425 Average train loss: -6715.3205 Average bpd: 3.154
====> Epoch: 1426 Average train loss: -6714.2239 Average bpd: 3.153
====> [eval] Epoch: 1426 Average bpd: 3.302
====> [test] Epoch: 1426 Average bpd: 3.314
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1427 Average train loss: -6714.2147 Average bpd: 3.153
====> Epoch: 1428 Average train loss: -6713.9694 Average bpd: 3.153
====> [eval] Epoch: 1428 Average bpd: 3.300
====> [test] Epoch: 1428 Average bpd: 3.312
Best val_bpd: 3.298398801968756
Best test_bpd: 3.31040264457309
====> Epoch: 1429 Average train loss: -6712.8515 Average bpd: 3.153
====> Epoch: 1430 Average train loss: -6713.0805 Average bpd: 3.153
====> [eval] Epoch: 1430 Average bpd: 3.298
====> [test] Epoch: 1430 Average bpd: 3.310
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1431 Average train loss: -6715.0041 Average bpd: 3.154
====> Epoch: 1432 Average train loss: -6713.5402 Average bpd: 3.153
====> [eval] Epoch: 1432 Average bpd: 3.300
====> [test] Epoch: 1432 Average bpd: 3.311
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1433 Average train loss: -6713.8611 Average bpd: 3.153
====> Epoch: 1434 Average train loss: -6713.1165 Average bpd: 3.153
====> [eval] Epoch: 1434 Average bpd: 3.298
====> [test] Epoch: 1434 Average bpd: 3.310
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1435 Average train loss: -6713.6516 Average bpd: 3.153
====> Epoch: 1436 Average train loss: -6713.0905 Average bpd: 3.153
====> [eval] Epoch: 1436 Average bpd: 3.299
====> [test] Epoch: 1436 Average bpd: 3.311
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1437 Average train loss: -6713.7949 Average bpd: 3.153
====> Epoch: 1438 Average train loss: -6713.5444 Average bpd: 3.153
====> [eval] Epoch: 1438 Average bpd: 3.298
====> [test] Epoch: 1438 Average bpd: 3.310
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1439 Average train loss: -6712.8085 Average bpd: 3.153
====> Epoch: 1440 Average train loss: -6711.7171 Average bpd: 3.152
====> [eval] Epoch: 1440 Average bpd: 3.299
====> [test] Epoch: 1440 Average bpd: 3.311
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1441 Average train loss: -6712.9567 Average bpd: 3.153
====> Epoch: 1442 Average train loss: -6712.0575 Average bpd: 3.152
====> [eval] Epoch: 1442 Average bpd: 3.300
====> [test] Epoch: 1442 Average bpd: 3.312
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1443 Average train loss: -6712.4194 Average bpd: 3.152
====> Epoch: 1444 Average train loss: -6712.3054 Average bpd: 3.152
====> [eval] Epoch: 1444 Average bpd: 3.298
====> [test] Epoch: 1444 Average bpd: 3.309
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1445 Average train loss: -6712.9287 Average bpd: 3.153
====> Epoch: 1446 Average train loss: -6712.2662 Average bpd: 3.152
====> [eval] Epoch: 1446 Average bpd: 3.299
====> [test] Epoch: 1446 Average bpd: 3.310
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1447 Average train loss: -6711.6149 Average bpd: 3.152
====> Epoch: 1448 Average train loss: -6711.7241 Average bpd: 3.152
====> [eval] Epoch: 1448 Average bpd: 3.298
====> [test] Epoch: 1448 Average bpd: 3.310
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1449 Average train loss: -6711.5400 Average bpd: 3.152
====> Epoch: 1450 Average train loss: -6712.9342 Average bpd: 3.153
====> [eval] Epoch: 1450 Average bpd: 3.299
====> [test] Epoch: 1450 Average bpd: 3.311
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1451 Average train loss: -6711.4578 Average bpd: 3.152
====> Epoch: 1452 Average train loss: -6709.7554 Average bpd: 3.151
====> [eval] Epoch: 1452 Average bpd: 3.301
====> [test] Epoch: 1452 Average bpd: 3.313
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1453 Average train loss: -6711.4851 Average bpd: 3.152
====> Epoch: 1454 Average train loss: -6711.3783 Average bpd: 3.152
====> [eval] Epoch: 1454 Average bpd: 3.298
====> [test] Epoch: 1454 Average bpd: 3.309
Best val_bpd: 3.297659219968161
Best test_bpd: 3.3095695939929097
====> Epoch: 1455 Average train loss: -6711.4818 Average bpd: 3.152
====> Epoch: 1456 Average train loss: -6711.6017 Average bpd: 3.152
====> [eval] Epoch: 1456 Average bpd: 3.298
====> [test] Epoch: 1456 Average bpd: 3.309
Best val_bpd: 3.29755093693365
Best test_bpd: 3.3091485318440292
====> Epoch: 1457 Average train loss: -6710.3864 Average bpd: 3.151
====> Epoch: 1458 Average train loss: -6711.0847 Average bpd: 3.152
====> [eval] Epoch: 1458 Average bpd: 3.298
====> [test] Epoch: 1458 Average bpd: 3.310
Best val_bpd: 3.29755093693365
Best test_bpd: 3.3091485318440292
====> Epoch: 1459 Average train loss: -6710.1005 Average bpd: 3.151
====> Epoch: 1460 Average train loss: -6712.5481 Average bpd: 3.152
====> [eval] Epoch: 1460 Average bpd: 3.297
====> [test] Epoch: 1460 Average bpd: 3.309
Best val_bpd: 3.2970702483271497
Best test_bpd: 3.308898738291461
====> Epoch: 1461 Average train loss: -6709.0666 Average bpd: 3.151
====> Epoch: 1462 Average train loss: -6709.9515 Average bpd: 3.151
====> [eval] Epoch: 1462 Average bpd: 3.298
====> [test] Epoch: 1462 Average bpd: 3.310
Best val_bpd: 3.2970702483271497
Best test_bpd: 3.308898738291461
====> Epoch: 1463 Average train loss: -6710.3348 Average bpd: 3.151
====> Epoch: 1464 Average train loss: -6709.9333 Average bpd: 3.151
====> [eval] Epoch: 1464 Average bpd: 3.297
====> [test] Epoch: 1464 Average bpd: 3.309
Best val_bpd: 3.2970702483271497
Best test_bpd: 3.308898738291461
====> Epoch: 1465 Average train loss: -6710.2845 Average bpd: 3.151
====> Epoch: 1466 Average train loss: -6710.8572 Average bpd: 3.152
====> [eval] Epoch: 1466 Average bpd: 3.298
====> [test] Epoch: 1466 Average bpd: 3.310
Best val_bpd: 3.2970702483271497
Best test_bpd: 3.308898738291461
====> Epoch: 1467 Average train loss: -6710.7931 Average bpd: 3.152
====> Epoch: 1468 Average train loss: -6710.7558 Average bpd: 3.152
====> [eval] Epoch: 1468 Average bpd: 3.297
====> [test] Epoch: 1468 Average bpd: 3.309
Best val_bpd: 3.2970475843518243
Best test_bpd: 3.308810973440297
====> Epoch: 1469 Average train loss: -6709.2354 Average bpd: 3.151
====> Epoch: 1470 Average train loss: -6708.2144 Average bpd: 3.150
====> [eval] Epoch: 1470 Average bpd: 3.299
====> [test] Epoch: 1470 Average bpd: 3.311
Best val_bpd: 3.2970475843518243
Best test_bpd: 3.308810973440297
====> Epoch: 1471 Average train loss: -6709.1766 Average bpd: 3.151
====> Epoch: 1472 Average train loss: -6710.6518 Average bpd: 3.152
====> [eval] Epoch: 1472 Average bpd: 3.298
====> [test] Epoch: 1472 Average bpd: 3.309
Best val_bpd: 3.2970475843518243
Best test_bpd: 3.308810973440297
====> Epoch: 1473 Average train loss: -6709.1531 Average bpd: 3.151
====> Epoch: 1474 Average train loss: -6709.7202 Average bpd: 3.151
====> [eval] Epoch: 1474 Average bpd: 3.298
====> [test] Epoch: 1474 Average bpd: 3.310
Best val_bpd: 3.2970475843518243
Best test_bpd: 3.308810973440297
====> Epoch: 1475 Average train loss: -6709.7669 Average bpd: 3.151
====> Epoch: 1476 Average train loss: -6709.7521 Average bpd: 3.151
====> [eval] Epoch: 1476 Average bpd: 3.297
====> [test] Epoch: 1476 Average bpd: 3.309
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1477 Average train loss: -6708.6019 Average bpd: 3.151
====> Epoch: 1478 Average train loss: -6708.9522 Average bpd: 3.151
====> [eval] Epoch: 1478 Average bpd: 3.297
====> [test] Epoch: 1478 Average bpd: 3.309
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1479 Average train loss: -6709.3355 Average bpd: 3.151
====> Epoch: 1480 Average train loss: -6709.1739 Average bpd: 3.151
====> [eval] Epoch: 1480 Average bpd: 3.297
====> [test] Epoch: 1480 Average bpd: 3.309
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1481 Average train loss: -6708.6469 Average bpd: 3.151
====> Epoch: 1482 Average train loss: -6709.0270 Average bpd: 3.151
====> [eval] Epoch: 1482 Average bpd: 3.298
====> [test] Epoch: 1482 Average bpd: 3.310
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1483 Average train loss: -6708.4295 Average bpd: 3.150
====> Epoch: 1484 Average train loss: -6707.7774 Average bpd: 3.150
====> [eval] Epoch: 1484 Average bpd: 3.297
====> [test] Epoch: 1484 Average bpd: 3.309
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1485 Average train loss: -6709.3261 Average bpd: 3.151
====> Epoch: 1486 Average train loss: -6707.7603 Average bpd: 3.150
====> [eval] Epoch: 1486 Average bpd: 3.298
====> [test] Epoch: 1486 Average bpd: 3.310
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1487 Average train loss: -6707.6463 Average bpd: 3.150
====> Epoch: 1488 Average train loss: -6708.2527 Average bpd: 3.150
====> [eval] Epoch: 1488 Average bpd: 3.299
====> [test] Epoch: 1488 Average bpd: 3.310
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1489 Average train loss: -6707.3429 Average bpd: 3.150
====> Epoch: 1490 Average train loss: -6708.6230 Average bpd: 3.151
====> [eval] Epoch: 1490 Average bpd: 3.298
====> [test] Epoch: 1490 Average bpd: 3.310
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1491 Average train loss: -6708.6314 Average bpd: 3.151
====> Epoch: 1492 Average train loss: -6708.4625 Average bpd: 3.150
====> [eval] Epoch: 1492 Average bpd: 3.298
====> [test] Epoch: 1492 Average bpd: 3.309
Best val_bpd: 3.2967068918001226
Best test_bpd: 3.3086061936253475
====> Epoch: 1493 Average train loss: -6707.2594 Average bpd: 3.150
====> Epoch: 1494 Average train loss: -6706.7930 Average bpd: 3.150
====> [eval] Epoch: 1494 Average bpd: 3.296
====> [test] Epoch: 1494 Average bpd: 3.308
Best val_bpd: 3.296257536691012
Best test_bpd: 3.308021430842458
====> Epoch: 1495 Average train loss: -6707.1659 Average bpd: 3.150
====> Epoch: 1496 Average train loss: -6706.9753 Average bpd: 3.150
====> [eval] Epoch: 1496 Average bpd: 3.297
====> [test] Epoch: 1496 Average bpd: 3.309
Best val_bpd: 3.296257536691012
Best test_bpd: 3.308021430842458
====> Epoch: 1497 Average train loss: -6707.6196 Average bpd: 3.150
====> Epoch: 1498 Average train loss: -6707.9788 Average bpd: 3.150
====> [eval] Epoch: 1498 Average bpd: 3.295
====> [test] Epoch: 1498 Average bpd: 3.307
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1499 Average train loss: -6706.5868 Average bpd: 3.150
====> Epoch: 1500 Average train loss: -6707.7389 Average bpd: 3.150
====> [eval] Epoch: 1500 Average bpd: 3.298
====> [test] Epoch: 1500 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1501 Average train loss: -6707.2331 Average bpd: 3.150
====> Epoch: 1502 Average train loss: -6707.3530 Average bpd: 3.150
====> [eval] Epoch: 1502 Average bpd: 3.296
====> [test] Epoch: 1502 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1503 Average train loss: -6706.4612 Average bpd: 3.150
====> Epoch: 1504 Average train loss: -6705.9912 Average bpd: 3.149
====> [eval] Epoch: 1504 Average bpd: 3.298
====> [test] Epoch: 1504 Average bpd: 3.310
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1505 Average train loss: -6706.6832 Average bpd: 3.150
====> Epoch: 1506 Average train loss: -6707.7393 Average bpd: 3.150
====> [eval] Epoch: 1506 Average bpd: 3.296
====> [test] Epoch: 1506 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1507 Average train loss: -6707.0202 Average bpd: 3.150
====> Epoch: 1508 Average train loss: -6706.0351 Average bpd: 3.149
====> [eval] Epoch: 1508 Average bpd: 3.296
====> [test] Epoch: 1508 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1509 Average train loss: -6706.6227 Average bpd: 3.150
====> Epoch: 1510 Average train loss: -6706.6804 Average bpd: 3.150
====> [eval] Epoch: 1510 Average bpd: 3.298
====> [test] Epoch: 1510 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1511 Average train loss: -6706.1614 Average bpd: 3.149
====> Epoch: 1512 Average train loss: -6704.7147 Average bpd: 3.149
====> [eval] Epoch: 1512 Average bpd: 3.296
====> [test] Epoch: 1512 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1513 Average train loss: -6706.2823 Average bpd: 3.149
====> Epoch: 1514 Average train loss: -6705.9669 Average bpd: 3.149
====> [eval] Epoch: 1514 Average bpd: 3.296
====> [test] Epoch: 1514 Average bpd: 3.307
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1515 Average train loss: -6705.5107 Average bpd: 3.149
====> Epoch: 1516 Average train loss: -6706.0243 Average bpd: 3.149
====> [eval] Epoch: 1516 Average bpd: 3.297
====> [test] Epoch: 1516 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1517 Average train loss: -6704.5653 Average bpd: 3.149
====> Epoch: 1518 Average train loss: -6705.9188 Average bpd: 3.149
====> [eval] Epoch: 1518 Average bpd: 3.296
====> [test] Epoch: 1518 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1519 Average train loss: -6705.7157 Average bpd: 3.149
====> Epoch: 1520 Average train loss: -6706.0287 Average bpd: 3.149
====> [eval] Epoch: 1520 Average bpd: 3.298
====> [test] Epoch: 1520 Average bpd: 3.310
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1521 Average train loss: -6705.5907 Average bpd: 3.149
====> Epoch: 1522 Average train loss: -6704.4740 Average bpd: 3.149
====> [eval] Epoch: 1522 Average bpd: 3.298
====> [test] Epoch: 1522 Average bpd: 3.310
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1523 Average train loss: -6704.8572 Average bpd: 3.149
====> Epoch: 1524 Average train loss: -6705.3663 Average bpd: 3.149
====> [eval] Epoch: 1524 Average bpd: 3.298
====> [test] Epoch: 1524 Average bpd: 3.310
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1525 Average train loss: -6705.8561 Average bpd: 3.149
====> Epoch: 1526 Average train loss: -6704.5354 Average bpd: 3.149
====> [eval] Epoch: 1526 Average bpd: 3.299
====> [test] Epoch: 1526 Average bpd: 3.310
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1527 Average train loss: -6705.4887 Average bpd: 3.149
====> Epoch: 1528 Average train loss: -6704.6862 Average bpd: 3.149
====> [eval] Epoch: 1528 Average bpd: 3.298
====> [test] Epoch: 1528 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1529 Average train loss: -6705.0377 Average bpd: 3.149
====> Epoch: 1530 Average train loss: -6704.2939 Average bpd: 3.149
====> [eval] Epoch: 1530 Average bpd: 3.297
====> [test] Epoch: 1530 Average bpd: 3.308
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1531 Average train loss: -6704.8853 Average bpd: 3.149
====> Epoch: 1532 Average train loss: -6704.5092 Average bpd: 3.149
====> [eval] Epoch: 1532 Average bpd: 3.298
====> [test] Epoch: 1532 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1533 Average train loss: -6704.1228 Average bpd: 3.148
====> Epoch: 1534 Average train loss: -6703.8034 Average bpd: 3.148
====> [eval] Epoch: 1534 Average bpd: 3.297
====> [test] Epoch: 1534 Average bpd: 3.309
Best val_bpd: 3.2954129204589773
Best test_bpd: 3.3069947843992913
====> Epoch: 1535 Average train loss: -6703.1912 Average bpd: 3.148
====> Epoch: 1536 Average train loss: -6703.7181 Average bpd: 3.148
====> [eval] Epoch: 1536 Average bpd: 3.295
====> [test] Epoch: 1536 Average bpd: 3.307
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1537 Average train loss: -6703.8830 Average bpd: 3.148
====> Epoch: 1538 Average train loss: -6703.8581 Average bpd: 3.148
====> [eval] Epoch: 1538 Average bpd: 3.296
====> [test] Epoch: 1538 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1539 Average train loss: -6704.8182 Average bpd: 3.149
====> Epoch: 1540 Average train loss: -6704.4880 Average bpd: 3.149
====> [eval] Epoch: 1540 Average bpd: 3.296
====> [test] Epoch: 1540 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1541 Average train loss: -6702.6522 Average bpd: 3.148
====> Epoch: 1542 Average train loss: -6703.2839 Average bpd: 3.148
====> [eval] Epoch: 1542 Average bpd: 3.297
====> [test] Epoch: 1542 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1543 Average train loss: -6703.9164 Average bpd: 3.148
====> Epoch: 1544 Average train loss: -6703.0384 Average bpd: 3.148
====> [eval] Epoch: 1544 Average bpd: 3.296
====> [test] Epoch: 1544 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1545 Average train loss: -6702.7826 Average bpd: 3.148
====> Epoch: 1546 Average train loss: -6703.1046 Average bpd: 3.148
====> [eval] Epoch: 1546 Average bpd: 3.296
====> [test] Epoch: 1546 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1547 Average train loss: -6703.3669 Average bpd: 3.148
====> Epoch: 1548 Average train loss: -6703.6340 Average bpd: 3.148
====> [eval] Epoch: 1548 Average bpd: 3.296
====> [test] Epoch: 1548 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1549 Average train loss: -6704.0983 Average bpd: 3.148
====> Epoch: 1550 Average train loss: -6703.3646 Average bpd: 3.148
====> [eval] Epoch: 1550 Average bpd: 3.295
====> [test] Epoch: 1550 Average bpd: 3.307
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1551 Average train loss: -6703.3626 Average bpd: 3.148
====> Epoch: 1552 Average train loss: -6702.1032 Average bpd: 3.147
====> [eval] Epoch: 1552 Average bpd: 3.296
====> [test] Epoch: 1552 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1553 Average train loss: -6704.1169 Average bpd: 3.148
====> Epoch: 1554 Average train loss: -6702.5038 Average bpd: 3.148
====> [eval] Epoch: 1554 Average bpd: 3.296
====> [test] Epoch: 1554 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1555 Average train loss: -6702.7724 Average bpd: 3.148
====> Epoch: 1556 Average train loss: -6701.6844 Average bpd: 3.147
====> [eval] Epoch: 1556 Average bpd: 3.295
====> [test] Epoch: 1556 Average bpd: 3.307
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1557 Average train loss: -6701.3498 Average bpd: 3.147
====> Epoch: 1558 Average train loss: -6702.8654 Average bpd: 3.148
====> [eval] Epoch: 1558 Average bpd: 3.296
====> [test] Epoch: 1558 Average bpd: 3.308
Best val_bpd: 3.295076990447939
Best test_bpd: 3.306511201114334
====> Epoch: 1559 Average train loss: -6700.9404 Average bpd: 3.147
====> Epoch: 1560 Average train loss: -6701.1495 Average bpd: 3.147
====> [eval] Epoch: 1560 Average bpd: 3.295
====> [test] Epoch: 1560 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1561 Average train loss: -6702.6447 Average bpd: 3.148
====> Epoch: 1562 Average train loss: -6700.8948 Average bpd: 3.147
====> [eval] Epoch: 1562 Average bpd: 3.296
====> [test] Epoch: 1562 Average bpd: 3.308
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1563 Average train loss: -6702.8919 Average bpd: 3.148
====> Epoch: 1564 Average train loss: -6702.7213 Average bpd: 3.148
====> [eval] Epoch: 1564 Average bpd: 3.295
====> [test] Epoch: 1564 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1565 Average train loss: -6700.4568 Average bpd: 3.147
====> Epoch: 1566 Average train loss: -6702.6729 Average bpd: 3.148
====> [eval] Epoch: 1566 Average bpd: 3.295
====> [test] Epoch: 1566 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1567 Average train loss: -6701.4223 Average bpd: 3.147
====> Epoch: 1568 Average train loss: -6701.6191 Average bpd: 3.147
====> [eval] Epoch: 1568 Average bpd: 3.295
====> [test] Epoch: 1568 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1569 Average train loss: -6699.7588 Average bpd: 3.146
====> Epoch: 1570 Average train loss: -6702.0770 Average bpd: 3.147
====> [eval] Epoch: 1570 Average bpd: 3.295
====> [test] Epoch: 1570 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1571 Average train loss: -6699.5820 Average bpd: 3.146
====> Epoch: 1572 Average train loss: -6699.8608 Average bpd: 3.146
====> [eval] Epoch: 1572 Average bpd: 3.296
====> [test] Epoch: 1572 Average bpd: 3.308
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1573 Average train loss: -6701.1100 Average bpd: 3.147
====> Epoch: 1574 Average train loss: -6701.0734 Average bpd: 3.147
====> [eval] Epoch: 1574 Average bpd: 3.298
====> [test] Epoch: 1574 Average bpd: 3.309
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1575 Average train loss: -6701.0768 Average bpd: 3.147
====> Epoch: 1576 Average train loss: -6701.1039 Average bpd: 3.147
====> [eval] Epoch: 1576 Average bpd: 3.296
====> [test] Epoch: 1576 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1577 Average train loss: -6700.0267 Average bpd: 3.147
====> Epoch: 1578 Average train loss: -6700.6516 Average bpd: 3.147
====> [eval] Epoch: 1578 Average bpd: 3.295
====> [test] Epoch: 1578 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1579 Average train loss: -6699.6104 Average bpd: 3.146
====> Epoch: 1580 Average train loss: -6699.7133 Average bpd: 3.146
====> [eval] Epoch: 1580 Average bpd: 3.295
====> [test] Epoch: 1580 Average bpd: 3.307
Best val_bpd: 3.294873304210437
Best test_bpd: 3.306721171612412
====> Epoch: 1581 Average train loss: -6700.6672 Average bpd: 3.147
====> Epoch: 1582 Average train loss: -6701.6862 Average bpd: 3.147
====> [eval] Epoch: 1582 Average bpd: 3.295
====> [test] Epoch: 1582 Average bpd: 3.306
Best val_bpd: 3.2945867281405405
Best test_bpd: 3.306353477058859
====> Epoch: 1583 Average train loss: -6701.1680 Average bpd: 3.147
====> Epoch: 1584 Average train loss: -6699.7558 Average bpd: 3.146
====> [eval] Epoch: 1584 Average bpd: 3.297
====> [test] Epoch: 1584 Average bpd: 3.309
Best val_bpd: 3.2945867281405405
Best test_bpd: 3.306353477058859
====> Epoch: 1585 Average train loss: -6700.2113 Average bpd: 3.147
====> Epoch: 1586 Average train loss: -6700.1464 Average bpd: 3.147
====> [eval] Epoch: 1586 Average bpd: 3.296
====> [test] Epoch: 1586 Average bpd: 3.308
Best val_bpd: 3.2945867281405405
Best test_bpd: 3.306353477058859
====> Epoch: 1587 Average train loss: -6698.3103 Average bpd: 3.146
====> Epoch: 1588 Average train loss: -6698.9300 Average bpd: 3.146
====> [eval] Epoch: 1588 Average bpd: 3.295
====> [test] Epoch: 1588 Average bpd: 3.306
Best val_bpd: 3.2945867281405405
Best test_bpd: 3.306353477058859
====> Epoch: 1589 Average train loss: -6699.3053 Average bpd: 3.146
====> Epoch: 1590 Average train loss: -6700.3848 Average bpd: 3.147
====> [eval] Epoch: 1590 Average bpd: 3.294
====> [test] Epoch: 1590 Average bpd: 3.306
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1591 Average train loss: -6699.0755 Average bpd: 3.146
====> Epoch: 1592 Average train loss: -6698.4033 Average bpd: 3.146
====> [eval] Epoch: 1592 Average bpd: 3.295
====> [test] Epoch: 1592 Average bpd: 3.306
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1593 Average train loss: -6699.6284 Average bpd: 3.146
====> Epoch: 1594 Average train loss: -6698.9784 Average bpd: 3.146
====> [eval] Epoch: 1594 Average bpd: 3.295
====> [test] Epoch: 1594 Average bpd: 3.307
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1595 Average train loss: -6699.6859 Average bpd: 3.146
====> Epoch: 1596 Average train loss: -6698.3550 Average bpd: 3.146
====> [eval] Epoch: 1596 Average bpd: 3.295
====> [test] Epoch: 1596 Average bpd: 3.307
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1597 Average train loss: -6698.6030 Average bpd: 3.146
====> Epoch: 1598 Average train loss: -6700.4500 Average bpd: 3.147
====> [eval] Epoch: 1598 Average bpd: 3.296
====> [test] Epoch: 1598 Average bpd: 3.308
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1599 Average train loss: -6697.9093 Average bpd: 3.146
====> Epoch: 1600 Average train loss: -6698.6054 Average bpd: 3.146
====> [eval] Epoch: 1600 Average bpd: 3.295
====> [test] Epoch: 1600 Average bpd: 3.307
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1601 Average train loss: -6699.0270 Average bpd: 3.146
====> Epoch: 1602 Average train loss: -6698.4131 Average bpd: 3.146
====> [eval] Epoch: 1602 Average bpd: 3.296
====> [test] Epoch: 1602 Average bpd: 3.307
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1603 Average train loss: -6698.8227 Average bpd: 3.146
====> Epoch: 1604 Average train loss: -6698.3292 Average bpd: 3.146
====> [eval] Epoch: 1604 Average bpd: 3.296
====> [test] Epoch: 1604 Average bpd: 3.307
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1605 Average train loss: -6697.5828 Average bpd: 3.145
====> Epoch: 1606 Average train loss: -6698.6228 Average bpd: 3.146
====> [eval] Epoch: 1606 Average bpd: 3.296
====> [test] Epoch: 1606 Average bpd: 3.308
Best val_bpd: 3.2938866934416624
Best test_bpd: 3.3057198341941993
====> Epoch: 1607 Average train loss: -6697.9346 Average bpd: 3.146
====> Epoch: 1608 Average train loss: -6698.6120 Average bpd: 3.146
====> [eval] Epoch: 1608 Average bpd: 3.294
====> [test] Epoch: 1608 Average bpd: 3.305
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1609 Average train loss: -6697.9185 Average bpd: 3.146
====> Epoch: 1610 Average train loss: -6697.8413 Average bpd: 3.145
====> [eval] Epoch: 1610 Average bpd: 3.297
====> [test] Epoch: 1610 Average bpd: 3.309
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1611 Average train loss: -6696.8456 Average bpd: 3.145
====> Epoch: 1612 Average train loss: -6697.7661 Average bpd: 3.145
====> [eval] Epoch: 1612 Average bpd: 3.295
====> [test] Epoch: 1612 Average bpd: 3.307
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1613 Average train loss: -6697.9017 Average bpd: 3.146
====> Epoch: 1614 Average train loss: -6697.6071 Average bpd: 3.145
====> [eval] Epoch: 1614 Average bpd: 3.296
====> [test] Epoch: 1614 Average bpd: 3.307
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1615 Average train loss: -6696.8881 Average bpd: 3.145
====> Epoch: 1616 Average train loss: -6696.7616 Average bpd: 3.145
====> [eval] Epoch: 1616 Average bpd: 3.294
====> [test] Epoch: 1616 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1617 Average train loss: -6697.3700 Average bpd: 3.145
====> Epoch: 1618 Average train loss: -6696.8961 Average bpd: 3.145
====> [eval] Epoch: 1618 Average bpd: 3.297
====> [test] Epoch: 1618 Average bpd: 3.308
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1619 Average train loss: -6695.9988 Average bpd: 3.145
====> Epoch: 1620 Average train loss: -6696.6913 Average bpd: 3.145
====> [eval] Epoch: 1620 Average bpd: 3.296
====> [test] Epoch: 1620 Average bpd: 3.308
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1621 Average train loss: -6696.7960 Average bpd: 3.145
====> Epoch: 1622 Average train loss: -6697.1484 Average bpd: 3.145
====> [eval] Epoch: 1622 Average bpd: 3.295
====> [test] Epoch: 1622 Average bpd: 3.307
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1623 Average train loss: -6696.6180 Average bpd: 3.145
====> Epoch: 1624 Average train loss: -6696.7917 Average bpd: 3.145
====> [eval] Epoch: 1624 Average bpd: 3.294
====> [test] Epoch: 1624 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1625 Average train loss: -6697.0390 Average bpd: 3.145
====> Epoch: 1626 Average train loss: -6697.6166 Average bpd: 3.145
====> [eval] Epoch: 1626 Average bpd: 3.295
====> [test] Epoch: 1626 Average bpd: 3.307
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1627 Average train loss: -6696.0115 Average bpd: 3.145
====> Epoch: 1628 Average train loss: -6695.8903 Average bpd: 3.145
====> [eval] Epoch: 1628 Average bpd: 3.295
====> [test] Epoch: 1628 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1629 Average train loss: -6696.6417 Average bpd: 3.145
====> Epoch: 1630 Average train loss: -6696.4757 Average bpd: 3.145
====> [eval] Epoch: 1630 Average bpd: 3.294
====> [test] Epoch: 1630 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1631 Average train loss: -6697.6062 Average bpd: 3.145
====> Epoch: 1632 Average train loss: -6696.1958 Average bpd: 3.145
====> [eval] Epoch: 1632 Average bpd: 3.294
====> [test] Epoch: 1632 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1633 Average train loss: -6696.4730 Average bpd: 3.145
====> Epoch: 1634 Average train loss: -6695.3889 Average bpd: 3.144
====> [eval] Epoch: 1634 Average bpd: 3.295
====> [test] Epoch: 1634 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1635 Average train loss: -6696.0080 Average bpd: 3.145
====> Epoch: 1636 Average train loss: -6695.7366 Average bpd: 3.145
====> [eval] Epoch: 1636 Average bpd: 3.294
====> [test] Epoch: 1636 Average bpd: 3.306
Best val_bpd: 3.293708071679944
Best test_bpd: 3.3052446243599096
====> Epoch: 1637 Average train loss: -6695.8179 Average bpd: 3.145
====> Epoch: 1638 Average train loss: -6694.6309 Average bpd: 3.144
====> [eval] Epoch: 1638 Average bpd: 3.293
====> [test] Epoch: 1638 Average bpd: 3.305
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1639 Average train loss: -6695.0094 Average bpd: 3.144
====> Epoch: 1640 Average train loss: -6697.1582 Average bpd: 3.145
====> [eval] Epoch: 1640 Average bpd: 3.295
====> [test] Epoch: 1640 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1641 Average train loss: -6696.1304 Average bpd: 3.145
====> Epoch: 1642 Average train loss: -6695.0128 Average bpd: 3.144
====> [eval] Epoch: 1642 Average bpd: 3.294
====> [test] Epoch: 1642 Average bpd: 3.306
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1643 Average train loss: -6695.1485 Average bpd: 3.144
====> Epoch: 1644 Average train loss: -6695.6222 Average bpd: 3.144
====> [eval] Epoch: 1644 Average bpd: 3.295
====> [test] Epoch: 1644 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1645 Average train loss: -6696.3794 Average bpd: 3.145
====> Epoch: 1646 Average train loss: -6694.8982 Average bpd: 3.144
====> [eval] Epoch: 1646 Average bpd: 3.296
====> [test] Epoch: 1646 Average bpd: 3.308
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1647 Average train loss: -6695.2776 Average bpd: 3.144
====> Epoch: 1648 Average train loss: -6694.8023 Average bpd: 3.144
====> [eval] Epoch: 1648 Average bpd: 3.295
====> [test] Epoch: 1648 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1649 Average train loss: -6693.9657 Average bpd: 3.144
====> Epoch: 1650 Average train loss: -6694.8344 Average bpd: 3.144
====> [eval] Epoch: 1650 Average bpd: 3.295
====> [test] Epoch: 1650 Average bpd: 3.306
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1651 Average train loss: -6694.3040 Average bpd: 3.144
====> Epoch: 1652 Average train loss: -6695.0998 Average bpd: 3.144
====> [eval] Epoch: 1652 Average bpd: 3.294
====> [test] Epoch: 1652 Average bpd: 3.305
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1653 Average train loss: -6694.8694 Average bpd: 3.144
====> Epoch: 1654 Average train loss: -6694.0513 Average bpd: 3.144
====> [eval] Epoch: 1654 Average bpd: 3.294
====> [test] Epoch: 1654 Average bpd: 3.306
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1655 Average train loss: -6694.2686 Average bpd: 3.144
====> Epoch: 1656 Average train loss: -6693.4446 Average bpd: 3.143
====> [eval] Epoch: 1656 Average bpd: 3.295
====> [test] Epoch: 1656 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1657 Average train loss: -6695.0689 Average bpd: 3.144
====> Epoch: 1658 Average train loss: -6694.2358 Average bpd: 3.144
====> [eval] Epoch: 1658 Average bpd: 3.296
====> [test] Epoch: 1658 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1659 Average train loss: -6694.2464 Average bpd: 3.144
====> Epoch: 1660 Average train loss: -6693.2224 Average bpd: 3.143
====> [eval] Epoch: 1660 Average bpd: 3.295
====> [test] Epoch: 1660 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1661 Average train loss: -6694.1539 Average bpd: 3.144
====> Epoch: 1662 Average train loss: -6694.8388 Average bpd: 3.144
====> [eval] Epoch: 1662 Average bpd: 3.296
====> [test] Epoch: 1662 Average bpd: 3.308
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1663 Average train loss: -6694.7404 Average bpd: 3.144
====> Epoch: 1664 Average train loss: -6694.3131 Average bpd: 3.144
====> [eval] Epoch: 1664 Average bpd: 3.297
====> [test] Epoch: 1664 Average bpd: 3.308
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1665 Average train loss: -6694.4467 Average bpd: 3.144
====> Epoch: 1666 Average train loss: -6694.6809 Average bpd: 3.144
====> [eval] Epoch: 1666 Average bpd: 3.295
====> [test] Epoch: 1666 Average bpd: 3.306
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1667 Average train loss: -6693.6287 Average bpd: 3.144
====> Epoch: 1668 Average train loss: -6694.0962 Average bpd: 3.144
====> [eval] Epoch: 1668 Average bpd: 3.295
====> [test] Epoch: 1668 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1669 Average train loss: -6694.5211 Average bpd: 3.144
====> Epoch: 1670 Average train loss: -6694.3618 Average bpd: 3.144
====> [eval] Epoch: 1670 Average bpd: 3.295
====> [test] Epoch: 1670 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1671 Average train loss: -6691.4926 Average bpd: 3.143
====> Epoch: 1672 Average train loss: -6693.1721 Average bpd: 3.143
====> [eval] Epoch: 1672 Average bpd: 3.294
====> [test] Epoch: 1672 Average bpd: 3.306
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1673 Average train loss: -6692.5499 Average bpd: 3.143
====> Epoch: 1674 Average train loss: -6692.9139 Average bpd: 3.143
====> [eval] Epoch: 1674 Average bpd: 3.294
====> [test] Epoch: 1674 Average bpd: 3.305
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1675 Average train loss: -6693.8806 Average bpd: 3.144
====> Epoch: 1676 Average train loss: -6692.9705 Average bpd: 3.143
====> [eval] Epoch: 1676 Average bpd: 3.295
====> [test] Epoch: 1676 Average bpd: 3.307
Best val_bpd: 3.2934472095561795
Best test_bpd: 3.3051736477729965
====> Epoch: 1677 Average train loss: -6693.5258 Average bpd: 3.143
====> Epoch: 1678 Average train loss: -6693.3390 Average bpd: 3.143
====> [eval] Epoch: 1678 Average bpd: 3.293
====> [test] Epoch: 1678 Average bpd: 3.305
Best val_bpd: 3.2931489168960684
Best test_bpd: 3.3048374855490974
====> Epoch: 1679 Average train loss: -6693.4129 Average bpd: 3.143
====> Epoch: 1680 Average train loss: -6694.2805 Average bpd: 3.144
====> [eval] Epoch: 1680 Average bpd: 3.294
====> [test] Epoch: 1680 Average bpd: 3.306
Best val_bpd: 3.2931489168960684
Best test_bpd: 3.3048374855490974
====> Epoch: 1681 Average train loss: -6692.7461 Average bpd: 3.143
====> Epoch: 1682 Average train loss: -6693.9750 Average bpd: 3.144
====> [eval] Epoch: 1682 Average bpd: 3.294
====> [test] Epoch: 1682 Average bpd: 3.306
Best val_bpd: 3.2931489168960684
Best test_bpd: 3.3048374855490974
====> Epoch: 1683 Average train loss: -6692.5012 Average bpd: 3.143
====> Epoch: 1684 Average train loss: -6692.2180 Average bpd: 3.143
====> [eval] Epoch: 1684 Average bpd: 3.293
====> [test] Epoch: 1684 Average bpd: 3.304
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1685 Average train loss: -6693.6341 Average bpd: 3.144
====> Epoch: 1686 Average train loss: -6691.3087 Average bpd: 3.142
====> [eval] Epoch: 1686 Average bpd: 3.294
====> [test] Epoch: 1686 Average bpd: 3.306
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1687 Average train loss: -6691.0522 Average bpd: 3.142
====> Epoch: 1688 Average train loss: -6691.5221 Average bpd: 3.143
====> [eval] Epoch: 1688 Average bpd: 3.295
====> [test] Epoch: 1688 Average bpd: 3.306
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1689 Average train loss: -6693.1424 Average bpd: 3.143
====> Epoch: 1690 Average train loss: -6690.1388 Average bpd: 3.142
====> [eval] Epoch: 1690 Average bpd: 3.293
====> [test] Epoch: 1690 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1691 Average train loss: -6692.1643 Average bpd: 3.143
====> Epoch: 1692 Average train loss: -6691.5504 Average bpd: 3.143
====> [eval] Epoch: 1692 Average bpd: 3.293
====> [test] Epoch: 1692 Average bpd: 3.304
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1693 Average train loss: -6690.5681 Average bpd: 3.142
====> Epoch: 1694 Average train loss: -6692.5300 Average bpd: 3.143
====> [eval] Epoch: 1694 Average bpd: 3.294
====> [test] Epoch: 1694 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1695 Average train loss: -6692.3137 Average bpd: 3.143
====> Epoch: 1696 Average train loss: -6691.6123 Average bpd: 3.143
====> [eval] Epoch: 1696 Average bpd: 3.293
====> [test] Epoch: 1696 Average bpd: 3.304
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1697 Average train loss: -6692.5683 Average bpd: 3.143
====> Epoch: 1698 Average train loss: -6690.0808 Average bpd: 3.142
====> [eval] Epoch: 1698 Average bpd: 3.294
====> [test] Epoch: 1698 Average bpd: 3.306
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1699 Average train loss: -6692.2615 Average bpd: 3.143
====> Epoch: 1700 Average train loss: -6690.3417 Average bpd: 3.142
====> [eval] Epoch: 1700 Average bpd: 3.293
====> [test] Epoch: 1700 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1701 Average train loss: -6690.6659 Average bpd: 3.142
====> Epoch: 1702 Average train loss: -6691.7570 Average bpd: 3.143
====> [eval] Epoch: 1702 Average bpd: 3.293
====> [test] Epoch: 1702 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1703 Average train loss: -6692.4143 Average bpd: 3.143
====> Epoch: 1704 Average train loss: -6691.4307 Average bpd: 3.142
====> [eval] Epoch: 1704 Average bpd: 3.293
====> [test] Epoch: 1704 Average bpd: 3.304
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1705 Average train loss: -6691.1366 Average bpd: 3.142
====> Epoch: 1706 Average train loss: -6690.3506 Average bpd: 3.142
====> [eval] Epoch: 1706 Average bpd: 3.293
====> [test] Epoch: 1706 Average bpd: 3.304
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1707 Average train loss: -6691.8670 Average bpd: 3.143
====> Epoch: 1708 Average train loss: -6690.6557 Average bpd: 3.142
====> [eval] Epoch: 1708 Average bpd: 3.293
====> [test] Epoch: 1708 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1709 Average train loss: -6690.3419 Average bpd: 3.142
====> Epoch: 1710 Average train loss: -6690.4288 Average bpd: 3.142
====> [eval] Epoch: 1710 Average bpd: 3.293
====> [test] Epoch: 1710 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1711 Average train loss: -6691.3927 Average bpd: 3.142
====> Epoch: 1712 Average train loss: -6689.5838 Average bpd: 3.142
====> [eval] Epoch: 1712 Average bpd: 3.293
====> [test] Epoch: 1712 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1713 Average train loss: -6690.7966 Average bpd: 3.142
====> Epoch: 1714 Average train loss: -6690.4577 Average bpd: 3.142
====> [eval] Epoch: 1714 Average bpd: 3.294
====> [test] Epoch: 1714 Average bpd: 3.306
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1715 Average train loss: -6689.8268 Average bpd: 3.142
====> Epoch: 1716 Average train loss: -6689.4869 Average bpd: 3.142
====> [eval] Epoch: 1716 Average bpd: 3.294
====> [test] Epoch: 1716 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1717 Average train loss: -6690.3541 Average bpd: 3.142
====> Epoch: 1718 Average train loss: -6690.2485 Average bpd: 3.142
====> [eval] Epoch: 1718 Average bpd: 3.293
====> [test] Epoch: 1718 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1719 Average train loss: -6689.9541 Average bpd: 3.142
====> Epoch: 1720 Average train loss: -6691.0270 Average bpd: 3.142
====> [eval] Epoch: 1720 Average bpd: 3.293
====> [test] Epoch: 1720 Average bpd: 3.305
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1721 Average train loss: -6689.6844 Average bpd: 3.142
====> Epoch: 1722 Average train loss: -6690.3603 Average bpd: 3.142
====> [eval] Epoch: 1722 Average bpd: 3.295
====> [test] Epoch: 1722 Average bpd: 3.307
Best val_bpd: 3.2926393968857646
Best test_bpd: 3.304412467799247
====> Epoch: 1723 Average train loss: -6690.1035 Average bpd: 3.142
====> Epoch: 1724 Average train loss: -6689.8122 Average bpd: 3.142
====> [eval] Epoch: 1724 Average bpd: 3.292
====> [test] Epoch: 1724 Average bpd: 3.304
Best val_bpd: 3.292283358692214
Best test_bpd: 3.3040774419670402
====> Epoch: 1725 Average train loss: -6690.2524 Average bpd: 3.142
====> Epoch: 1726 Average train loss: -6688.8564 Average bpd: 3.141
====> [eval] Epoch: 1726 Average bpd: 3.294
====> [test] Epoch: 1726 Average bpd: 3.306
Best val_bpd: 3.292283358692214
Best test_bpd: 3.3040774419670402
====> Epoch: 1727 Average train loss: -6689.6019 Average bpd: 3.142
====> Epoch: 1728 Average train loss: -6689.6325 Average bpd: 3.142
====> [eval] Epoch: 1728 Average bpd: 3.295
====> [test] Epoch: 1728 Average bpd: 3.307
Best val_bpd: 3.292283358692214
Best test_bpd: 3.3040774419670402
====> Epoch: 1729 Average train loss: -6690.6649 Average bpd: 3.142
====> Epoch: 1730 Average train loss: -6687.7658 Average bpd: 3.141
====> [eval] Epoch: 1730 Average bpd: 3.295
====> [test] Epoch: 1730 Average bpd: 3.306
Best val_bpd: 3.292283358692214
Best test_bpd: 3.3040774419670402
====> Epoch: 1731 Average train loss: -6688.6975 Average bpd: 3.141
====> Epoch: 1732 Average train loss: -6687.9703 Average bpd: 3.141
====> [eval] Epoch: 1732 Average bpd: 3.292
====> [test] Epoch: 1732 Average bpd: 3.304
Best val_bpd: 3.292283358692214
Best test_bpd: 3.3040774419670402
====> Epoch: 1733 Average train loss: -6688.7534 Average bpd: 3.141
====> Epoch: 1734 Average train loss: -6688.8847 Average bpd: 3.141
====> [eval] Epoch: 1734 Average bpd: 3.292
====> [test] Epoch: 1734 Average bpd: 3.303
Best val_bpd: 3.2919518675751647
Best test_bpd: 3.303402720647107
====> Epoch: 1735 Average train loss: -6688.6973 Average bpd: 3.141
====> Epoch: 1736 Average train loss: -6687.9956 Average bpd: 3.141
====> [eval] Epoch: 1736 Average bpd: 3.293
====> [test] Epoch: 1736 Average bpd: 3.304
Best val_bpd: 3.2919518675751647
Best test_bpd: 3.303402720647107
====> Epoch: 1737 Average train loss: -6687.4564 Average bpd: 3.141
====> Epoch: 1738 Average train loss: -6688.2975 Average bpd: 3.141
====> [eval] Epoch: 1738 Average bpd: 3.293
====> [test] Epoch: 1738 Average bpd: 3.305
Best val_bpd: 3.2919518675751647
Best test_bpd: 3.303402720647107
====> Epoch: 1739 Average train loss: -6689.1949 Average bpd: 3.141
====> Epoch: 1740 Average train loss: -6687.3487 Average bpd: 3.141
====> [eval] Epoch: 1740 Average bpd: 3.292
====> [test] Epoch: 1740 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1741 Average train loss: -6688.8724 Average bpd: 3.141
====> Epoch: 1742 Average train loss: -6687.4363 Average bpd: 3.141
====> [eval] Epoch: 1742 Average bpd: 3.293
====> [test] Epoch: 1742 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1743 Average train loss: -6689.3895 Average bpd: 3.142
====> Epoch: 1744 Average train loss: -6688.2222 Average bpd: 3.141
====> [eval] Epoch: 1744 Average bpd: 3.293
====> [test] Epoch: 1744 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1745 Average train loss: -6687.5138 Average bpd: 3.141
====> Epoch: 1746 Average train loss: -6688.5453 Average bpd: 3.141
====> [eval] Epoch: 1746 Average bpd: 3.292
====> [test] Epoch: 1746 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1747 Average train loss: -6688.4888 Average bpd: 3.141
====> Epoch: 1748 Average train loss: -6687.9612 Average bpd: 3.141
====> [eval] Epoch: 1748 Average bpd: 3.293
====> [test] Epoch: 1748 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1749 Average train loss: -6686.9507 Average bpd: 3.140
====> Epoch: 1750 Average train loss: -6688.5423 Average bpd: 3.141
====> [eval] Epoch: 1750 Average bpd: 3.293
====> [test] Epoch: 1750 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1751 Average train loss: -6688.6858 Average bpd: 3.141
====> Epoch: 1752 Average train loss: -6688.3147 Average bpd: 3.141
====> [eval] Epoch: 1752 Average bpd: 3.294
====> [test] Epoch: 1752 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1753 Average train loss: -6687.6844 Average bpd: 3.141
====> Epoch: 1754 Average train loss: -6687.7423 Average bpd: 3.141
====> [eval] Epoch: 1754 Average bpd: 3.294
====> [test] Epoch: 1754 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1755 Average train loss: -6687.5569 Average bpd: 3.141
====> Epoch: 1756 Average train loss: -6688.3755 Average bpd: 3.141
====> [eval] Epoch: 1756 Average bpd: 3.293
====> [test] Epoch: 1756 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1757 Average train loss: -6689.1915 Average bpd: 3.141
====> Epoch: 1758 Average train loss: -6687.8334 Average bpd: 3.141
====> [eval] Epoch: 1758 Average bpd: 3.293
====> [test] Epoch: 1758 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1759 Average train loss: -6687.4183 Average bpd: 3.141
====> Epoch: 1760 Average train loss: -6688.7825 Average bpd: 3.141
====> [eval] Epoch: 1760 Average bpd: 3.292
====> [test] Epoch: 1760 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1761 Average train loss: -6687.6939 Average bpd: 3.141
====> Epoch: 1762 Average train loss: -6687.7077 Average bpd: 3.141
====> [eval] Epoch: 1762 Average bpd: 3.292
====> [test] Epoch: 1762 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1763 Average train loss: -6687.0537 Average bpd: 3.140
====> Epoch: 1764 Average train loss: -6687.3535 Average bpd: 3.141
====> [eval] Epoch: 1764 Average bpd: 3.292
====> [test] Epoch: 1764 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1765 Average train loss: -6687.1746 Average bpd: 3.140
====> Epoch: 1766 Average train loss: -6686.9985 Average bpd: 3.140
====> [eval] Epoch: 1766 Average bpd: 3.293
====> [test] Epoch: 1766 Average bpd: 3.305
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1767 Average train loss: -6687.8151 Average bpd: 3.141
====> Epoch: 1768 Average train loss: -6686.7629 Average bpd: 3.140
====> [eval] Epoch: 1768 Average bpd: 3.293
====> [test] Epoch: 1768 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1769 Average train loss: -6686.5817 Average bpd: 3.140
====> Epoch: 1770 Average train loss: -6687.8613 Average bpd: 3.141
====> [eval] Epoch: 1770 Average bpd: 3.294
====> [test] Epoch: 1770 Average bpd: 3.306
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1771 Average train loss: -6687.3432 Average bpd: 3.141
====> Epoch: 1772 Average train loss: -6687.3890 Average bpd: 3.141
====> [eval] Epoch: 1772 Average bpd: 3.292
====> [test] Epoch: 1772 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1773 Average train loss: -6686.7633 Average bpd: 3.140
====> Epoch: 1774 Average train loss: -6686.6480 Average bpd: 3.140
====> [eval] Epoch: 1774 Average bpd: 3.292
====> [test] Epoch: 1774 Average bpd: 3.304
Best val_bpd: 3.291929083139419
Best test_bpd: 3.3036267879172345
====> Epoch: 1775 Average train loss: -6687.6922 Average bpd: 3.141
====> Epoch: 1776 Average train loss: -6685.2153 Average bpd: 3.140
====> [eval] Epoch: 1776 Average bpd: 3.292
====> [test] Epoch: 1776 Average bpd: 3.303
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1777 Average train loss: -6686.1788 Average bpd: 3.140
====> Epoch: 1778 Average train loss: -6687.0019 Average bpd: 3.140
====> [eval] Epoch: 1778 Average bpd: 3.293
====> [test] Epoch: 1778 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1779 Average train loss: -6684.2901 Average bpd: 3.139
====> Epoch: 1780 Average train loss: -6687.2282 Average bpd: 3.141
====> [eval] Epoch: 1780 Average bpd: 3.292
====> [test] Epoch: 1780 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1781 Average train loss: -6686.3355 Average bpd: 3.140
====> Epoch: 1782 Average train loss: -6685.8376 Average bpd: 3.140
====> [eval] Epoch: 1782 Average bpd: 3.292
====> [test] Epoch: 1782 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1783 Average train loss: -6685.5762 Average bpd: 3.140
====> Epoch: 1784 Average train loss: -6684.9221 Average bpd: 3.139
====> [eval] Epoch: 1784 Average bpd: 3.293
====> [test] Epoch: 1784 Average bpd: 3.305
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1785 Average train loss: -6685.6019 Average bpd: 3.140
====> Epoch: 1786 Average train loss: -6686.4874 Average bpd: 3.140
====> [eval] Epoch: 1786 Average bpd: 3.293
====> [test] Epoch: 1786 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1787 Average train loss: -6685.1282 Average bpd: 3.140
====> Epoch: 1788 Average train loss: -6685.8189 Average bpd: 3.140
====> [eval] Epoch: 1788 Average bpd: 3.293
====> [test] Epoch: 1788 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1789 Average train loss: -6686.0106 Average bpd: 3.140
====> Epoch: 1790 Average train loss: -6684.8095 Average bpd: 3.139
====> [eval] Epoch: 1790 Average bpd: 3.292
====> [test] Epoch: 1790 Average bpd: 3.304
Best val_bpd: 3.2918419859002324
Best test_bpd: 3.3032861751887017
====> Epoch: 1791 Average train loss: -6685.4833 Average bpd: 3.140
====> Epoch: 1792 Average train loss: -6687.0632 Average bpd: 3.140
====> [eval] Epoch: 1792 Average bpd: 3.292
====> [test] Epoch: 1792 Average bpd: 3.303
Best val_bpd: 3.2916245773339208
Best test_bpd: 3.3031592128064458
====> Epoch: 1793 Average train loss: -6684.7246 Average bpd: 3.139
====> Epoch: 1794 Average train loss: -6684.7360 Average bpd: 3.139
====> [eval] Epoch: 1794 Average bpd: 3.293
====> [test] Epoch: 1794 Average bpd: 3.304
Best val_bpd: 3.2916245773339208
Best test_bpd: 3.3031592128064458
====> Epoch: 1795 Average train loss: -6685.7195 Average bpd: 3.140
====> Epoch: 1796 Average train loss: -6685.1075 Average bpd: 3.140
====> [eval] Epoch: 1796 Average bpd: 3.294
====> [test] Epoch: 1796 Average bpd: 3.305
Best val_bpd: 3.2916245773339208
Best test_bpd: 3.3031592128064458
====> Epoch: 1797 Average train loss: -6685.5615 Average bpd: 3.140
====> Epoch: 1798 Average train loss: -6685.2172 Average bpd: 3.140
====> [eval] Epoch: 1798 Average bpd: 3.294
====> [test] Epoch: 1798 Average bpd: 3.305
Best val_bpd: 3.2916245773339208
Best test_bpd: 3.3031592128064458
====> Epoch: 1799 Average train loss: -6684.0839 Average bpd: 3.139
====> Epoch: 1800 Average train loss: -6683.7404 Average bpd: 3.139
====> [eval] Epoch: 1800 Average bpd: 3.293
====> [test] Epoch: 1800 Average bpd: 3.304
Best val_bpd: 3.2916245773339208
Best test_bpd: 3.3031592128064458
====> Epoch: 1801 Average train loss: -6684.6726 Average bpd: 3.139
====> Epoch: 1802 Average train loss: -6684.4197 Average bpd: 3.139
====> [eval] Epoch: 1802 Average bpd: 3.291
====> [test] Epoch: 1802 Average bpd: 3.303
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1803 Average train loss: -6683.8418 Average bpd: 3.139
====> Epoch: 1804 Average train loss: -6684.1412 Average bpd: 3.139
====> [eval] Epoch: 1804 Average bpd: 3.293
====> [test] Epoch: 1804 Average bpd: 3.305
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1805 Average train loss: -6684.4619 Average bpd: 3.139
====> Epoch: 1806 Average train loss: -6684.9935 Average bpd: 3.139
====> [eval] Epoch: 1806 Average bpd: 3.293
====> [test] Epoch: 1806 Average bpd: 3.304
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1807 Average train loss: -6683.1689 Average bpd: 3.139
====> Epoch: 1808 Average train loss: -6685.5074 Average bpd: 3.140
====> [eval] Epoch: 1808 Average bpd: 3.293
====> [test] Epoch: 1808 Average bpd: 3.305
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1809 Average train loss: -6684.1928 Average bpd: 3.139
====> Epoch: 1810 Average train loss: -6684.1301 Average bpd: 3.139
====> [eval] Epoch: 1810 Average bpd: 3.292
====> [test] Epoch: 1810 Average bpd: 3.303
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1811 Average train loss: -6684.5326 Average bpd: 3.139
====> Epoch: 1812 Average train loss: -6683.9329 Average bpd: 3.139
====> [eval] Epoch: 1812 Average bpd: 3.292
====> [test] Epoch: 1812 Average bpd: 3.304
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1813 Average train loss: -6684.3346 Average bpd: 3.139
====> Epoch: 1814 Average train loss: -6683.9738 Average bpd: 3.139
====> [eval] Epoch: 1814 Average bpd: 3.292
====> [test] Epoch: 1814 Average bpd: 3.303
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1815 Average train loss: -6684.4534 Average bpd: 3.139
====> Epoch: 1816 Average train loss: -6683.5972 Average bpd: 3.139
====> [eval] Epoch: 1816 Average bpd: 3.293
====> [test] Epoch: 1816 Average bpd: 3.304
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1817 Average train loss: -6683.1663 Average bpd: 3.139
====> Epoch: 1818 Average train loss: -6685.3694 Average bpd: 3.140
====> [eval] Epoch: 1818 Average bpd: 3.292
====> [test] Epoch: 1818 Average bpd: 3.303
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1819 Average train loss: -6683.7415 Average bpd: 3.139
====> Epoch: 1820 Average train loss: -6684.1885 Average bpd: 3.139
====> [eval] Epoch: 1820 Average bpd: 3.293
====> [test] Epoch: 1820 Average bpd: 3.304
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1821 Average train loss: -6683.6686 Average bpd: 3.139
====> Epoch: 1822 Average train loss: -6683.2415 Average bpd: 3.139
====> [eval] Epoch: 1822 Average bpd: 3.292
====> [test] Epoch: 1822 Average bpd: 3.304
Best val_bpd: 3.2914351997627045
Best test_bpd: 3.3028802431591067
====> Epoch: 1823 Average train loss: -6684.1520 Average bpd: 3.139
====> Epoch: 1824 Average train loss: -6683.7805 Average bpd: 3.139
====> [eval] Epoch: 1824 Average bpd: 3.291
====> [test] Epoch: 1824 Average bpd: 3.303
Best val_bpd: 3.2913145673579294
Best test_bpd: 3.302710169007703
====> Epoch: 1825 Average train loss: -6682.5553 Average bpd: 3.138
====> Epoch: 1826 Average train loss: -6682.7036 Average bpd: 3.138
====> [eval] Epoch: 1826 Average bpd: 3.291
====> [test] Epoch: 1826 Average bpd: 3.302
Best val_bpd: 3.2909602497165538
Best test_bpd: 3.302345432991143
====> Epoch: 1827 Average train loss: -6682.8453 Average bpd: 3.138
====> Epoch: 1828 Average train loss: -6683.0711 Average bpd: 3.139
====> [eval] Epoch: 1828 Average bpd: 3.291
====> [test] Epoch: 1828 Average bpd: 3.302
Best val_bpd: 3.2908288723872596
Best test_bpd: 3.302358804823665
====> Epoch: 1829 Average train loss: -6682.7225 Average bpd: 3.138
====> Epoch: 1830 Average train loss: -6683.5901 Average bpd: 3.139
====> [eval] Epoch: 1830 Average bpd: 3.291
====> [test] Epoch: 1830 Average bpd: 3.302
Best val_bpd: 3.290749931624848
Best test_bpd: 3.3023576270690556
====> Epoch: 1831 Average train loss: -6682.0526 Average bpd: 3.138
====> Epoch: 1832 Average train loss: -6682.8695 Average bpd: 3.138
====> [eval] Epoch: 1832 Average bpd: 3.291
====> [test] Epoch: 1832 Average bpd: 3.302
Best val_bpd: 3.290749931624848
Best test_bpd: 3.3023576270690556
====> Epoch: 1833 Average train loss: -6682.7599 Average bpd: 3.138
====> Epoch: 1834 Average train loss: -6681.9641 Average bpd: 3.138
====> [eval] Epoch: 1834 Average bpd: 3.292
====> [test] Epoch: 1834 Average bpd: 3.303
Best val_bpd: 3.290749931624848
Best test_bpd: 3.3023576270690556
====> Epoch: 1835 Average train loss: -6683.3354 Average bpd: 3.139
====> Epoch: 1836 Average train loss: -6681.9496 Average bpd: 3.138
====> [eval] Epoch: 1836 Average bpd: 3.291
====> [test] Epoch: 1836 Average bpd: 3.303
Best val_bpd: 3.290749931624848
Best test_bpd: 3.3023576270690556
====> Epoch: 1837 Average train loss: -6682.6059 Average bpd: 3.138
====> Epoch: 1838 Average train loss: -6682.6195 Average bpd: 3.138
====> [eval] Epoch: 1838 Average bpd: 3.290
====> [test] Epoch: 1838 Average bpd: 3.302
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1839 Average train loss: -6681.8944 Average bpd: 3.138
====> Epoch: 1840 Average train loss: -6681.5848 Average bpd: 3.138
====> [eval] Epoch: 1840 Average bpd: 3.291
====> [test] Epoch: 1840 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1841 Average train loss: -6681.2984 Average bpd: 3.138
====> Epoch: 1842 Average train loss: -6680.7980 Average bpd: 3.137
====> [eval] Epoch: 1842 Average bpd: 3.291
====> [test] Epoch: 1842 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1843 Average train loss: -6683.5201 Average bpd: 3.139
====> Epoch: 1844 Average train loss: -6681.8357 Average bpd: 3.138
====> [eval] Epoch: 1844 Average bpd: 3.292
====> [test] Epoch: 1844 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1845 Average train loss: -6682.4525 Average bpd: 3.138
====> Epoch: 1846 Average train loss: -6681.8172 Average bpd: 3.138
====> [eval] Epoch: 1846 Average bpd: 3.292
====> [test] Epoch: 1846 Average bpd: 3.304
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1847 Average train loss: -6681.0052 Average bpd: 3.138
====> Epoch: 1848 Average train loss: -6681.8632 Average bpd: 3.138
====> [eval] Epoch: 1848 Average bpd: 3.292
====> [test] Epoch: 1848 Average bpd: 3.304
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1849 Average train loss: -6681.8531 Average bpd: 3.138
====> Epoch: 1850 Average train loss: -6681.6504 Average bpd: 3.138
====> [eval] Epoch: 1850 Average bpd: 3.295
====> [test] Epoch: 1850 Average bpd: 3.306
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1851 Average train loss: -6681.1856 Average bpd: 3.138
====> Epoch: 1852 Average train loss: -6681.2601 Average bpd: 3.138
====> [eval] Epoch: 1852 Average bpd: 3.291
====> [test] Epoch: 1852 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1853 Average train loss: -6681.6636 Average bpd: 3.138
====> Epoch: 1854 Average train loss: -6681.9806 Average bpd: 3.138
====> [eval] Epoch: 1854 Average bpd: 3.292
====> [test] Epoch: 1854 Average bpd: 3.304
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1855 Average train loss: -6681.2396 Average bpd: 3.138
====> Epoch: 1856 Average train loss: -6681.8916 Average bpd: 3.138
====> [eval] Epoch: 1856 Average bpd: 3.291
====> [test] Epoch: 1856 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1857 Average train loss: -6681.7556 Average bpd: 3.138
====> Epoch: 1858 Average train loss: -6681.5795 Average bpd: 3.138
====> [eval] Epoch: 1858 Average bpd: 3.291
====> [test] Epoch: 1858 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1859 Average train loss: -6681.1971 Average bpd: 3.138
====> Epoch: 1860 Average train loss: -6681.0644 Average bpd: 3.138
====> [eval] Epoch: 1860 Average bpd: 3.291
====> [test] Epoch: 1860 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1861 Average train loss: -6680.4568 Average bpd: 3.137
====> Epoch: 1862 Average train loss: -6680.5646 Average bpd: 3.137
====> [eval] Epoch: 1862 Average bpd: 3.292
====> [test] Epoch: 1862 Average bpd: 3.304
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1863 Average train loss: -6680.8762 Average bpd: 3.138
====> Epoch: 1864 Average train loss: -6681.3133 Average bpd: 3.138
====> [eval] Epoch: 1864 Average bpd: 3.291
====> [test] Epoch: 1864 Average bpd: 3.302
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1865 Average train loss: -6681.1582 Average bpd: 3.138
====> Epoch: 1866 Average train loss: -6680.4304 Average bpd: 3.137
====> [eval] Epoch: 1866 Average bpd: 3.291
====> [test] Epoch: 1866 Average bpd: 3.302
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1867 Average train loss: -6680.0098 Average bpd: 3.137
====> Epoch: 1868 Average train loss: -6680.1665 Average bpd: 3.137
====> [eval] Epoch: 1868 Average bpd: 3.291
====> [test] Epoch: 1868 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1869 Average train loss: -6679.8586 Average bpd: 3.137
====> Epoch: 1870 Average train loss: -6681.6856 Average bpd: 3.138
====> [eval] Epoch: 1870 Average bpd: 3.292
====> [test] Epoch: 1870 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1871 Average train loss: -6681.0048 Average bpd: 3.138
====> Epoch: 1872 Average train loss: -6680.7318 Average bpd: 3.137
====> [eval] Epoch: 1872 Average bpd: 3.292
====> [test] Epoch: 1872 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1873 Average train loss: -6681.2222 Average bpd: 3.138
====> Epoch: 1874 Average train loss: -6680.4868 Average bpd: 3.137
====> [eval] Epoch: 1874 Average bpd: 3.292
====> [test] Epoch: 1874 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1875 Average train loss: -6679.3687 Average bpd: 3.137
====> Epoch: 1876 Average train loss: -6680.3150 Average bpd: 3.137
====> [eval] Epoch: 1876 Average bpd: 3.291
====> [test] Epoch: 1876 Average bpd: 3.302
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1877 Average train loss: -6680.2795 Average bpd: 3.137
====> Epoch: 1878 Average train loss: -6680.8322 Average bpd: 3.138
====> [eval] Epoch: 1878 Average bpd: 3.292
====> [test] Epoch: 1878 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1879 Average train loss: -6680.3106 Average bpd: 3.137
====> Epoch: 1880 Average train loss: -6679.7172 Average bpd: 3.137
====> [eval] Epoch: 1880 Average bpd: 3.291
====> [test] Epoch: 1880 Average bpd: 3.303
Best val_bpd: 3.29045142416784
Best test_bpd: 3.3020903399056682
====> Epoch: 1881 Average train loss: -6680.3017 Average bpd: 3.137
====> Epoch: 1882 Average train loss: -6678.7429 Average bpd: 3.137
====> [eval] Epoch: 1882 Average bpd: 3.290
====> [test] Epoch: 1882 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1883 Average train loss: -6680.2696 Average bpd: 3.137
====> Epoch: 1884 Average train loss: -6679.5825 Average bpd: 3.137
====> [eval] Epoch: 1884 Average bpd: 3.291
====> [test] Epoch: 1884 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1885 Average train loss: -6680.9639 Average bpd: 3.138
====> Epoch: 1886 Average train loss: -6679.3976 Average bpd: 3.137
====> [eval] Epoch: 1886 Average bpd: 3.292
====> [test] Epoch: 1886 Average bpd: 3.303
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1887 Average train loss: -6679.0799 Average bpd: 3.137
====> Epoch: 1888 Average train loss: -6679.3523 Average bpd: 3.137
====> [eval] Epoch: 1888 Average bpd: 3.291
====> [test] Epoch: 1888 Average bpd: 3.303
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1889 Average train loss: -6679.0646 Average bpd: 3.137
====> Epoch: 1890 Average train loss: -6680.0081 Average bpd: 3.137
====> [eval] Epoch: 1890 Average bpd: 3.290
====> [test] Epoch: 1890 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1891 Average train loss: -6678.6619 Average bpd: 3.136
====> Epoch: 1892 Average train loss: -6680.2892 Average bpd: 3.137
====> [eval] Epoch: 1892 Average bpd: 3.291
====> [test] Epoch: 1892 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1893 Average train loss: -6679.0949 Average bpd: 3.137
====> Epoch: 1894 Average train loss: -6678.8003 Average bpd: 3.137
====> [eval] Epoch: 1894 Average bpd: 3.291
====> [test] Epoch: 1894 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1895 Average train loss: -6680.0722 Average bpd: 3.137
====> Epoch: 1896 Average train loss: -6678.6383 Average bpd: 3.136
====> [eval] Epoch: 1896 Average bpd: 3.290
====> [test] Epoch: 1896 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1897 Average train loss: -6679.3728 Average bpd: 3.137
====> Epoch: 1898 Average train loss: -6679.3396 Average bpd: 3.137
====> [eval] Epoch: 1898 Average bpd: 3.291
====> [test] Epoch: 1898 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1899 Average train loss: -6679.2759 Average bpd: 3.137
====> Epoch: 1900 Average train loss: -6679.1976 Average bpd: 3.137
====> [eval] Epoch: 1900 Average bpd: 3.290
====> [test] Epoch: 1900 Average bpd: 3.302
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1901 Average train loss: -6679.1704 Average bpd: 3.137
====> Epoch: 1902 Average train loss: -6678.6851 Average bpd: 3.136
====> [eval] Epoch: 1902 Average bpd: 3.292
====> [test] Epoch: 1902 Average bpd: 3.303
Best val_bpd: 3.290284442075825
Best test_bpd: 3.3017868239074937
====> Epoch: 1903 Average train loss: -6677.7469 Average bpd: 3.136
====> Epoch: 1904 Average train loss: -6678.5569 Average bpd: 3.136
====> [eval] Epoch: 1904 Average bpd: 3.290
====> [test] Epoch: 1904 Average bpd: 3.302
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1905 Average train loss: -6679.1373 Average bpd: 3.137
====> Epoch: 1906 Average train loss: -6677.8980 Average bpd: 3.136
====> [eval] Epoch: 1906 Average bpd: 3.291
====> [test] Epoch: 1906 Average bpd: 3.302
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1907 Average train loss: -6679.0801 Average bpd: 3.137
====> Epoch: 1908 Average train loss: -6677.6377 Average bpd: 3.136
====> [eval] Epoch: 1908 Average bpd: 3.291
====> [test] Epoch: 1908 Average bpd: 3.302
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1909 Average train loss: -6679.7351 Average bpd: 3.137
====> Epoch: 1910 Average train loss: -6678.5597 Average bpd: 3.136
====> [eval] Epoch: 1910 Average bpd: 3.293
====> [test] Epoch: 1910 Average bpd: 3.305
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1911 Average train loss: -6680.7653 Average bpd: 3.137
====> Epoch: 1912 Average train loss: -6677.1705 Average bpd: 3.136
====> [eval] Epoch: 1912 Average bpd: 3.292
====> [test] Epoch: 1912 Average bpd: 3.304
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1913 Average train loss: -6678.2288 Average bpd: 3.136
====> Epoch: 1914 Average train loss: -6678.2146 Average bpd: 3.136
====> [eval] Epoch: 1914 Average bpd: 3.291
====> [test] Epoch: 1914 Average bpd: 3.302
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1915 Average train loss: -6679.3877 Average bpd: 3.137
====> Epoch: 1916 Average train loss: -6678.2107 Average bpd: 3.136
====> [eval] Epoch: 1916 Average bpd: 3.292
====> [test] Epoch: 1916 Average bpd: 3.304
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1917 Average train loss: -6678.8666 Average bpd: 3.137
====> Epoch: 1918 Average train loss: -6678.3077 Average bpd: 3.136
====> [eval] Epoch: 1918 Average bpd: 3.291
====> [test] Epoch: 1918 Average bpd: 3.302
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1919 Average train loss: -6677.1811 Average bpd: 3.136
====> Epoch: 1920 Average train loss: -6678.1001 Average bpd: 3.136
====> [eval] Epoch: 1920 Average bpd: 3.292
====> [test] Epoch: 1920 Average bpd: 3.304
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1921 Average train loss: -6677.3135 Average bpd: 3.136
====> Epoch: 1922 Average train loss: -6677.1636 Average bpd: 3.136
====> [eval] Epoch: 1922 Average bpd: 3.291
====> [test] Epoch: 1922 Average bpd: 3.303
Best val_bpd: 3.2901681150426447
Best test_bpd: 3.3018091018289346
====> Epoch: 1923 Average train loss: -6678.0845 Average bpd: 3.136
====> Epoch: 1924 Average train loss: -6678.2518 Average bpd: 3.136
====> [eval] Epoch: 1924 Average bpd: 3.290
====> [test] Epoch: 1924 Average bpd: 3.301
Best val_bpd: 3.2898801594832157
Best test_bpd: 3.301323181902053
====> Epoch: 1925 Average train loss: -6677.0296 Average bpd: 3.136
====> Epoch: 1926 Average train loss: -6676.8860 Average bpd: 3.136
====> [eval] Epoch: 1926 Average bpd: 3.293
====> [test] Epoch: 1926 Average bpd: 3.305
Best val_bpd: 3.2898801594832157
Best test_bpd: 3.301323181902053
====> Epoch: 1927 Average train loss: -6676.3444 Average bpd: 3.135
====> Epoch: 1928 Average train loss: -6677.3583 Average bpd: 3.136
====> [eval] Epoch: 1928 Average bpd: 3.291
====> [test] Epoch: 1928 Average bpd: 3.302
Best val_bpd: 3.2898801594832157
Best test_bpd: 3.301323181902053
====> Epoch: 1929 Average train loss: -6677.6707 Average bpd: 3.136
====> Epoch: 1930 Average train loss: -6677.7488 Average bpd: 3.136
====> [eval] Epoch: 1930 Average bpd: 3.290
====> [test] Epoch: 1930 Average bpd: 3.302
Best val_bpd: 3.2898801594832157
Best test_bpd: 3.301323181902053
====> Epoch: 1931 Average train loss: -6678.7484 Average bpd: 3.137
====> Epoch: 1932 Average train loss: -6676.8821 Average bpd: 3.136
====> [eval] Epoch: 1932 Average bpd: 3.290
====> [test] Epoch: 1932 Average bpd: 3.301
Best val_bpd: 3.2898471511505583
Best test_bpd: 3.3014606025710793
====> Epoch: 1933 Average train loss: -6677.6582 Average bpd: 3.136
====> Epoch: 1934 Average train loss: -6676.4448 Average bpd: 3.135
====> [eval] Epoch: 1934 Average bpd: 3.290
====> [test] Epoch: 1934 Average bpd: 3.301
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1935 Average train loss: -6676.3104 Average bpd: 3.135
====> Epoch: 1936 Average train loss: -6676.9792 Average bpd: 3.136
====> [eval] Epoch: 1936 Average bpd: 3.290
====> [test] Epoch: 1936 Average bpd: 3.301
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1937 Average train loss: -6677.2109 Average bpd: 3.136
====> Epoch: 1938 Average train loss: -6676.0439 Average bpd: 3.135
====> [eval] Epoch: 1938 Average bpd: 3.291
====> [test] Epoch: 1938 Average bpd: 3.303
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1939 Average train loss: -6676.1182 Average bpd: 3.135
====> Epoch: 1940 Average train loss: -6677.5589 Average bpd: 3.136
====> [eval] Epoch: 1940 Average bpd: 3.292
====> [test] Epoch: 1940 Average bpd: 3.303
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1941 Average train loss: -6676.6111 Average bpd: 3.136
====> Epoch: 1942 Average train loss: -6676.4189 Average bpd: 3.135
====> [eval] Epoch: 1942 Average bpd: 3.291
====> [test] Epoch: 1942 Average bpd: 3.302
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1943 Average train loss: -6676.9406 Average bpd: 3.136
====> Epoch: 1944 Average train loss: -6675.5542 Average bpd: 3.135
====> [eval] Epoch: 1944 Average bpd: 3.290
====> [test] Epoch: 1944 Average bpd: 3.302
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1945 Average train loss: -6676.8673 Average bpd: 3.136
====> Epoch: 1946 Average train loss: -6676.9372 Average bpd: 3.136
====> [eval] Epoch: 1946 Average bpd: 3.291
====> [test] Epoch: 1946 Average bpd: 3.302
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1947 Average train loss: -6676.4532 Average bpd: 3.135
====> Epoch: 1948 Average train loss: -6676.0810 Average bpd: 3.135
====> [eval] Epoch: 1948 Average bpd: 3.290
====> [test] Epoch: 1948 Average bpd: 3.302
Best val_bpd: 3.2898392421256135
Best test_bpd: 3.301249605982418
====> Epoch: 1949 Average train loss: -6676.5610 Average bpd: 3.135
====> Epoch: 1950 Average train loss: -6676.6860 Average bpd: 3.136
====> [eval] Epoch: 1950 Average bpd: 3.290
====> [test] Epoch: 1950 Average bpd: 3.301
Best val_bpd: 3.2895641838136673
Best test_bpd: 3.3013479742533907
====> Epoch: 1951 Average train loss: -6675.9195 Average bpd: 3.135
====> Epoch: 1952 Average train loss: -6676.6899 Average bpd: 3.136
====> [eval] Epoch: 1952 Average bpd: 3.290
====> [test] Epoch: 1952 Average bpd: 3.301
Best val_bpd: 3.2895641838136673
Best test_bpd: 3.3013479742533907
====> Epoch: 1953 Average train loss: -6677.0772 Average bpd: 3.136
====> Epoch: 1954 Average train loss: -6675.5913 Average bpd: 3.135
====> [eval] Epoch: 1954 Average bpd: 3.290
====> [test] Epoch: 1954 Average bpd: 3.302
Best val_bpd: 3.2895641838136673
Best test_bpd: 3.3013479742533907
====> Epoch: 1955 Average train loss: -6675.7671 Average bpd: 3.135
====> Epoch: 1956 Average train loss: -6676.3713 Average bpd: 3.135
====> [eval] Epoch: 1956 Average bpd: 3.290
====> [test] Epoch: 1956 Average bpd: 3.301
Best val_bpd: 3.2895641838136673
Best test_bpd: 3.3013479742533907
====> Epoch: 1957 Average train loss: -6675.3227 Average bpd: 3.135
====> Epoch: 1958 Average train loss: -6675.8108 Average bpd: 3.135
====> [eval] Epoch: 1958 Average bpd: 3.289
====> [test] Epoch: 1958 Average bpd: 3.301
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1959 Average train loss: -6673.4149 Average bpd: 3.134
====> Epoch: 1960 Average train loss: -6674.5638 Average bpd: 3.135
====> [eval] Epoch: 1960 Average bpd: 3.290
====> [test] Epoch: 1960 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1961 Average train loss: -6675.1662 Average bpd: 3.135
====> Epoch: 1962 Average train loss: -6675.2323 Average bpd: 3.135
====> [eval] Epoch: 1962 Average bpd: 3.291
====> [test] Epoch: 1962 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1963 Average train loss: -6675.4062 Average bpd: 3.135
====> Epoch: 1964 Average train loss: -6675.7201 Average bpd: 3.135
====> [eval] Epoch: 1964 Average bpd: 3.291
====> [test] Epoch: 1964 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1965 Average train loss: -6674.7354 Average bpd: 3.135
====> Epoch: 1966 Average train loss: -6675.2001 Average bpd: 3.135
====> [eval] Epoch: 1966 Average bpd: 3.290
====> [test] Epoch: 1966 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1967 Average train loss: -6674.8141 Average bpd: 3.135
====> Epoch: 1968 Average train loss: -6676.1077 Average bpd: 3.135
====> [eval] Epoch: 1968 Average bpd: 3.290
====> [test] Epoch: 1968 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1969 Average train loss: -6674.7204 Average bpd: 3.135
====> Epoch: 1970 Average train loss: -6674.3200 Average bpd: 3.134
====> [eval] Epoch: 1970 Average bpd: 3.290
====> [test] Epoch: 1970 Average bpd: 3.301
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1971 Average train loss: -6675.5365 Average bpd: 3.135
====> Epoch: 1972 Average train loss: -6674.2183 Average bpd: 3.134
====> [eval] Epoch: 1972 Average bpd: 3.290
====> [test] Epoch: 1972 Average bpd: 3.302
Best val_bpd: 3.2894726411495068
Best test_bpd: 3.301182608217762
====> Epoch: 1973 Average train loss: -6674.6773 Average bpd: 3.135
====> Epoch: 1974 Average train loss: -6674.8452 Average bpd: 3.135
====> [eval] Epoch: 1974 Average bpd: 3.289
====> [test] Epoch: 1974 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1975 Average train loss: -6675.1778 Average bpd: 3.135
====> Epoch: 1976 Average train loss: -6673.1844 Average bpd: 3.134
====> [eval] Epoch: 1976 Average bpd: 3.290
====> [test] Epoch: 1976 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1977 Average train loss: -6674.0742 Average bpd: 3.134
====> Epoch: 1978 Average train loss: -6674.3336 Average bpd: 3.134
====> [eval] Epoch: 1978 Average bpd: 3.290
====> [test] Epoch: 1978 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1979 Average train loss: -6674.9162 Average bpd: 3.135
====> Epoch: 1980 Average train loss: -6674.3948 Average bpd: 3.134
====> [eval] Epoch: 1980 Average bpd: 3.290
====> [test] Epoch: 1980 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1981 Average train loss: -6673.3664 Average bpd: 3.134
====> Epoch: 1982 Average train loss: -6675.3253 Average bpd: 3.135
====> [eval] Epoch: 1982 Average bpd: 3.290
====> [test] Epoch: 1982 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1983 Average train loss: -6673.9348 Average bpd: 3.134
====> Epoch: 1984 Average train loss: -6673.0726 Average bpd: 3.134
====> [eval] Epoch: 1984 Average bpd: 3.290
====> [test] Epoch: 1984 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1985 Average train loss: -6674.3412 Average bpd: 3.134
====> Epoch: 1986 Average train loss: -6674.7362 Average bpd: 3.135
====> [eval] Epoch: 1986 Average bpd: 3.291
====> [test] Epoch: 1986 Average bpd: 3.302
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1987 Average train loss: -6672.7673 Average bpd: 3.134
====> Epoch: 1988 Average train loss: -6674.9043 Average bpd: 3.135
====> [eval] Epoch: 1988 Average bpd: 3.289
====> [test] Epoch: 1988 Average bpd: 3.301
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1989 Average train loss: -6674.6946 Average bpd: 3.135
====> Epoch: 1990 Average train loss: -6674.0857 Average bpd: 3.134
====> [eval] Epoch: 1990 Average bpd: 3.291
====> [test] Epoch: 1990 Average bpd: 3.303
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1991 Average train loss: -6675.2893 Average bpd: 3.135
====> Epoch: 1992 Average train loss: -6673.6646 Average bpd: 3.134
====> [eval] Epoch: 1992 Average bpd: 3.290
====> [test] Epoch: 1992 Average bpd: 3.302
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1993 Average train loss: -6672.5455 Average bpd: 3.134
====> Epoch: 1994 Average train loss: -6673.9882 Average bpd: 3.134
====> [eval] Epoch: 1994 Average bpd: 3.291
====> [test] Epoch: 1994 Average bpd: 3.302
Best val_bpd: 3.2892325540945673
Best test_bpd: 3.3008492064242585
====> Epoch: 1995 Average train loss: -6674.2132 Average bpd: 3.134
====> Epoch: 1996 Average train loss: -6674.9963 Average bpd: 3.135
====> [eval] Epoch: 1996 Average bpd: 3.289
====> [test] Epoch: 1996 Average bpd: 3.301
Best val_bpd: 3.2892149632446115
Best test_bpd: 3.3006101389289157
====> Epoch: 1997 Average train loss: -6673.5778 Average bpd: 3.134
====> Epoch: 1998 Average train loss: -6674.3743 Average bpd: 3.134
====> [eval] Epoch: 1998 Average bpd: 3.290
====> [test] Epoch: 1998 Average bpd: 3.302
Best val_bpd: 3.2892149632446115
Best test_bpd: 3.3006101389289157
====> Epoch: 1999 Average train loss: -6673.6891 Average bpd: 3.134
====> Epoch: 2000 Average train loss: -6672.8206 Average bpd: 3.134
====> [eval] Epoch: 2000 Average bpd: 3.291
====> [test] Epoch: 2000 Average bpd: 3.303
Best val_bpd: 3.2892149632446115
Best test_bpd: 3.3006101389289157
====> Epoch: 2001 Average train loss: -6674.5130 Average bpd: 3.135
====> Epoch: 2002 Average train loss: -6672.6579 Average bpd: 3.134
====> [eval] Epoch: 2002 Average bpd: 3.289
====> [test] Epoch: 2002 Average bpd: 3.301
Best val_bpd: 3.2892149632446115
Best test_bpd: 3.3006101389289157
====> Epoch: 2003 Average train loss: -6672.9968 Average bpd: 3.134
====> Epoch: 2004 Average train loss: -6675.5408 Average bpd: 3.135
====> [eval] Epoch: 2004 Average bpd: 3.289
====> [test] Epoch: 2004 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2005 Average train loss: -6673.7729 Average bpd: 3.134
====> Epoch: 2006 Average train loss: -6673.5362 Average bpd: 3.134
====> [eval] Epoch: 2006 Average bpd: 3.289
====> [test] Epoch: 2006 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2007 Average train loss: -6673.5392 Average bpd: 3.134
====> Epoch: 2008 Average train loss: -6672.9267 Average bpd: 3.134
====> [eval] Epoch: 2008 Average bpd: 3.289
====> [test] Epoch: 2008 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2009 Average train loss: -6672.5377 Average bpd: 3.134
====> Epoch: 2010 Average train loss: -6672.4370 Average bpd: 3.134
====> [eval] Epoch: 2010 Average bpd: 3.290
====> [test] Epoch: 2010 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2011 Average train loss: -6672.8552 Average bpd: 3.134
====> Epoch: 2012 Average train loss: -6673.4799 Average bpd: 3.134
====> [eval] Epoch: 2012 Average bpd: 3.290
====> [test] Epoch: 2012 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2013 Average train loss: -6673.4180 Average bpd: 3.134
====> Epoch: 2014 Average train loss: -6673.5711 Average bpd: 3.134
====> [eval] Epoch: 2014 Average bpd: 3.289
====> [test] Epoch: 2014 Average bpd: 3.300
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2015 Average train loss: -6673.5517 Average bpd: 3.134
====> Epoch: 2016 Average train loss: -6674.5109 Average bpd: 3.135
====> [eval] Epoch: 2016 Average bpd: 3.289
====> [test] Epoch: 2016 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2017 Average train loss: -6673.3436 Average bpd: 3.134
====> Epoch: 2018 Average train loss: -6672.8190 Average bpd: 3.134
====> [eval] Epoch: 2018 Average bpd: 3.289
====> [test] Epoch: 2018 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2019 Average train loss: -6670.0025 Average bpd: 3.132
====> Epoch: 2020 Average train loss: -6671.1679 Average bpd: 3.133
====> [eval] Epoch: 2020 Average bpd: 3.290
====> [test] Epoch: 2020 Average bpd: 3.302
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2021 Average train loss: -6671.6961 Average bpd: 3.133
====> Epoch: 2022 Average train loss: -6671.6907 Average bpd: 3.133
====> [eval] Epoch: 2022 Average bpd: 3.290
====> [test] Epoch: 2022 Average bpd: 3.302
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2023 Average train loss: -6672.1924 Average bpd: 3.133
====> Epoch: 2024 Average train loss: -6672.1263 Average bpd: 3.133
====> [eval] Epoch: 2024 Average bpd: 3.291
====> [test] Epoch: 2024 Average bpd: 3.302
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2025 Average train loss: -6671.4427 Average bpd: 3.133
====> Epoch: 2026 Average train loss: -6673.0586 Average bpd: 3.134
====> [eval] Epoch: 2026 Average bpd: 3.290
====> [test] Epoch: 2026 Average bpd: 3.301
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2027 Average train loss: -6672.0786 Average bpd: 3.133
====> Epoch: 2028 Average train loss: -6670.7788 Average bpd: 3.133
====> [eval] Epoch: 2028 Average bpd: 3.290
====> [test] Epoch: 2028 Average bpd: 3.302
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2029 Average train loss: -6671.5923 Average bpd: 3.133
====> Epoch: 2030 Average train loss: -6671.3831 Average bpd: 3.133
====> [eval] Epoch: 2030 Average bpd: 3.290
====> [test] Epoch: 2030 Average bpd: 3.302
Best val_bpd: 3.288936608235692
Best test_bpd: 3.3005189511150337
====> Epoch: 2031 Average train loss: -6672.6629 Average bpd: 3.134
====> Epoch: 2032 Average train loss: -6671.4312 Average bpd: 3.133
====> [eval] Epoch: 2032 Average bpd: 3.288
====> [test] Epoch: 2032 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2033 Average train loss: -6671.9593 Average bpd: 3.133
====> Epoch: 2034 Average train loss: -6671.7243 Average bpd: 3.133
====> [eval] Epoch: 2034 Average bpd: 3.289
====> [test] Epoch: 2034 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2035 Average train loss: -6671.1840 Average bpd: 3.133
====> Epoch: 2036 Average train loss: -6673.3616 Average bpd: 3.134
====> [eval] Epoch: 2036 Average bpd: 3.289
====> [test] Epoch: 2036 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2037 Average train loss: -6671.0674 Average bpd: 3.133
====> Epoch: 2038 Average train loss: -6671.5571 Average bpd: 3.133
====> [eval] Epoch: 2038 Average bpd: 3.290
====> [test] Epoch: 2038 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2039 Average train loss: -6671.7852 Average bpd: 3.133
====> Epoch: 2040 Average train loss: -6671.1222 Average bpd: 3.133
====> [eval] Epoch: 2040 Average bpd: 3.289
====> [test] Epoch: 2040 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2041 Average train loss: -6672.0176 Average bpd: 3.133
====> Epoch: 2042 Average train loss: -6670.8581 Average bpd: 3.133
====> [eval] Epoch: 2042 Average bpd: 3.289
====> [test] Epoch: 2042 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2043 Average train loss: -6671.9402 Average bpd: 3.133
====> Epoch: 2044 Average train loss: -6670.3029 Average bpd: 3.133
====> [eval] Epoch: 2044 Average bpd: 3.290
====> [test] Epoch: 2044 Average bpd: 3.302
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2045 Average train loss: -6671.2440 Average bpd: 3.133
====> Epoch: 2046 Average train loss: -6671.6874 Average bpd: 3.133
====> [eval] Epoch: 2046 Average bpd: 3.288
====> [test] Epoch: 2046 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2047 Average train loss: -6672.1042 Average bpd: 3.133
====> Epoch: 2048 Average train loss: -6671.6156 Average bpd: 3.133
====> [eval] Epoch: 2048 Average bpd: 3.289
====> [test] Epoch: 2048 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2049 Average train loss: -6670.6336 Average bpd: 3.133
====> Epoch: 2050 Average train loss: -6671.1807 Average bpd: 3.133
====> [eval] Epoch: 2050 Average bpd: 3.290
====> [test] Epoch: 2050 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2051 Average train loss: -6671.2386 Average bpd: 3.133
====> Epoch: 2052 Average train loss: -6671.1506 Average bpd: 3.133
====> [eval] Epoch: 2052 Average bpd: 3.290
====> [test] Epoch: 2052 Average bpd: 3.302
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2053 Average train loss: -6670.6512 Average bpd: 3.133
====> Epoch: 2054 Average train loss: -6669.8699 Average bpd: 3.132
====> [eval] Epoch: 2054 Average bpd: 3.290
====> [test] Epoch: 2054 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2055 Average train loss: -6669.4397 Average bpd: 3.132
====> Epoch: 2056 Average train loss: -6670.8458 Average bpd: 3.133
====> [eval] Epoch: 2056 Average bpd: 3.290
====> [test] Epoch: 2056 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2057 Average train loss: -6670.2400 Average bpd: 3.133
====> Epoch: 2058 Average train loss: -6670.2717 Average bpd: 3.133
====> [eval] Epoch: 2058 Average bpd: 3.289
====> [test] Epoch: 2058 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2059 Average train loss: -6670.4508 Average bpd: 3.133
====> Epoch: 2060 Average train loss: -6670.8073 Average bpd: 3.133
====> [eval] Epoch: 2060 Average bpd: 3.289
====> [test] Epoch: 2060 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2061 Average train loss: -6670.3896 Average bpd: 3.133
====> Epoch: 2062 Average train loss: -6669.3653 Average bpd: 3.132
====> [eval] Epoch: 2062 Average bpd: 3.289
====> [test] Epoch: 2062 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2063 Average train loss: -6670.9499 Average bpd: 3.133
====> Epoch: 2064 Average train loss: -6669.5799 Average bpd: 3.132
====> [eval] Epoch: 2064 Average bpd: 3.289
====> [test] Epoch: 2064 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2065 Average train loss: -6669.8262 Average bpd: 3.132
====> Epoch: 2066 Average train loss: -6670.7639 Average bpd: 3.133
====> [eval] Epoch: 2066 Average bpd: 3.290
====> [test] Epoch: 2066 Average bpd: 3.302
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2067 Average train loss: -6670.8349 Average bpd: 3.133
====> Epoch: 2068 Average train loss: -6668.8572 Average bpd: 3.132
====> [eval] Epoch: 2068 Average bpd: 3.289
====> [test] Epoch: 2068 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2069 Average train loss: -6670.0764 Average bpd: 3.132
====> Epoch: 2070 Average train loss: -6671.0548 Average bpd: 3.133
====> [eval] Epoch: 2070 Average bpd: 3.290
====> [test] Epoch: 2070 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2071 Average train loss: -6669.6260 Average bpd: 3.132
====> Epoch: 2072 Average train loss: -6669.3853 Average bpd: 3.132
====> [eval] Epoch: 2072 Average bpd: 3.291
====> [test] Epoch: 2072 Average bpd: 3.302
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2073 Average train loss: -6669.9254 Average bpd: 3.132
====> Epoch: 2074 Average train loss: -6669.0491 Average bpd: 3.132
====> [eval] Epoch: 2074 Average bpd: 3.289
====> [test] Epoch: 2074 Average bpd: 3.300
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2075 Average train loss: -6669.5498 Average bpd: 3.132
====> Epoch: 2076 Average train loss: -6669.0231 Average bpd: 3.132
====> [eval] Epoch: 2076 Average bpd: 3.289
====> [test] Epoch: 2076 Average bpd: 3.301
Best val_bpd: 3.288379488942684
Best test_bpd: 3.300074809910339
====> Epoch: 2077 Average train loss: -6669.6994 Average bpd: 3.132
====> Epoch: 2078 Average train loss: -6668.6893 Average bpd: 3.132
====> [eval] Epoch: 2078 Average bpd: 3.288
====> [test] Epoch: 2078 Average bpd: 3.300
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2079 Average train loss: -6668.9866 Average bpd: 3.132
====> Epoch: 2080 Average train loss: -6669.5454 Average bpd: 3.132
====> [eval] Epoch: 2080 Average bpd: 3.289
====> [test] Epoch: 2080 Average bpd: 3.301
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2081 Average train loss: -6669.2152 Average bpd: 3.132
====> Epoch: 2082 Average train loss: -6669.3402 Average bpd: 3.132
====> [eval] Epoch: 2082 Average bpd: 3.288
====> [test] Epoch: 2082 Average bpd: 3.300
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2083 Average train loss: -6669.1785 Average bpd: 3.132
====> Epoch: 2084 Average train loss: -6669.1364 Average bpd: 3.132
====> [eval] Epoch: 2084 Average bpd: 3.289
====> [test] Epoch: 2084 Average bpd: 3.301
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2085 Average train loss: -6668.8385 Average bpd: 3.132
====> Epoch: 2086 Average train loss: -6669.3991 Average bpd: 3.132
====> [eval] Epoch: 2086 Average bpd: 3.289
====> [test] Epoch: 2086 Average bpd: 3.301
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2087 Average train loss: -6670.3029 Average bpd: 3.133
====> Epoch: 2088 Average train loss: -6667.3753 Average bpd: 3.131
====> [eval] Epoch: 2088 Average bpd: 3.289
====> [test] Epoch: 2088 Average bpd: 3.300
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2089 Average train loss: -6669.0761 Average bpd: 3.132
====> Epoch: 2090 Average train loss: -6669.7587 Average bpd: 3.132
====> [eval] Epoch: 2090 Average bpd: 3.289
====> [test] Epoch: 2090 Average bpd: 3.300
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2091 Average train loss: -6669.3055 Average bpd: 3.132
====> Epoch: 2092 Average train loss: -6669.0159 Average bpd: 3.132
====> [eval] Epoch: 2092 Average bpd: 3.289
====> [test] Epoch: 2092 Average bpd: 3.300
Best val_bpd: 3.288302189634941
Best test_bpd: 3.2999949323141626
====> Epoch: 2093 Average train loss: -6667.9209 Average bpd: 3.131
====> Epoch: 2094 Average train loss: -6667.6699 Average bpd: 3.131
====> [eval] Epoch: 2094 Average bpd: 3.288
====> [test] Epoch: 2094 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2095 Average train loss: -6669.3495 Average bpd: 3.132
====> Epoch: 2096 Average train loss: -6669.4928 Average bpd: 3.132
====> [eval] Epoch: 2096 Average bpd: 3.289
====> [test] Epoch: 2096 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2097 Average train loss: -6667.9804 Average bpd: 3.131
====> Epoch: 2098 Average train loss: -6669.3646 Average bpd: 3.132
====> [eval] Epoch: 2098 Average bpd: 3.288
====> [test] Epoch: 2098 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2099 Average train loss: -6668.9455 Average bpd: 3.132
====> Epoch: 2100 Average train loss: -6668.1487 Average bpd: 3.132
====> [eval] Epoch: 2100 Average bpd: 3.289
====> [test] Epoch: 2100 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2101 Average train loss: -6668.0468 Average bpd: 3.131
====> Epoch: 2102 Average train loss: -6667.5113 Average bpd: 3.131
====> [eval] Epoch: 2102 Average bpd: 3.289
====> [test] Epoch: 2102 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2103 Average train loss: -6668.9425 Average bpd: 3.132
====> Epoch: 2104 Average train loss: -6668.5562 Average bpd: 3.132
====> [eval] Epoch: 2104 Average bpd: 3.289
====> [test] Epoch: 2104 Average bpd: 3.301
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2105 Average train loss: -6668.7699 Average bpd: 3.132
====> Epoch: 2106 Average train loss: -6667.8040 Average bpd: 3.131
====> [eval] Epoch: 2106 Average bpd: 3.289
====> [test] Epoch: 2106 Average bpd: 3.301
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2107 Average train loss: -6668.0339 Average bpd: 3.131
====> Epoch: 2108 Average train loss: -6668.2449 Average bpd: 3.132
====> [eval] Epoch: 2108 Average bpd: 3.289
====> [test] Epoch: 2108 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2109 Average train loss: -6668.9302 Average bpd: 3.132
====> Epoch: 2110 Average train loss: -6668.0363 Average bpd: 3.131
====> [eval] Epoch: 2110 Average bpd: 3.288
====> [test] Epoch: 2110 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2111 Average train loss: -6668.6138 Average bpd: 3.132
====> Epoch: 2112 Average train loss: -6668.1274 Average bpd: 3.132
====> [eval] Epoch: 2112 Average bpd: 3.289
====> [test] Epoch: 2112 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2113 Average train loss: -6667.1169 Average bpd: 3.131
====> Epoch: 2114 Average train loss: -6667.7817 Average bpd: 3.131
====> [eval] Epoch: 2114 Average bpd: 3.289
====> [test] Epoch: 2114 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
====> Epoch: 2115 Average train loss: -6667.3223 Average bpd: 3.131
====> Epoch: 2116 Average train loss: -6667.6765 Average bpd: 3.131
====> [eval] Epoch: 2116 Average bpd: 3.289
====> [test] Epoch: 2116 Average bpd: 3.300
Best val_bpd: 3.2882675478299332
Best test_bpd: 3.300218493069985
